ID   GYPSY68-LTR_AG repbase; DNA; ANG; 108 BP.
XX
AC   .
XX
DT   09-JUL-2004 (Rel. 9.06, Created)
DT   09-JUL-2004 (Rel. 9.06, Last updated, Version 1)
XX
DE   GYPSY68-LTR_AG is an LTR of retrotransposon GYPSY68_AG - a
DE   consensus sequence.
XX
KW   Gypsy; LTR Retrotransposon; Transposable Element;
KW   5-bp TSD; GYPSY68_AG; GYPSY68-I_AG; GYPSY68-LTR_AG; Gypsy clade;
KW   mag lineage.
XX
OS   Anopheles gambiae
OC   Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota;
OC   Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea;
OC   Culicidae; Anophelinae; Anopheles.
XX
RN   [1]
RP   1-108
RA   Tubio M.J., Costas C.J. and Naveira F.H.;
RT   "GYPSY68_AG, a member of the mag lineage of the Ty3/gypsy group
RT   of LTR retrotransposons in Anopheles gambiae.";
RL   Repbase Reports 4(6), 176-176 (2004).
XX
DR   [1] (Consensus)
XX
CC   GYPSY68-LTR is a long terminal repeat of GYPSY68_AG (its
CC   internal portion is deposited as GYPSY68-I_AG).
XX
SQ   Sequence 108 BP; 25 A; 12 C; 41 G; 30 T; 0 other;
     tgtggtatgt gagagtagca gtgtgggtgt gcgggcagta ggaggttgga cggaataaag        60
     agcagacgtg tgttactgta gttccggtgt tttcgatgga atatcaca                    108
//
ID   BEL1-I_AG repbase; DNA; ANG; 7959 BP.
XX
AC   .
XX
DT   12-MAR-2003 (Rel. 8.02, Created)
DT   19-MAY-2005 (Rel. 10.06, Last updated, Version 2)
XX
DE   BEL1-I_AG is an internal portion of the BEL1_AG LTR
DE   retrotransposon - a consensus sequence.
XX
KW   BEL; LTR Retrotransposon; Transposable Element; 5-bp TSD;
KW   BEL1-I_AG; BEL1-LTR_AG; BEL1_AG; Bel clade; PHD zinc finger; env;
KW   integrase; reverse transcriptase.
XX
NM   BEL1-I_AG.
XX
OS   Anopheles gambiae
OC   Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota;
OC   Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea;
OC   Culicidae; Anophelinae; Anopheles.
XX
RN   [1]
RP   1-7959
RA   Kapitonov V.V. and Jurka J.;
RT   "BEL1_AG, a family of Bel/Pao-like LTR retrotransposons from
RT   African malaria mosquito.";
RL   Repbase Reports 3(2), 8-8 (2003).
XX
DR   [1] (Consensus)
XX
CC   BEL1_AG is a young family of Bel/Pao-like LTR retrotransposons.
CC   BEL1-I_AG, an internal portion of BEL1_AG, is flanked by BEL1-LTR_AG
CC   LTRs. The BEL1-I_AG consensus sequence was reconstructed from a
CC   multiple alignment of ~20 copies.
CC   The consensus sequence encodes two proteins: a 1771-aa
CC   BEL1-I_AG-ORF1p (positions 1204-6516) and a 502-aa BEL1-I_AG-ORF2p
CC   (positions 6452-7957). BEL1-I_AG-ORF1p contains a PHD domain
CC   (aa positions 13-54), reverse transcriptase (aa positions 745-900)
CC   and integrase (aa positions 1421-1569).
CC   BEL1-I_AG-ORF2p is a putative env-like protein. It is distantly
CC   similar to the env-like proteins encoded by the Tom and Ted
CC   retrotransposons from Drosophila ananassae and Trichoplusia ni,
CC   respectively.
CC   Some copies of BEL1_AG are nearly identical to each other.
CC   Therefore, BEL1_AG may still be active.
XX
FH   Key             Location/Qualifiers
FT   CDS             1204..6516
FT                   /product="BEL1-I_AG-ORF1p"
FT                   /translation="MDIIFKSNPKGNCKLCKNPDEWDTQVNCIECDRWLHL
FT                   KCLKLEGPVKKYVCPKCYTIAEERKGNREALMQTERLLKEKTEAEKRTREE
FT                   NERCEKEIERLEDILRNEEIHNQSDTTHLQDDLQTLTTNVNKMANLGFAPH
FT                   KKTVLKLPDFYGNYRTWPRFKLLFEETTRTEKFSNLENLTRLQIHLKGDAL
FT                   RSVSGLMLNPSNVDAILERLGRLYGNPVSIFNALLKDLMVVKRASLENPSS
FT                   IIEFCNALNNMVENMTMLNQTEYLMDQRLLTDLVAKLSPDLKTRWLRDSLN
FT                   EEGDKIKTLKDFSKWLKPTEDVAITLLAMEGGQRDRPARLNTHYSASHQIS
FT                   NKGCLICSRPHETISCYKLKNASVNERWKMLKEKNVCTNCCKFSNHAAINC
FT                   RSRPQCTVDGCGRRHNTILHEEKFNSMGAASKAHLNFHQNSEQYLFQVLPI
FT                   TVYNENNSIETFALIDPGSSTSLMTESLRQKLNLHGPRKPLTLSWTNGCNQ
FT                   VEDTSTSVSLKLRGPNGRLLYVKDIRTVKELDLPTQSINANVLKRKFSHLK
FT                   TVNISSYKNAKPTILLGLPHAYYTQAVESKSGAPNEPVAHKTRIGWVVFGK
FT                   CRDGDAKENQHLFTIQDKKEEEEKSMRDLMKRFFSTEEFGVRETKFTPKSK
FT                   DHERALSVMNDTLKYTNNQYEIGLLWKDPNVSLPSSYAQALRRLESQERKM
FT                   KGNDEMKTWYKNQITDYVQKGYARKLTPFELLNRDPKINYIPHFMVINPNK
FT                   PTPKPRLVFDAAAKNEGISLNSTLLSGPDATTSIFGVLIRFREYPIACSGD
FT                   IKEMFHQIRIRKEDQVAQRFLFRDNPRNEPQVYVMNVMTFGATCSPACAQF
FT                   VKNENALKYKDKYPTAVEAIVKNHYVDDYLDSFRTINDAIKTINEVCLIHD
FT                   RAHFFMRNFVSNCQEVIRSIPDDRSSQQELLHISNKDMNFEKILGQYWDKT
FT                   NDVLRYKLKHTPCSIISKREMLAYLMKIYDPLGLAANYTTQAKVIIQEIWK
FT
TELDWDSPVPERIMEQWQRWKERIKELEHIQIPRCYSVASNIEVTELHTFV FT DASEKAFAAVVYLRTLTEKGIDVNIVAAKTRVAPIKPLSIPKLELQAAVLG FT VRLAETVKEELRITTDRDYYWSDSKTVLGWINADPQKYKQFVAVRIGEILD FT TTNASQWKWVSSESNPADEATKVVTRKSIWLNGPVFLKQREIEYRDPKLII FT THEEIRPNLMIKTIEKRTFNFIKTEWCSNWLRLKRSLAINLKYIEFLKSKV FT KRLAFSPIVEKENLDKAEKLLLQKAQWEIYEDDLVQLSLNGQVSKNSTIKN FT LNPQVIEGLLRARGRLANICYLSDDVKQPIILPKRHHVTELIIQHYHERYM FT HKKMEAVIAAIRQRFWVIDLRAVVRSVISKCQRCKNERARPIAPMMAPLPE FT SRAAVFKKPFTHTGVDYFGPMTVSIGRRVEKRWGAIFTCMTTRAIHLEIAK FT DLSTNSFIMCLKNVQHRRGKICHIYSDNGTNFVGANRQITELVERCATNGI FT KWHFNPPAAPHFGGVWERMVREVKSLLPNNDNMPEEVLRSAFIEIEFILNN FT RPLTHIPLETEDDEPLTPFHFLIGCSGEAEPTPAGISAAEASRNNWKKAQV FT ITQNYWERWLKEYLPTLAKREKWIERSDPIQPDDIVVFPDEQRVGRWLKGR FT VVEVYPAKDGQVRSAKIKVENGEYKRPVINLSVLEVKGKKIADVPSWGVKR FT PVNIAYVKKLAEQLKTPPAKRRKHLVKPYNGPVSMHYKPVSRMETNRQSFS FT " FT CDS 6452..7957 FT /product="BEL1-I_AG-ORF2p" FT /translation="MARYLCIISLLAAWKLTDNLSVKPVEEAGIFFDHEGT FT LLLKRGVWETTFHTKIHPENDTETLLTMEKEVTTVFKALSDMDTNLLNLKL FT TLQQNIRHALQLSQTAVKRRTKRSSGIFGFLKGILFGEDDIDEQLALFRAS FT EDQKLKHISEDMTHKIKQGDRLRNKLNMKIDHMNEGIKSLNKSFNENKKNV FT LIKHVTETIMLAEDIVQYITTRYLELEIQPLSIFDSTKISEKIQSRLPDGY FT TILDHPRISSKELFRGEIIVHIENVIVSQERFEIFHITVIPNLKNFTTLDL FT DENVIAINDIHYIYPTDITRYNSTHHVSSDVAVRRDLDCISSSFRHIKTEC FT LCGIKPIKNSITKFVKLSQPNKILYYSSHPNEIYLKCNKTLTHPAYQAGVI FT TLSQDCKIQTKNIEIQPTMKIEAVETKMYFKPLAKILNLSAEQKEETNMDQ FT LYLIIITSTIACVATLILGITIAFIIKQVRAKMYTLRPPPFKPSPNSNPST FT RNYGGQ" XX SQ Sequence 7959 BP; 2873 A; 1648 C; 1673 G; 1764 T; 1 other; tggtggctcc agagaggact agtagagacc tcgcaacgta caagtcggag gttaacttcc 60 gatcttgaaa cccagagtcg gcaggaacat acaagtcgga ggttagcttc cgatcttgac 120 gaacgtattc ctgaccgata gagaaaaaca accagcgctc caacggagga aggttcgtaa 180 aagcggaaaa catcgcgaat gtataatagt atccaaaaac gtaaagtgca acaaaccaaa 240 cgagtgcaaa aaacgcctga aacagtgctt aagaagtcca aggtacagag acagaattca 300 gtaaggcgcc gtacgatcac cgatatattg tgattgaata aagacaccgg tggcaaaaaa 360 aaaaaacgac gttcgctata tcgtttcggt cgtaccacag gtttgtaaac aaaacaaaaa 420 
gggtgtgtgt gtgcagtgaa aaaaaaaaaa aaaaaaaaaa aacgacgttc gctatatcgt 480 ttcggtcgta ccacaggttt gtaaacaaaa caaaaagggt gtgtgtgtgc agtgaaacgc 540 aacctcagaa agggcgttta aaacggacga cctgagaaca tctgatcgaa aatcattacg 600 tagcgtacta gtaacaaagt gcagtgtaaa caattacgcg tgtggcacac attcgcagtg 660 agaacgtaat aaaacccaca agacgctaac gaccgcggtt tacggacgtt cccgtaaagc 720 accggcagtg tactccaatt acaacaacac aagtaaatgt gcggatcaag aaaagatcaa 780 cagttgggga gtgtaacaat acgggacgtg catacaaaaa agaaatcttc gagctaaacc 840 gtatccggct ttagcatttc cgggaaacca caaacataaa ggtgttgata agagtgcgaa 900 ctagtggagt agtacgtgtc cagcttattg acgtgtgttg atacggtgat cggtagccaa 960 tggtcatcgc cccggaagac ccgattatcc aacgaaaaag cgtggcggca atactaccca 1020 agcagcggat ctgaaggcga ccagaagaca ggggtactga cgacaccgcg ggggatcacc 1080 tgaagacgac ctcgaggagg acaagcaggg gcgttgtaag agcagcaacg gcacgacgac 1140 acccatccag atcggcagcg acagcccagt tactttatcc accggttacg caaagtgagt 1200 agtatggata tcatatttaa aagcaacccg aaaggcaatt gcaagctatg caagaacccc 1260 gatgagtggg acacacaagt taattgtatt gagtgtgaca gatggttaca tctcaaatgt 1320 ctgaagctag aaggtcccgt taaaaaatat gtgtgtccaa aatgctacac aatagctgag 1380 gaacgcaagg gaaataggga ggccttaatg caaacagaga ggctactaaa agaaaaaact 1440 gaagcggaaa aaagaactag agaagaaaac gaaaggtgtg agaaggaaat cgaaagacta 1500 gaagacatat taagaaatga agaaatacat aaccaatccg acacaactca cctacaagac 1560 gacctacaga cacttacaac aaacgtaaat aaaatggcaa acttgggttt tgctccacac 1620 aagaagacag ttttaaaact tccggatttc tatggcaatt atagaacatg gcctcgtttt 1680 aaactactgt tcgaagaaac cactcgaaca gaaaaatttt cgaatttgga aaaccttaca 1740 cggctccaaa ttcaccttaa gggagatgca ttgcgatccg ttagcgggtt gatgttgaac 1800 ccaagtaacg tggatgcaat cttggaaaga ttggggaggt tatatggcaa cccagttagc 1860 atttttaacg ccttactaaa agaccttatg gtggttaaac gggcatcttt ggaaaatcca 1920 tcttcgatta ttgagttctg taacgcactg aataacatgg tggaaaatat gaccatgttg 1980 aaccaaacgg agtacttgat ggaccaaagg ctccttacag atctggtcgc aaaactctct 2040 ccggacctta aaaccaggtg gctcagagat tcacttaacg aggaaggtga caaaatcaaa 2100 accttgaaag 
atttcagcaa atggttgaaa ccaacagaag acgtggcgat cacacttctt 2160 gctatggaag gtggtcaaag agacagaccg gcgaggctga atactcacta ttcagccagc 2220 catcaaattt caaataaagg ctgtctaatt tgcagccgtc ctcatgaaac catatcttgt 2280 tacaagctaa agaatgcctc agtaaacgaa agatggaaaa tgctgaagga gaaaaacgtt 2340 tgcactaact gctgcaaatt ctctaaccat gcggccatta actgtcgctc aaggccgcag 2400 tgtacagtgg atggttgtgg acgacgacac aacaccatat tgcatgaaga aaaattyaac 2460 tcaatgggcg cggcgtcaaa agcacattta aattttcatc aaaactcgga acaataccta 2520 tttcaagttc tgccaataac tgtctataac gaaaacaact ccatcgaaac atttgcattg 2580 atagacccag gatcctcaac gagcctcatg acagaaagcc taagacagaa actaaatctg 2640 catggcccaa ggaagccgtt aacactctcg tggacaaatg gatgcaacca ggtagaggat 2700 acaagcacgt cggtatctct gaaactcaga ggtccaaacg gcaggctgct ttatgtcaag 2760 gacattagga cagtaaaaga actggaccta cccactcaaa gcatcaatgc aaacgtgttg 2820 aagagaaaat tttcccactt aaagacggta aatatttcaa gctacaaaaa tgctaaaccc 2880 accattctgt taggacttcc acatgcttat tacacgcaag ctgtggagtc caaatcagga 2940 gcgcccaatg aaccagtggc acacaaaaca cgcattggtt gggtcgtatt tggaaagtgc 3000 agagatggtg atgcaaaaga aaatcaacat cttttcacaa tacaggataa gaaagaggag 3060 gaagaaaagt caatgaggga cctgatgaaa aggttttttt caacagaaga atttggcgta 3120 agggaaacca aattcacccc aaaatcaaag gaccatgaaa gagccctaag tgtaatgaat 3180 gacacactga aatatacaaa taatcagtat gagattggcc tactttggaa agatcccaat 3240 gtatccttac caagcagcta cgcacaggcg ctaagaaggc tcgaaagtca agaacggaaa 3300 atgaaaggta atgacgagat gaaaacctgg tataaaaatc aaattactga ttatgttcag 3360 aaaggttacg ctcgtaaact aacaccattt gaattgctga atagagatcc aaagatcaat 3420 tacattcccc attttatggt catcaatcca aataagccaa ctccaaaacc aagactggtt 3480 ttcgatgcag ctgcaaagaa cgaagggatt tcacttaact ctactctctt gtccggacca 3540 gacgccacta cgtcaatttt tggagtatta atccgctttc gcgaataccc tatcgcctgt 3600 tcaggggaca tcaaggagat gttccatcag atacggatcc gcaaagaaga tcaagtggct 3660 caacgatttt tattcaggga taatccacgc aacgaacccc aagtatacgt tatgaacgtc 3720 atgaccttcg gtgccacatg ctctcctgct tgtgcccagt tcgttaaaaa tgaaaacgcc 3780 ttaaaatata aagacaaata 
ccccactgca gtggaagcaa tagtaaaaaa ccactatgtt 3840 gatgattatc ttgatagttt tcggactatc aatgatgcaa tcaagactat taacgaggtt 3900 tgcctcatac atgatagagc gcatttcttt atgagaaatt tcgtttctaa ctgtcaggag 3960 gtaataagaa gcatcccaga tgatagatcc tcacaacaag agctgctgca catctctaat 4020 aaagatatga attttgagaa aattttgggg caatactggg acaaaacaaa cgatgtgtta 4080 aggtataagc ttaagcatac cccgtgttcc ataatttcaa aaagagaaat gctagcctac 4140 ttgatgaaaa tatatgatcc attggggcta gcggcaaact atactacgca agcgaaggtc 4200 atcatccaag aaatttggaa aacagaactg gattgggata gcccagtacc agaacgcata 4260 atggaacaat ggcaaagatg gaaggaaaga ataaaggaac tagaacacat acaaatacct 4320 agatgctact cggtggctag caatatcgaa gtaactgagt tacacacttt cgttgacgct 4380 tcggagaaag cgttcgcagc agtagtgtac ttaagaacat taacagaaaa ggggattgac 4440 gtaaacatag tggcggcaaa aacgagagtg gcaccaataa aaccactctc aattcctaag 4500 ctggaacttc aagcagcagt actcggagtc agactcgccg agactgttaa agaggaatta 4560 agaattacca ctgataggga ctattattgg tcagattcca aaaccgtcct aggatggatc 4620 aatgccgatc cacaaaagta caaacaattt gtggcggtaa gaattggaga gattttagat 4680 actaccaatg ctagtcaatg gaagtgggtc tcctccgaaa gtaatcctgc agacgaagca 4740 accaaggtag ttacaagaaa atctatatgg ctgaatggcc cagtatttct taaacaaagg 4800 gaaattgaat atagggaccc caagctaatc attactcatg aagaaatccg tccaaatctt 4860 atgattaaaa ccatagagaa gagaacattc aactttataa aaaccgaatg gtgttcaaat 4920 tggctaagac tgaagagatc actggcaatc aatttaaaat atatagaatt tttgaaaagc 4980 aaggtcaagc gattagcatt ttccccgata gtagaaaagg aaaacctgga taaagcagaa 5040 aaactcctat tgcaaaaggc acaatgggag atatacgaag atgatttagt tcagctttca 5100 ctcaatggac aagtctctaa aaacagcaca ataaagaatc tcaatccaca agtaatagaa 5160 ggactactac gagcaagggg acgattagca aatatatgct acctctccga tgacgtgaaa 5220 caacccataa tattacctaa gaggcatcac gtgacagaat tgatcataca gcattatcat 5280 gaacgctata tgcataaaaa aatggaagca gttattgcgg caatccggca aaggttttgg 5340 gtaatcgacc ttagggccgt ggtaagaagc gtgatcagca aatgccagcg ttgcaaaaat 5400 gaacgcgcac gtcccattgc cccgatgatg gccccccttc cagaaagccg agccgctgtg 5460 ttcaaaaaac ccttcaccca tacaggtgta 
gattactttg gacccatgac agtgtcaatc 5520 ggaagaaggg tagaaaaaag atggggagcg atattcacgt gtatgacaac gcgcgctata 5580 catttagaaa tcgctaaaga cttaagtaca aattccttca taatgtgcct aaaaaatgtg 5640 cagcataggc gtggaaagat ttgtcacata tacagtgaca atggtacaaa cttcgttggg 5700 gcaaacaggc aaataacgga actcgtcgaa agatgtgcaa ccaacggtat caaatggcac 5760 ttcaatcccc cggccgcgcc tcactttgga ggtgtatggg agagaatggt ccgagaggtc 5820 aagagcttgc tgccaaataa tgataatatg ccagaagaag tattaagatc ggcctttatc 5880 gagatcgaat ttattctcaa taatagacct cttactcaca tccccctcga aactgaagac 5940 gacgaacctc tcacaccgtt tcacttcttg atagggtgtt ccggagaggc cgaacctacg 6000 ccagccggga tttcagcagc tgaagctagc agaaacaact ggaagaaggc acaagttatc 6060 acccaaaact attgggaacg ttggttgaag gagtacctcc caacattagc caaacgagaa 6120 aagtggatag aacgctcaga cccaatacaa cctgatgaca tagtcgtctt cccagacgaa 6180 caacgcgtgg gtaggtggtt aaagggccgg gtagtagaag tttatcccgc taaagacggg 6240 caagttagat ccgcaaagat taaagttgaa aacggcgaat acaaacgccc tgttatcaac 6300 ttatcagtac tagaagtaaa gggcaagaaa attgcagacg taccttcgtg gggagttaaa 6360 agaccggtca acatcgccta cgtcaagaaa ttagctgaac aattaaaaac tcctcctgca 6420 aaaaggagga agcatctcgt aaaaccttat aatggcccgg tatctatgca ttataagcct 6480 gttagccgca tggaaactaa cagacaatct ttcagttaaa ccagttgaag aagctggtat 6540 attcttcgac cacgaaggaa cgcttctctt gaaaaggggt gtgtgggaaa caaccttcca 6600 cacgaaaata caccccgaaa atgacacaga gactttactg acaatggaga aagaggtgac 6660 cacagtattc aaggcactga gtgacatgga cactaatctt ctaaatttga aattgacatt 6720 acaacaaaac attcgacacg cacttcaact ctcacagact gcagtaaaaa gacgaaccaa 6780 acgatctagc ggcatatttg gatttttgaa aggtattcta tttggagaag acgatattga 6840 tgaacagtta gccctcttta gagcttctga agaccagaaa ttgaaacata tatcggaaga 6900 catgactcat aaaatcaagc agggtgacag acttagaaac aaactaaata tgaaaataga 6960 ccacatgaac gaaggtatta agagtcttaa taaaagcttc aatgaaaaca aaaaaaacgt 7020 acttataaag catgtcacag aaacaatcat gctagctgaa gacatagtac aatacattac 7080 gacaaggtat ctagagttag aaatccagcc ccttagcata ttcgactcga cgaaaatttc 7140 cgagaagata caatcaaggt tacccgatgg gtatacaatt 
ctagaccacc cccgaatttc 7200 tagcaaagag ttatttaggg gagaaataat agtacatatt gaaaacgtca tcgtttcgca 7260 agagagattc gaaatattcc atatcactgt aataccaaac ctaaaaaact ttacaactct 7320 ggacttagat gaaaatgtaa tagctataaa cgatatacac tatatatacc ctacagatat 7380 tacgcgatac aatagcactc atcacgtgtc ctccgatgta gccgttagaa gagatttgga 7440 ttgtatatcc tcatctttca gacatataaa aacagaatgc ttgtgtggta taaaaccaat 7500 taaaaactca ataaccaaat ttgtaaagct ctcccagccg aacaaaatcc tatactactc 7560 ttctcatcct aacgaaatat acctcaaatg taacaaaaca ctgacacatc ccgcgtatca 7620 agctggggta ataacactta gccaagactg taaaatacag accaagaaca tagaaatcca 7680 gccaaccatg aaaattgaag ctgtagaaac aaaaatgtat ttcaagccgc tggccaaaat 7740 actgaattta agcgcagagc aaaaagaaga aacaaatatg gatcagctct acctcataat 7800 aattacaagt acaatagcct gcgtcgccac tttaatatta ggaataacca tagcatttat 7860 tatcaaacaa gtacgagcta aaatgtacac tttacgcccc ccgccgttta aaccgtcacc 7920 aaatagtaac ccctctacga ggaattacgg gggtcagga 7959 // ID BEL2-LTR_AG repbase; DNA; ANG; 458 BP. XX AC . XX DT 12-MAR-2003 (Rel. 8.02, Created) DT 12-MAR-2003 (Rel. 8.02, Last updated, Version 1) XX DE BEL2-LTR_AG is a long terminal repeat from the BEL2_AG LTR DE retrotransposon - a consensus sequence. XX KW BEL; LTR Retrotransposon; Transposable Element; 5-bp TSD; KW BEL2-I_AG; BEL2-LTR_AG; BEL2_AG; Bel clade. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-458 RA Kapitonov V.V. and Jurka J.; RT "BEL2_AG, a family of Bel/Pao-like LTR retrotransposons from RT African malaria mosquito."; RL Repbase Reports 3(2), 11-11 (2003). XX DR [1] (Consensus) XX CC BEL2-LTR_AG is a long terminal repeat from BEL2_AG CC retrotransposon. CC There are more than 50 copies of BEL2-LTR_AG in the genome. CC They are ~1% divergent from the consensus sequence. CC See comments for BEL2-I_AG. 
XX
SQ   Sequence 458 BP; 128 A; 106 C; 96 G; 128 T; 0 other;
     tgttacgcac gcgattctaa atttcttttt tttaccaatc taatcgtagt aaccattgtt        60
     tatccttcca agtcaaaata agatgtttct ggaaacacga attatcccgt ccgaacggat       120
     attcgggtaa acttgggagg ataattttag tataaaagta ggccatattt tgaataaagc       180
     aggattcaaa cgcagaagct cttggagtaa agaggtttaa ttctaaatat ccgaaatagg       240
     acagagcccc tggatttccg agccaaacac ccccttccgt ctgcataggc ttcccaacct       300
     tggatcagaa gcaactagcg ggaaggactg gttagtaggc gaagtatcaa atactctgtc       360
     cactctgctg ctgtttgttt ctcctccgtt acaaccctgc aactcttgct gagggccaaa       420
     ccttcgcgtg gcctgccgga gaagctcggt tcgtaaca                               458
//
ID   GYPSY1-LTR_AG repbase; DNA; ANG; 146 BP.
XX
AC   .
XX
DT   08-MAY-2003 (Rel. 8.04, Created)
DT   08-MAY-2003 (Rel. 8.04, Last updated, Version 1)
XX
DE   GYPSY1-LTR_AG is an LTR of the GYPSY1_AG LTR retrotransposon - a
DE   consensus sequence.
XX
KW   Gypsy; LTR Retrotransposon; Transposable Element; 4-bp TSD;
KW   GYPSY1-I_AG; GYPSY1-LTR_AG; GYPSY1_AG; Gypsy clade.
XX
OS   Anopheles gambiae
OC   Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota;
OC   Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea;
OC   Culicidae; Anophelinae; Anopheles.
XX
RN   [1]
RP   1-146
RA   Kapitonov V.V., Pavlicek A. and Jurka J.;
RT   "GYPSY1_AG, a family of LTR retrotransposons from African malaria
RT   mosquito.";
RL   Repbase Reports 3(4), 74-74 (2003).
XX
DR   [1] (Consensus)
XX
CC   GYPSY1-LTR is a long terminal repeat of GYPSY1_AG (its internal
CC   portion is deposited as GYPSY1-I_AG).
XX
SQ   Sequence 146 BP; 39 A; 27 C; 29 G; 51 T; 0 other;
     tgtggtggat tgcttatagc aatgtcttag ttgtaatcaa gtagccattg gttacagcaa        60
     ccgtggttac actgaatatc gaataaatgt agttagtctt agttcactct tagaacggtt       120
     tacatcgtct tttaatcgct cccaca                                            146
//
ID   Copia-7_AG-LTR repbase; DNA; ANG; 168 BP.
XX
AC   .
XX
DT   01-SEP-2010 (Rel. 15.09, Created)
DT   01-SEP-2010 (Rel. 15.09, Last updated, Version 1)
XX
DE   Copia-7_AG-LTR.
XX
KW   Copia; LTR Retrotransposon; Transposable Element; Copia-7_AG-LTR.
XX
OS   Anopheles gambiae
OC   Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota;
OC   Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea;
OC   Culicidae; Anophelinae; Anopheles.
XX
RN   [1]
RP   1-168
RA   Fernandez Medina R.D., Struchiner C.J. and Ribeiro J.M.C.;
RT   "Transposable elements from Anopheles gambiae.";
RL   Repbase Reports 10(9), 1459-1459 (2010).
XX
DR   [1] (Consensus)
XX
CC   LTR consensus sequence of Copia-7.
CC   The LTRs in all the sequences are identical.
XX
SQ   Sequence 168 BP; 64 A; 26 C; 32 G; 46 T; 0 other;
     tgtaaacgaa gtagccgcgt gtaatcgaag tagtctttaa tagatttggg aacgatcgtt        60
     tcattacaag ttagtcgcaa ataaaccatt gacgagaacg gtcgcattaa agagaaactt       120
     gttaaaataa aaaacttcca aagtatttca aagaacagag ttccttta                    168
//
ID   MTANGA_I repbase; DNA; ANG; 4046 BP.
XX
AC   AF387862;
XX
DT   09-MAR-2006 (Rel. 11.02, Created)
DT   09-MAR-2006 (Rel. 11.02, Last updated, Version 1)
XX
DE   Anopheles gambiae mtanga retrotransposon (internal portion).
XX
KW   Copia; LTR Retrotransposon; Transposable Element;
KW   Interspersed repeat; MTANGA_LTR; internal portion; MTANGA_I.
XX
OS   Anopheles gambiae
OC   Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota;
OC   Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea;
OC   Culicidae; Anophelinae; Anopheles.
XX
RN   [1]
RP   1-4046
RA   Rohr C.J., Ranson H., Wang X. and Besansky N.J.;
RT   "Structure and evolution of mtanga, a retrotransposon actively
RT   expressed on the Y chromosome of the African malaria vector
RT   Anopheles gambiae.";
RL   Mol Biol Evol 19(2), 149-162 (2002).
XX
DR   EMBL/GenBank/DDBJ; AF387862; Positions 120 4165.
XX FH Key Location/Qualifiers FT CDS 46..1473 FT /product="MTANGA_I_1p" FT /translation="MDGGKFCIEKLRSGGYETWRFKVEMLLVRENLWKFES FT TAAPESLTETWKEGDAKARATIALLVDDCQHPLIRDCKTAKGTWDALQKHH FT QKTTMSTKVSLLKKLCKAEYDESGDMEAHLFRMDELFSSLMNAGQELDSSL FT KVAMVLKSMPESYDHLTTTLETRSDDDLTMELVKSKLLDEAQKRMEKSHQS FT ESILRVGPEKKITCHRCRKPGHMKRDCPMESNNTPTSTTMRDYSRKNENCS FT SSGGQRESLKPKPKGKVAKSSEFSFTVGVVSKKREQPKSWIIDSGATSHMC FT CDRAFFDELKQSTGVSITLADGNETVVKGVGSGHLYYYEENGDRRKITLND FT VYYVPELESNLISVGKLVNKGAKVTFDETRGCVVECEGILATVGQWKHEGC FT SSHERLHPFMASQAASGTGTLTQFSELRVKAWRREFLSKNATFSRPVSVVL FT KGRLLENPSRRLRNDKQREFWTWYTRTSVVP" FT CDS 1473..4043 FT /product="MTANGA_I_2p" FT /translation="MNTVTSGGSRYFLTMIDDFSRYTTVYFLKRKSEAAEV FT IEEYVTMVHNRFGRNPIVIRSDQGGEYKSKRLGQFYRAKGIVPQFTAGYSP FT QQNGVAERKNRTLVEMARCMLIDAKLGYRFWAEAINAAVYLQNISSSRSIE FT KTPFELWYGKQPDYSNLHIFGCSAIVHVPAEKRSKLDPKGKKLIFVGYADN FT HKAFRFVDPTSNKITLSRDAKFIEMEDFEHAAINRSKKTNPQIVEYEFDDD FT FPFDDDSDFDDDVGDRLESEEEDSTDETLIEEELTDTDSSMCDSTNEDDGD FT DGHTLERAAEELTVRRSSRSTKGVPPQRFRETTGMVRIFLNERILITQEYC FT EPRTFEEAMSCPDRDLWKRAMEEEIKSLHENATWEIASLPKDRKAVGSKWV FT FKRKMDGDGKIVQYKARLVAKGFSQVYGADYDEVFAPVAKQTTFRTLLSIA FT ARRKLIVKHVDVKSAYLYGDLAETIYMKQPTGFEIGSKNDVCLLKKSLYGL FT KQAGRVWNQTITEVLRSLGFHSSEADPCLFVKNKRDRWSFILLYVDDMLVA FT CSEDREYEDIENTLKRHFKITTLGDVRNYLGIRIERGQNGEYLLDQASYIR FT RIAKRFGQEDARPSRIPMDPGYPKIQQKEEKPMPRNDDFQSLVGALLYVAV FT NTRPDIAISASILGRKVSNPCQADWTEAKRTLRYLNSTADLKLKLGEDGQL FT EAYVDADWAGDHQDRKSNSGFIFHLGGPISWSARKQQCVTLSSTEAEYVAL FT AEACRELLWLQKLMKDVGEKTTGPIVIREDNQSCLAMLPAEGGCRRTKHID FT TRYNFIRDLVNNNVIQVQYCPSERMIADALTKPLSKVKLVTCRKKLGLQSL FT QLEE" XX SQ Sequence 4046 BP; 1196 A; 808 C; 1064 G; 978 T; 0 other; ggttatgggc ccagacaagt attttagaag caagtaatat ttgaaatgga tggcggaaag 60 ttttgtatcg agaagctgcg tagcggtgga tatgaaacat ggcgattcaa agtcgaaatg 120 ttgctagtcc gggaaaacct gtggaagttc gagtcaacgg ccgcccctga atcattgacc 180 gaaacctgga aggaaggtga cgctaaagcc cgggcaacga tcgctcttct tgtggatgac 240 tgccagcacc ctttgatccg tgactgcaag acggccaaag gtacgtggga tgcactccag 
300 aaacatcatc aaaaaacgac gatgtctact aaggtatcgc tcctgaagaa actctgcaag 360 gcggagtacg acgagagcgg tgacatggag gcgcatttgt ttcgaatgga tgaacttttc 420 tcaagtctga tgaacgcggg ccaggaactg gattcgagct tgaaagtggc catggttctg 480 aaaagtatgc cagaatccta cgatcatctc acaacgacgt tggaaactcg ttcggacgac 540 gacctgacca tggaattggt gaaaagtaag ctgttagatg aagcgcagaa aaggatggag 600 aaatcccatc aaagcgaatc catcttacgc gttggtcctg agaagaagat tacgtgtcat 660 cgatgtcgaa aacccgggca tatgaaacgt gattgtccga tggaatcgaa taatacgcct 720 acatctacta cgatgcgtga ttattcgagg aaaaatgaaa attgttcttc ttccggtgga 780 caaagagagt ctttgaagcc caagcctaag ggtaaagtag caaaatcttc tgaattttcg 840 tttacagtag gcgtggtatc caagaaacga gagcaaccga agtcttggat aattgactct 900 ggagccacct cacatatgtg ttgcgatcga gcattctttg atgagctcaa acaaagcaca 960 ggtgtgagca tcactctagc cgatggcaat gagacggttg tcaagggtgt tggttccggt 1020 catttgtact actatgagga aaatggagat cgacggaaga ttactctgaa cgacgtgtat 1080 tatgttccgg aattagagtc caatttgatt tctgttggaa aactcgtcaa caaaggagcg 1140 aaggtaacgt ttgatgaaac ccgaggctgt gtagtggaat gcgaaggtat cctagcaacg 1200 gttggacaat ggaagcatga aggttgttca tcacacgaaa gactgcatcc attcatggca 1260 tcgcaagctg caagcggcac cgggacccta acgcaattca gcgagttgcg cgtgaaggct 1320 tggcgaaggg aatttctatc aaaaaatgcg acattttcca gacctgtgag tgttgtgttg 1380 aagggaagat tgctcgaaaa cccttcccgc cgattacgga acgacaaaca acgagagttt 1440 tggacctggt acacacggac atctgtggtc ccatgaatac agtgacgtca ggtggatcac 1500 gctacttctt aacaatgatc gatgatttta gtcgctatac aacggtgtat tttttgaaac 1560 gaaaatctga agcagctgaa gtgatcgaag agtatgtcac tatggtccat aatcgttttg 1620 gaaggaatcc aattgtcatc cgctcagacc aaggcggtga gtacaaaagc aaacgtttgg 1680 gtcaattcta tcgggccaaa ggaatcgttc cgcagtttac agcaggttac tccccgcaac 1740 aaaatggagt tgctgaacga aaaaatcgga cgttggtaga aatggctcgc tgcatgttga 1800 ttgatgcgaa actagggtat cgtttttggg ccgaagcaat caatgcagca gtatatctcc 1860 aaaacatttc gtcatcgagg tctatcgaga agacgccgtt tgaattatgg tatgggaagc 1920 aaccggacta cagtaatctg cacatctttg gttgctcagc tattgttcat gtgccagctg 1980 agaagcgcag 
taaactcgat ccaaagggga agaagcttat atttgtggga tatgctgaca 2040 atcataaagc gtttcgtttt gtagatccta cctccaataa gataaccctt agccgtgatg 2100 cgaagtttat cgaaatggaa gatttcgaac atgccgcgat taatcgttcg aaaaagacga 2160 atcctcaaat tgttgaatac gaattcgacg atgattttcc tttcgatgac gattccgatt 2220 tcgatgacga tgtcggtgat cgtctagaat ctgaagaaga agacagcacc gatgaaactc 2280 taattgagga ggagctaacg gatacggatt ccagtatgtg cgattctacg aacgaagatg 2340 acggtgatga tggacacact ctcgaaaggg cagctgagga gctgacggtt cgacgatcat 2400 cccgtagcac aaagggtgta ccacctcaac gtttccgaga aacaactgga atggtaagaa 2460 ttttcttaaa tgaaagaatt ttaatcaccc aggagtactg cgagccccgt acatttgaag 2520 aagccatgag ttgcccagat cgagacttgt ggaagcgtgc aatggaggag gagatcaagt 2580 ccttgcacga aaatgccaca tgggaaatcg cttctttgcc caaagaccgc aaagctgtgg 2640 gcagtaagtg ggtttttaaa agaaagatgg acggcgacgg caaaatagtc caatacaagg 2700 caaggcttgt tgcaaaaggt ttcagccaag tgtatggtgc ggactatgac gaagtttttg 2760 cccccgtggc gaagcagacc acttttcgca ctttgttatc aatcgctgct agaagaaagt 2820 tgattgtgaa acatgtggac gttaagtcgg catacttgta tggagacctg gcagagacga 2880 tctatatgaa acaaccgact ggatttgaaa ttggttcaaa aaatgacgtg tgtctgctga 2940 aaaagagtct atatggtttg aagcaagcag gaagagtatg gaaccaaact ataaccgaag 3000 tgcttcggag tttggggttc cattcttcag aagcagatcc ttgtctgttt gtgaagaaca 3060 agcgtgatcg atggtcattc atactgttgt atgtggatga catgcttgtt gcttgctctg 3120 aggacagaga gtatgaagac atcgaaaaca ctttgaaacg acattttaag ataaccaccc 3180 tgggcgatgt aaggaattat ctgggtatac gaatcgaacg aggacagaat ggggaatatt 3240 tgttagatca agcctcgtac attcgtagga ttgctaaaag attcggacaa gaagatgcaa 3300 gaccatcccg tatcccaatg gaccccggtt acccgaaaat acagcaaaag gaggagaaac 3360 caatgcctcg aaacgacgat ttccaaagct tggtgggtgc tttgctgtac gttgcggtca 3420 acacgcggcc cgacattgct atcagtgcat ccatcctggg gcgcaaggtg agcaatcctt 3480 gccaagctga ttggacggaa gccaagagaa cattgcgata tctgaattca acagcagatc 3540 tgaaattgaa acttggtgaa gatgggcagc ttgaagcata cgttgatgcc gattgggcgg 3600 gcgaccatca agatcgaaaa tcgaactcgg gattcatttt tcatttaggt ggaccgattt 3660 cgtggtcagc tcggaagcag 
cagtgcgtga ccctttcgtc gaccgaagcg gaatatgtcg 3720 ctttagcgga ggcctgtcga gagctgctct ggttgcaaaa acttatgaaa gatgtaggtg 3780 agaagaccac tggacctatc gttatccgtg aagataacca gagctgcctg gcgatgttgc 3840 cagccgaagg gggatgtaga aggactaagc acatcgacac ccggtacaat tttatccggg 3900 acctggtgaa taacaacgtt atacaggttc aatactgccc ttccgaacgt atgattgctg 3960 acgcgttgac aaagccattg tcaaaagtga agttggtaac ttgtcgtaag aagttgggac 4020 ttcaatcgtt gcagcttgag gaggag 4046 // ID GYPSY32-LTR_AG repbase; DNA; ANG; 260 BP. XX AC . XX DT 16-APR-2004 (Rel. 9.03, Created) DT 16-APR-2004 (Rel. 9.03, Last updated, Version 1) XX DE GYPSY32-LTR_AG is an LTR of retrotransposon GYPSY32_AG - a DE consensus sequence. XX KW Gypsy; LTR Retrotransposon; Transposable Element; KW 4-bp TSD GYPSY32_AG; GYPSY32-I_AG; GYPSY32-LTR_AG; Gypsy clade; KW MDG3 lineage. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-260 RA Tubio M.J., Costas C.J. and Naveira F.H.; RT "GYPSY32_AG, a member of the MDG3 lineage of the Ty3/gypsy group RT of LTR retrotransposons in Anopheles gambiae."; RL Repbase Reports 4(3), 57-57 (2004). XX DR [1] (Consensus) XX CC GYPSY32-LTR is a long terminal repeat of GYPSY32_AG (its internal CC portion is deposited as GYPSY32-I_AG). XX SQ Sequence 260 BP; 58 A; 59 C; 50 G; 93 T; 0 other; tgtaatatct tagacatatg accacgatgc ggccactgtg cttcgtgtgt caaacttgat 60 catgaccgct gtcatgatca tcgtgcgcgc ggcaatgaac aataaacgtc attcgtttac 120 ggatcgtcgt accggcaaca tcttcagtct tcttctgtgt cggtttattt cttctgcctt 180 tattcttctg ccttcggact gtgctcagtt tgagttcatc ggttgatttg aattaattca 240 aattaattct aatcattaca 260 // ID PegasusA repbase; DNA; ANG; 3696 BP. XX AC . XX DT 16-JUN-2003 (Rel. 8.05, Created) DT 21-JUL-2005 (Rel. 10.08, Last updated, Version 2) XX DE PegasusA is a hAT-like autonomous DNA transposon - a consensus DE sequence. 
XX
KW   hAT; DNA transposon; Transposable Element; 8-bp TSD;
KW   Autonomous DNA transposon; MITE; Pegasus; PegasusA;
KW   hAT superfamily; transposase.
XX
NM   PegasusA.
XX
OS   Anopheles gambiae
OC   Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota;
OC   Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea;
OC   Culicidae; Anophelinae; Anopheles.
XX
RN   [1]
RP   1-159
RA   Besansky J.N., Mukabayire O., Bedell A.J. and Lusz H.;
RT   "Pegasus, a small inverted repeat transposable element found in
RT   the white gene of Anopheles gambiae.";
RL   Genetica 98, 119-129 (1996).
XX
RN   [2]
RP   1-3696
RA   Kapitonov V.V. and Jurka J.;
RT   "PegasusA: a family of autonomous hAT-like DNA transposons from
RT   African malaria mosquito.";
RL   Repbase Reports 3(5), 97-97 (2003).
XX
DR   [2] (Consensus)
XX
CC   PegasusA is a family of autonomous DNA transposons that belongs
CC   to the hAT superfamily. The PegasusA consensus sequence was
CC   built from 4 copies that are ~3% diverged from it. PegasusA
CC   encodes the 499-aa PegasusAp hAT-like transposase.
CC   A nonautonomous derivative of PegasusA was described as PEGASUS [1].
CC   PegasusA is flanked by imperfect 16-bp terminal inverted repeats
CC   (4 mismatches).
XX FH Key Location/Qualifiers FT CDS join(312..602,1346..2053,2317..2817) FT /product="PegasusAp" FT /note="hAT-like transposase" FT /translation="MSKTSGEKRKLREIIAEGCVGFENGHFVCTIDEVNGC FT KYRQKNEKYEPGNFIRHIRSMHPDLAKSRGLLQEDGEVPIKKRKVSKVPVA FT IDRQKLLEGGRPQENCSLPNHYLDLTEAIEKEFYQYITLVRCAVHTMQLSV FT VDVVKTFDGEIRKCTAVSNNCRKIMYKTVFNNESLPPLYSKTRWGGIFEML FT NHFFKQEDFFNELGQQHSELGEIYYINEKVFPYLNISFFSDLTEQWDFIKE FT YVVAFVPVYISTKAMQAKHTSLNDFYLSWMKTILKVGSIPGNRFVEPLTKA FT LKNRLKNLKESIVFKAALLLDPRFNYLNSKFFNQEEKEEIRSFIISTSERI FT HLCKAKVQPSSSSSPSNAPNGTLNSSSDFDLYFTELYGGSLPEQNAEEIQT FT SANKILRQLISLDAEAHQNSHFDVWNFWLMRKSTHPELYEVATLLLSVPSN FT QVSVERAFSALGLVLSDKRTRMNDDTLENILLIKLNQPLLEKILPDLYEWN FT KEDI" XX SQ Sequence 3696 BP; 1274 A; 635 C; 645 G; 1141 T; 1 other; cagtgttgcg aacggtgata gtcgtcgaca caaaatttta caaaatgata ctttttgaag 60 cagcgcattc tagatacacg ctttcgttcg acaatcatgg aaaaaaatat catttgacac 120 gatgatattt ttttgtagta ttcatttgat tgtgattttc ccgggataca agcaacttag 180 gtcatgttga acgcagttgg ctttgcgttt gtcgtcctaa aatcagtggc gttagtgaac 240 agaacaagct tattctgagt gaaacaagga agtaaagaaa cacattccaa ttctaatatt 300 cttaaataaa aatgtcaaaa acatcaggag aaaagcggaa actccgtgaa attatagcag 360 aaggttgtgt tggtttcgaa aatggacact tcgtttgtac tatcgacgaa gtaaatggtt 420 gtaaatatag gcaaaagaat gaaaaatatg agccgggtaa ttttatacgc catatacgat 480 cgatgcaccc agatttggca aaatcccgag gattgctgca agaggacggt gaagtcccca 540 taaagaagag gaaagtttcc aaagttccgg tagccattga ccggcaaaag ctcctggaag 600 gtatttaata tatttaattt ttttttctgt ggcgattacg tcaaataata tttactcaag 660 ctttattaat aacaataagt attaactaaa tgttatttaa atattaaacg ctctctaaat 720 tgtcgtttct aaaggtatga tgaagctaat atgttgccat aacgtaccta tgatgttcgt 780 cgaatgggac ggcttagggt agatcctgaa acctatttgt gatgcgttga agatgaacct 840 caaccgtgct aatattgttt gtcatcttgg agctgctgct cgaaaaattc gccaggaact 900 taccacaatt ttgaaaggaa aattcttatg cttgaaaatt gactgcgcaa ctcgtcttgg 960 acgccacata ttggggatca acatccaata ttattgtgaa ctacaaaagg atgtcatcat 1020 ttatacaatt ggtaataatc ctcattctca agttaatctt tttaaaaaca 
cgcatgcgat 1080 ttgcagcgag tttcatgttt tacctatcgt gaaagtattg atgaagtatg aataactctt 1140 tgtttaaatt aaattcaata ggaatggttg agctgaataa tagacacacc ggaaaatttt 1200 tgaaaacaaa gatcctagaa atactcactc aatacgaaat ttcattggaa caaattttca 1260 cggtcacctg tgataatggt gcgaacatga tcgctgccgt taagcatctt caatcagatg 1320 cccaggtcat gttcaaccca ctwgaggacg cccacaagaa aactgttcac ttccgaatca 1380 ttatttagat ttgacagaag ccattgaaaa agaattttac cagtatatca ctcttgtaag 1440 atgtgccgtt cataccatgc agctttcggt agtagatgtg gtcaagacat tcgatggaga 1500 gattcgtaaa tgtactgcag tatcaaataa ctgccggaaa attatgtaca agactgtttt 1560 taacaatgaa tctctcccac cgctctactc aaaaaccaga tggggaggaa ttttcgaaat 1620 gttgaaccat ttttttaaac aggaggattt cttcaacgaa ctgggtcagc agcattcgga 1680 attaggtgag atttattata taaacgaaaa agtttttcca tacctcaaca tttcattttt 1740 ttcagattta acagaacaat gggattttat aaaagaatat gtagtggcgt ttgtgcctgt 1800 ctatatttca acaaaagcca tgcaagccaa acatacatca ctgaacgatt tttatttatc 1860 ctggatgaag accattttga aagtcggttc aattccagga aatcgtttcg ttgaaccgct 1920 tacaaaagcc cttaaaaatc ggttaaaaaa tcttaaagaa agtattgttt ttaaggccgc 1980 gcttctactt gacccaagat tcaattattt gaactctaaa ttttttaatc aagaagaaaa 2040 agaagaaatt cgcgtgagta acatatttca gacttcattc taatttctat attctacatt 2100 acataaaaca atattttaat tacaattact taatgttgcc agcttagcct ttaaaaaata 2160 tgtttgtgac acaaattaaa gcagtacaaa tcgagttgtc ttatgcgtta ggcaataagt 2220 tgatacgtgt gacgtggatc caagatgacc acaaataact caatatttaa attaattttg 2280 tttccatatt aaaatgatat ttacgtataa ttatagagct tcatcatctc tacatcggaa 2340 cgcatccatc tatgtaaagc taaggttcaa ccatcttcat catcatcacc atctaatgct 2400 ccgaatggga cactaaatag ctccagtgat tttgatttgt attttactga actttacgga 2460 ggatcattgc cagaacaaaa tgctgaagaa atacaaacga gcgcgaacaa gattctacgg 2520 cagctgatat cattggacgc agaagcccat caaaactcgc attttgatgt ttggaatttt 2580 tggttaatgc gaaaatccac tcatcctgaa ctatatgaag tagcgacatt acttttgtca 2640 gtaccatcaa accaggtatc tgtggagcgt gcttttagcg cacttggttt ggttttatct 2700 gataaacgga ccagaatgaa cgatgatacg ttggaaaaca ttttactgat caagctgaac 
2760 cagcccttgc ttgaaaaaat tctacctgat ttatatgaat ggaataagga agatatttaa 2820 ttactttaag ggggttttaa ttggggttta atattgttac acctaatata atgaaaaaat 2880 ataacagtat aagtataaca ataataaaat aaataaaacc attttttaat tgaattggcc 2940 tttgttacat aagaatttca ttctgcagga aatctttttc aatatcggca tcataaaata 3000 aaaatataac gaaccctcct agaacatttt ttttaaccgg caagtccaaa aaatgcaaat 3060 gtaagcttaa actcacgcta acattcaatt acaactcaca tatattttaa caaaccacat 3120 tgagacattt caaagataaa aaagaaaaat cagaaagggg tgaaaaactc agagttgaaa 3180 aaccttcaaa aactcttcca tgtattcttt ttttatacgg ggccatggtg gcagggaatc 3240 ttggaagtgg gtgctgtttg tgttagccaa taaactacca agcagtttta taagcaacat 3300 ttcgtggatt tctgtaccca agttccttaa aattccactc aataaattgt tttgtaatgc 3360 ctgaaaaatg taccaatctg taaaattaat gaaattgatg ctatgttaaa agcgtacact 3420 gggaaagtta aaatataaaa tttttaaaat tttatgtgaa cacctacagc cgacatcaca 3480 tttcagcgat ttgtatcgag caaaaaaatg tcaaatgaca ttcattctac tctttttgca 3540 atcatgtcat cgcaaaagca ttgattgtta ttgcacaaat gattgtcgct cttggaaagt 3600 cgtgtcatgg taaccatgaa tatcatgatc aaaaaatgat tttgacagcc gcttactgcg 3660 aatatcataa aaaaatgtcg acaattaaca acactg 3696 // ID DNA-2_AG repbase; DNA; ANG; 878 BP. XX AC . XX DT 03-SEP-2010 (Rel. 15.09, Created) DT 03-SEP-2010 (Rel. 15.09, Last updated, Version 1) XX DE Putative non-autonomous DNA transposon: consensus. XX KW DNA transposon; Transposable Element; DNA-2_AG. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-878 RA Jurka J.; RT "Non-autonomous DNA transposons from mosquito."; RL Repbase Reports 10(9), 1427-1427 (2010). XX DR [1] (Consensus) XX CC 2bp tsd. >94% identical to consensus. 
XX SQ Sequence 878 BP; 229 A; 193 C; 232 G; 224 T; 0 other; gcatccacgc aaacaatgag accccaaatt ttgagtgatt caggccgagg ggtatgagga 60 agcttgagga agcaaggaga aacgcgctgc gatgagagtt cgcattctgt tttacacgcg 120 cgcttcacta acaccaatac aaataccgtg ctttgacgag ccgtctgtga atgtgtgtgt 180 gtcttctttc agcgagctca acaacttgga gctgctcgtc acatcgtgtt gtctcgctta 240 cactacagca tcgaaccatc gctccgtatg tccccccctc ctacgcgtac aaaaccacca 300 gtacagcgcg acgctttgcc ggagatggtt gtgcgcgttt tgttctaagt cccgcgcgca 360 gtgtttgtgc gttgatgcgc atttcgttat aagtctcgtg cgccggtgtg ttttggtgaa 420 gtgtgcagtg aataaacagt gtgcacagac gcacatgcag tgcgtggatc gacaggtaag 480 aatttgctga acagagtcaa cgcagattgt cgacaagttt cccgcagcct ggctcgaccc 540 ggtacttcag tatgacgcca ttcgtgagta tgaggattct aaatgtgtac tttaggtgta 600 ctttatgttt caataaagtg tttaatgtaa taaaaaatga ccgtgtttgt gtttggtttt 660 agaataagtg catgaaagaa aacgagaaac gaaagaaaca actatcttta atgtgttggt 720 agcatggctc gaaagaacac aacacattga gaactgcaga gcgcaccgtg cgttacgggt 780 ggggggggga gggctcgcgc gagggtaact cattgtcgaa gagacactgt tgagcgcact 840 caaatcactc aaaaaataca ctcattttgc cggacggg 878 // ID Clu-15B_AG repbase; DNA; ANG; 460 BP. XX AC . XX DT 03-SEP-2010 (Rel. 15.09, Created) DT 03-SEP-2010 (Rel. 15.09, Last updated, Version 1) XX DE Putative non-autonomous DNA transposon: consensus. XX KW DNA transposon; Transposable Element; Clu-15B_AG. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-460 RA Jurka J.; RT "Transposable elements from Anopheles gambiae."; RL Repbase Reports 10(9), 1438-1438 (2010). XX DR [1] (Consensus) XX CC TA TSD. ~94% identical to consensus. 
XX SQ Sequence 460 BP; 134 A; 91 C; 89 G; 145 T; 1 other; tacagggttt tcaatgatat tattgacatg ttcagcgagt tgttgattcg ttcgggcata 60 taattgacac tttcagcgag atattgaatt gttcgnaagc aggcgttgac atgttcggcg 120 attctgtagt gctgtcgttt gttttgttta cacactgtgc cgcagggttc aatttttcat 180 tcttagaagt cctaaaagta gctaataatc attccccaag ctgtttaata catgaattag 240 ttaaaacaaa catattttta cgaaaaccgt ttgtttacct tctgatgaga atgatccaag 300 ctgcagtcca aggggtacga aagtgacagc tctatagtat cgccgaactt gtcaacgcct 360 gcttccgaac aattcaatat ctcgctgaaa gtgtcaatta tatgcccgaa cgaatcaaca 420 actcgctgaa cacgtcaata atatcattga aaaccctgta 460 // ID Mariner-N21_AG repbase; DNA; ANG; 357 BP. XX AC . XX DT 04-SEP-2010 (Rel. 15.09, Created) DT 04-SEP-2010 (Rel. 15.09, Last updated, Version 2) XX DE Nonautonomous Mariner DNA transposon - a consensus sequence. XX KW Mariner/Tc1; DNA transposon; Transposable Element; Nonautonomous; KW Mariner-N21_AG. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-357 RA Jurka J.; RT "DNA transposons from African malaria mosquito."; RL Repbase Reports 10(9), 1189-1189 (2010). XX DR [1] (Consensus) XX CC TA TSD. >95% identical to consensus. XX SQ Sequence 357 BP; 132 A; 64 C; 53 G; 107 T; 1 other; tacagtcttt ccccgagtta cgcgacactc gagttacgcg aattcgagtt acgcgaattt 60 tttatttttg acagttcaga tgtcaaatca gtacaatttt ctccatcaat tgtcaaatga 120 aaaataatta gcaatatttc aaaattttaa ttcactaaaa agtcagaaat aactgaattc 180 tccacaccaa ttgaatcaaa taagtgataa attacataaa tgaactaaat tcaatcaaaa 240 atcatgataa akaaagtata tttttgctgt aaaacgtgat attcgactta cgcgaaaatc 300 cgagttacgc gaatgtctcc ggaacgcatt attcgcgtaa ctcggggaaa gactgta 357 // ID GYPSY8-LTR_AG repbase; DNA; ANG; 720 BP. XX AC . XX DT 16-JUN-2003 (Rel. 8.05, Created) DT 16-JUN-2003 (Rel. 8.05, Last updated, Version 1) XX DE Anopheles gambiae long terminal repeat from GYPSY8_AG DE retrotransposon - a consensus. 
XX KW Gypsy; LTR Retrotransposon; Transposable Element; GYPSY8-I_AG; KW GYPSY8-LTR_AG; Long terminal repeat; RETRO23_AG_LTR; KW retrotransposon. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-720 RA Jurka J. and Drazkiewicz A.; RT "RETRO23_AG_LTR: LTR retrotransposon from Anopheles gambiae."; RL Repbase Reports 2(12), 7-7 (2002). XX RN [2] RP 1-720 RA Kapitonov V.V.; RT "GYPSY8-LTR_AG: LTR retrotransposon from Anopheles gambiae."; RL Direct Submission to Repbase Update (30-MAY-2003). XX CC 4 bp target site duplication. Name changed from RETRO23_AG_LTR CC to GYPSY8-LTR_AG [2]. XX SQ Sequence 720 BP; 172 A; 112 C; 179 G; 257 T; 0 other; tgtaatgtga tgatcgtaca actgtacttc gagtaccgtc gcggtaggtc agtgttttac 60 tgacatgagc cgaggtggaa taaaacttcg ttcgaagcgg acttttcgac acttgatcgt 120 ccaggcgatc atagaaacct ttaggaggtt tcccttactg gaaacgttgg agtttagagg 180 ccttaccttg ggtaggggga cataactaaa ctctttaaat gagaagtcga tagatgtagg 240 atgggcatag tcctatcctg acagtccgaa cgaatctgac ttgccatgat cggagttggt 300 tttatttata ctcaaaagaa gttaagggtt tctcacattt ggttccgggt tgtatgttgg 360 aaagttattg ttttcactca gctagttacg tttgtctttt gttgacaaga acgatggttc 420 ttgtcactaa gttcctgttt tgggtttaat gcatatactg ttcctgcttt gggttataat 480 gcataaactg ttcctgctta tgttgtacgt ttcggttggt tctgataagt gtgtcacaat 540 tgtttttata tgaacggtgt ggcctgagct cttccgggtg tgtatggtat gatctgattt 600 actgaatgtg ttataatgaa tgtgttacaa tggtgtggtt tggctttcca aagtgtgtta 660 caatgtgtct atagctgata gtgtgtatta taataagttc tgtatagcag atatgctaca 720 // ID P1_AG repbase; DNA; ANG; 4946 BP. XX AC . XX DT 13-DEC-2002 (Rel. 7.11, Created) DT 21-JUL-2005 (Rel. 10.08, Last updated, Version 2) XX DE P1_AG, a P-like DNA transposon - a consensus sequence. XX KW P; DNA transposon; Transposable Element; P superfamily; P1_AG. XX NM P1_AG. 
XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-4946 RA Kapitonov V.V. and Jurka J.; RT "P1_AG: a family of P-like DNA transposons from African malaria RT mosquito."; RL Repbase Reports 2(11), 21-20 (2002). XX DR [1] (Consensus) XX CC The A. gambiae genome harbors many divergent families of P-like CC DNA transposons. One of those families is P1_AG. CC P1_AG elements are flanked by 8-bp target site duplications. CC Terminal inverted repeats are 11bp long. CC The P1_AG consensus sequence was reconstructed from CC several copies that are only ~1% divergent from each other. CC Presumably, P1_AG copies have multiplied in the genome CC during last 1 million years. CC The P1_AG encodes a 895-aa P-like DNA transposase called P1_AGp. CC Putative exon/intron structure (based on FGENESH). XX FH Key Location/Qualifiers FT CDS join(302..562,623..733,796..2262,2433..3278) FT /product="P1_AGp" FT /note="DNA transposase" FT /translation="MSSNYKCCVAFCKNNRYNVKKVGVNVCFHKFPEGKET FT KQKWIAFCQREISWIPSSSNVVCSQHFLPSDYQLSSSHNTKRGANWLNPEA FT VPSILLPQDTGLNLSQFNEHQQNNNNDESSRAELQGATSSHLLPDENDLIL FT IGKENVCKKCKGLQQKIFNLEAKLNELEERNKQLVTINNKLSDTLKEVNEK FT EKEHLKKLEELDKATKKIKEEWPRANFVQNMKDSLKGSLSSNQIDLILGIK FT KTVRWTKEELSLAFTLRYFSQRAYRYIGDDMKIPVPVPRTLQRYSSKIDLK FT QGILEDILKFIGSYSQTLKPMDRECVLSFDEMKVSRVLEYDPSADEIVGPF FT NYLQVVMMRGLFKQWKQPIFIGFDTKMTKDIIIEIISRLSEKDINVVAIIS FT DNCQTNIGCWKELGARDDVEKPYFPHPKTNKNVYVIPDTPHLLKLLRNWLL FT DHGFEYNGQLIETTNLLRMVAKRMESEMTPLFKLTTSHIDMTPQERQNVRR FT AAELLSRTTAVALRTYFPDDDNAKILANFIEKVDVWFSISNSYTPFAKLDF FT KKAYTASDDQVRALTDMYDIISNMTIPGKNGLQIFQRSIMMQIKSLQMVFA FT DMKLKHDAKFICTHKLNQDVLENFFSQLRQIGGVYDHPSPLSCIYRIRLMI FT LGKTPTILHNQTTTVEAANCEQDEFITTTESNVAGARSNDVFISASMFEKA FT EITPELPDVKAMEEANNFVSQDVSSLSSVSDTSIELPHQEADGLSYILGYL FT AKKHHSQFSHLNLGEHTFKTRIDHNYCQPPTFVHHLSCGGLIEPSDEFLNL FT 
GKKMEKIFLKMNPDGGLLKGERIVDRITNKIKRHLTELPVEIIRSFAKQRV FT IVRMRYLNLKATAEQLNKMKHKKRKFVTENTKAAKKMKKIIN" XX SQ Sequence 4946 BP; 1701 A; 839 C; 944 G; 1462 T; 0 other; caaggttatt agactgtata caggttacga caaaaccccg ttttgacaca tcttcaataa 60 aaaaatgaat taaaattttg acagacaaat tggaaagttt gtttacctac aaattgcaat 120 ccttccgaac atacagcttt cgcaattgat gagtgcttaa atactaccgt ttttctgcgt 180 ttcaattttc gtttttgtgc ttttttagtt tacggctagt taagtttatt taaggaatta 240 ggtttatcta agctaagtaa ttcattaaat agctataaat tacgaaacaa agcagcgaaa 300 tatgtcgtcc aactacaagt gttgtgtagc attttgtaaa aataatagat ataatgtaaa 360 gaaagttggt gtcaatgtat gcttccataa attcccagaa gggaaagaaa ccaaacaaaa 420 atggatagcc ttttgccaaa gggaaatatc gtggatacct tcctcaagta atgtggtgtg 480 ttcgcaacat ttcttgccat ctgattacca attgagcagc tcgcacaata ccaagcgtgg 540 ggctaattgg ctaaaccctg aaggtgagac gttatcaaac caaacgatgt tgtttcttta 600 taccgatata aactctattt tagctgtccc atctatcctg ttaccacaag acactggttt 660 gaatttgtcg cagtttaatg aacatcagca gaataataac aatgatgaat cttcccgggc 720 ggaactgcaa ggtggtatgt taatgaattg gcgatgttta tttgatccat tctaaaattt 780 atgaatatga tttcagccac ttcatctcat ctcttgccgg acgaaaatga tttaatattg 840 attggaaaag agaacgtttg caagaaatgt aagggattac aacaaaaaat atttaattta 900 gaagcaaaac ttaatgaatt agaagagcgc aacaagcaac tggtcactat aaataacaaa 960 ttaagtgata cactgaagga agtaaacgaa aaagaaaagg aacatctgaa aaagctagaa 1020 gaacttgata aagctacaaa aaaaattaaa gaagaatggc cacgcgctaa ttttgtgcag 1080 aacatgaaag attcgcttaa gggttcgcta tcatcaaatc aaattgacct tattttagga 1140 attaagaaga ctgttcgatg gactaaagaa gaactttcct tagcatttac tcttagatat 1200 tttagtcaaa gggcataccg ttacataggc gatgatatga agataccagt accggtacca 1260 agaactttac aacgatattc ttccaagatt gatctaaaac aaggcattct ggaagatata 1320 ttgaaattca ttggatcata ttcacaaaca ttaaagccta tggatcgaga atgtgtttta 1380 tcgttcgacg agatgaaggt gtctcgagta ttagaatacg atccatcggc agatgaaata 1440 gtgggcccat ttaattatct gcaagtagtc atgatgcgag gattgtttaa acaatggaaa 1500 cagccgatat ttatcggctt tgataccaaa atgacaaagg atatcattat cgaaataatt 1560 
tcacgattga gcgaaaaaga tataaatgta gtcgctatca tcagtgataa ttgccaaaca 1620 aacattggat gctggaagga gctaggcgcg cgggatgacg tagaaaagcc gtattttcca 1680 catccaaaaa ctaataagaa tgtgtatgtg atccctgata cacctcattt gctgaagttg 1740 ttaaggaatt ggcttctaga tcatggtttt gagtacaacg gccaacttat cgaaaccacc 1800 aatctgttgc gtatggtagc caaaagaatg gagtcagaaa tgactccttt atttaaactc 1860 acaacatccc atatagacat gactccccaa gaacgacaaa atgttcgaag ggcagcagag 1920 ttattgtctc gtacaaccgc tgtagctctc cgtacatatt ttccggacga tgataatgct 1980 aaaattttgg ctaattttat agagaaggtg gatgtgtggt ttagcatatc taattcttac 2040 acacctttcg caaaattgga ttttaaaaaa gcatatactg ctagcgatga ccaagttaga 2100 gctttaacag atatgtatga cataatatca aacatgacta tcccaggtaa aaatggttta 2160 caaatttttc aacgttctat aatgatgcag attaagtcac tgcaaatggt gtttgcagat 2220 atgaagttaa aacacgatgc caaattcatc tgtacccata aggtaagcaa catgataatt 2280 caattgcaat taattcattg aacattcaat accatatgac tatacaggaa aatctcagct 2340 agagctggtg aattcattgt tacaaaatta atcattttaa ctaatacatt tattgttgtg 2400 tttctttttc tttttttttt ttcattttac agttaaacca agacgtatta gagaattttt 2460 tctctcagct taggcaaatt ggtggtgtat atgaccatcc ctcacctcta agttgcattt 2520 atcgaattcg tcttatgatt ttggggaaaa caccaaccat tttgcacaat caaacaacta 2580 ctgttgaggc tgcaaattgt gagcaagatg aattcattac aacaactgag agcaatgttg 2640 ctggtgctag gagtaatgat gttttcatct cagcttctat gtttgaaaag gccgaaatca 2700 ctcccgaatt accggatgtc aaggcaatgg aagaagcaaa caatttcgta tcacaagatg 2760 taagttcact tagttccgtt agtgatacaa gcattgaact acctcaccaa gaagctgatg 2820 gattgtctta catacttgga tatcttgcaa agaagcatca ctctcagttt tcccatctta 2880 atctgggaga acacacattt aagactagga ttgaccataa ctattgccaa cctcctacat 2940 ttgtacacca tttgtcctgt ggtggactga ttgaaccgtc ggatgaattt ttgaatttag 3000 ggaaaaaaat ggaaaaaata ttcttgaaaa tgaatccaga cggaggatta ttaaaaggag 3060 aacgaatcgt agatcgaatc acaaacaaga taaagcgaca tctaacagaa cttccggttg 3120 agataattcg ttcttttgca aaacagcgtg taattgttcg tatgcgctat ttgaacttaa 3180 aagcaacagc tgagcagcta aacaaaatga aacataagaa aagaaaattc gtgacagaaa 3240 atacaaaagc 
tgccaaaaag atgaaaaaaa ttattaatta aagtttatgg taatgtaaga 3300 ggtagcatat agcactcatt aataacgatt aagcatatat ataatacata ggactttatt 3360 ataggaataa aaatgacaat gtgataaaaa tttcatactc ctctaaattt aagttttcat 3420 tgtattgtag tattaagaag agattattaa atggattgtt aaatagaaaa taaaaaagta 3480 caatttgtgc aaaatataga gtaatttgac agtagagatt cgttgtacaa tgtaatcact 3540 cggaccacat catttttagt ttttatcttc cgtcaataac accaaccttg atatagaatg 3600 ttaatacgat gaaccaacaa aaacacccta actagccatt gctgtgtgac agtactttcc 3660 aaactacagg ccgcatgcag tcccgagtgc ctataatgtg ccccacgatg atttggcaat 3720 ttttcatcca acagattgat cattgttcat atagcagagc ggcctaccaa acagaaagat 3780 caaaagaaaa atgttgaaag ttctactaaa tatatgaacg gttaagagcc gcaatctcaa 3840 aaataaagag ctgcatgcga gccgcacttt gccgttcgct ggtctacaag ttacaaagca 3900 tttatagtac atcgttgctt tcgaaaacat aatggtgtaa tttatcatta tctgtatcat 3960 gaaaatagaa tgcaactata atcaaattaa aggaaacgtt attgatgtgt tctattccta 4020 cacgacacct tttcaacgat agtcgctacc gaaattttgt ataaagcgta accacatgtt 4080 ttttttatat gaagtttccc tgaagaaacg tcgttcgacc ggtgcatcct agagattaat 4140 catgggttag gtgttgttga attcaagctt agggtgttga agcaaagaat taatctgtat 4200 gaaggaaggt gaaggatgta gcttgtggag cagtacggtc aatgggatat tctcgaggtg 4260 taggcttgca gtataaattg gctggggatt gtttttaaaa agaactgctg tcccaccgag 4320 caggtgcttg taagatgact ctgatttatc gttgtctggc tgatgtaggt tgtccggcat 4380 tggtcacata acacagtgag ctcgatggtt catggatcaa ctagtcaatc tgtcggatga 4440 tggatgcaga cgagaggtag gaggcagtca tgaaccgacg ttaggtgcta atggagcagg 4500 acgtgctcag cagtgattgg agataaaaca aagagagggt gaaattataa aactttgcca 4560 cagtatgagt agtattgatg gtttgaaagc aacatacaac ttgcattatg ccttaattct 4620 cctatacacc tgattgtgct ctgcgatgtt cgatgtcgat ggaccttgca gaataggtca 4680 acaatttggg gatgtttttc acctggaatc agctcactca ctagaattac ctggagtctt 4740 tcagctgcag aaaaaaaatc gagaaattta ttatgcttac tacatttgtt tcgtatttta 4800 aattctaatc ttcgtcattc gtgaacagca taagctgatc aaaacaaagc cggcgtgatt 4860 tgacagacaa attgtaaaca atataaccaa catggttcca gcaaaaacga agtgtcttaa 4920 cctcagtaga atatataata 
accttg 4946 // ID GYPSY19-I_AG repbase; DNA; ANG; 4176 BP. XX AC . XX DT 05-FEB-2004 (Rel. 9.01, Created) DT 05-FEB-2004 (Rel. 9.01, Last updated, Version 1) XX DE GYPSY19-I_AG is an internal portion of retrotransposon GYPSY19_AG DE - a consensus sequence. XX KW Gypsy; LTR Retrotransposon; Transposable Element; 5-bp TSD; KW AP protease; GYPSY19-I_AG; GYPSY19-LTR_AG; GYPSY19_AG; KW Gypsy clade; RNase-H; gag; integrase; mag lineage; KW reverse transcriptase. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-4176 RA Tubio M.J., Costas C.J. and Naveira F.H.; RT "GYPSY19_AG, a member of the mag lineage of the Ty3/gypsy group RT of LTR retrotransposons in Anopheles gambiae."; RL Repbase Reports 4(1), 3-3 (2004). XX DR [1] (Consensus) XX CC GYPSY19_AG is a family of Gypsy-like LTR retrotransposons CC that, according to the amino acid sequence of its reverse CC transcriptase, is CC phylogenetically grouped with representatives of the mag CC lineage of other organisms. CC GYPSY18_AG, GYPSY20_AG, GYPSY21_AG, GYPSY22_AG, GYPSY23_AG, CC GYPSY24_AG, CC GYPSY25_AG, GYPSY26_AG, GYPSY27_AG and GYPSY28_AG are other CC members of this same lineage in Anopheles gambiae. CC The GYPSY19-I_AG consensus was reconstructed after multiple CC alignment of 5 copies. CC The consensus encodes the 1330-aa GYPSY19_AGp gag-pol like CC protein CC (pos. 97-4086). CC The sequence of the LTRs flanking GYPSY19-I is deposited as CC GYPSY19-LTR_AG. 
XX FH Key Location/Qualifiers FT CDS 97..4086 FT /product="GYPSY19_AGp" FT /translation="MIEEKPTSSGTKQNIAQSEQRTTMAKFDMEPFNKGLM FT QWARWVKRFEGAMSVFEVKSNNKKAMLLHYMGVDSYNLLCDHISPEEPEDK FT TYEQIVKCLDELFDPKPLEMVELWKFRQRLQTEGETVTEFITALQKVAANC FT DFGQYLTKALRNQLVFGVRNPRIRNRLIEERNLTLEKAKQIALAMEAAGDG FT AEVLNSRSAEVLNSRGAEVKEVNIVSNTTKKNVECYRCGESHFAYVCKHKR FT TVCKKCGKIGHLQRVCRTNNRSKSVRLIDEQHQDSEENEEVNSILINNLYQ FT NANHTAKIYITLKVNNTKIQFEVDSGSPFSIISMNDKQRWFKDIPIRESDI FT KLQSYCGGSIKLFGTISVIVENAKTKLTLFVVESKRGPIVGRTWMRDLKFD FT WNELLSKGNSYVNQIVTHSNTNQEVIKKLKEEFEVVFRKSLGEISNIQASL FT ILKENALPIFLKNRTIPFALKESVEKEINDLVNQGILIKVNRSEWATPIVP FT VKKSGNRVRLCGDYKLTVNKNLVIDEFPLPTIEELFANMAGGEKFSKIDLA FT QAYLQMTVKPEHQEFLTLNTHMGLFRPTRLMYGVASAPAIFQREITQILQG FT IPGVSVFLDDVKITAPDDKTHVERLRTVLKRFQDHNMRVNDSKCNFLADCI FT EYCGYRIDKYGIHKMKEKITAIQLMPKPRNKDEVRAFVGLVNYYARFIPNL FT SEKIYPINNLLKNEIPFEWHEGCQDAFEWIKKEMQSERILVHYDPSLPIAL FT AVDASPYGVGAVLSHIYPDGKEYPIQYASQTLSPTQQRYTQVDKEAYAIIF FT GVKKFYRYLYGRKFILITDNKPVSQILSPQKGLPTLSATRMQHYAVFLESF FT NFEIRYRPSKEHGNADGMSRLPIRDIQLEDTEEPDEIELNQIENLPVSVEE FT LSKETSKDINVQLLIDGLNSGRTVSINDRFGIDQTQFSLQKGCLMRGARVY FT VPPQLRNRVLDELHEGHFGISRMKSLARSYCWWKNMDNDIERLSKNCVSCA FT KVRKDPPKVPTHVWKRPQSVFERVHADYAGPFMGIYFLILVDAYSKWPEVK FT ITPDMNTDTTIDKMREIFATFGLPSILVTDRGTQFMSEKFQTFLKSNGITH FT KTGAPYHPATNGQAERYVQTIKDKIKTMQCHKSEIPKKLQNILLAYRKTTH FT PSTGESPSRLMLNRQIRSRLDVMVPKIEKKSNPEVVHKTTRSFAVNERVAA FT RDFLSQTEKWKFGTITKKLGKLHYEIRLDDGRMWKRHINQMRSGPEEISIH FT HSSNQNIEPWIDESVYLPDRMEEFQLPHDSETEMVRSDNINEDLACGEESQ FT TQVRRSTRTRQPPIRYRD" XX SQ Sequence 4176 BP; 1549 A; 739 C; 846 G; 1042 T; 0 other; attggcgacg agtgacaaat tctgagaacc ctggaacgct agcgctacgc agttaaagaa 60 aagctgaaac aatcgggtat aagtggagac agtgtcatga tcgaagagaa accaacaagc 120 agtggtacaa agcagaacat cgcacagagt gaacaacgca ccacaatggc aaaattcgat 180 atggaacctt ttaacaaagg gctcatgcaa tgggcccgct gggtgaaacg cttcgaagga 240 gcaatgtcgg tatttgaggt taaatcgaat aataaaaaag ccatgcttct gcattacatg 300 ggtgttgatt catacaattt attgtgtgat catatttctc cggaagaacc 
agaagacaaa 360 acatacgaac aaatagtgaa gtgtttggat gagctgttcg acccgaaacc cctcgaaatg 420 gtggaactat ggaagtttcg tcaacgactt cagactgaag gcgaaactgt aacggagttc 480 atcacggctt tgcaaaaagt ggcggctaat tgtgatttcg ggcaatattt gacaaaggcg 540 ctcaggaacc agctagtttt tggtgtacgg aatccaagaa tacgcaaccg gctgattgaa 600 gaaagaaatc taacactgga aaaagctaag caaattgctt tagccatgga agccgccgga 660 gatggcgctg aagtgctgaa tagcagaagt gctgaagtgc tgaatagcag aggtgctgaa 720 gtgaaagaag taaacatcgt gagcaacaca acgaagaaaa acgtcgagtg ttatagatgt 780 ggagaatcac attttgcata tgtatgcaag cacaaaagaa ccgtctgcaa aaagtgcggg 840 aaaattggac atttgcagcg cgtctgtcgc acgaacaata gaagcaaaag tgtgaggtta 900 attgatgaac aacatcaaga cagcgaggag aacgaagagg taaactcaat tttgatcaac 960 aatttatacc aaaatgcaaa ccataccgca aaaatataca taacactaaa agtcaataat 1020 actaaaattc aatttgaagt agattcagga tccccgtttt ctattatcag tatgaacgac 1080 aaacaaagat ggtttaaaga tatccctatt agagaatcag atataaaatt acaaagctat 1140 tgtggaggat ctataaagtt atttggtaca attagtgtta tagtggaaaa tgctaaaaca 1200 aaattaacac tatttgtggt agagtctaaa agaggaccta ttgtaggaag aacatggatg 1260 cgagatttga aatttgattg gaacgaatta ttaagtaagg ggaactcgta cgtaaatcaa 1320 attgtcaccc attctaatac taatcaagaa gtgataaaaa aattaaaaga agaattcgag 1380 gtagtctttc gtaagtcatt aggagagatt tcaaatattc aagcatctct tatcctaaaa 1440 gaaaatgcgt tgccaatatt tctaaaaaat cgtactatac catttgcatt aaaagaaagt 1500 gtcgagaaag aaattaacga tttagtaaat caaggaattt taataaaagt caatcgcagt 1560 gaatgggcta caccaatagt acccgtaaaa aaatcaggaa accgtgtacg attgtgtgga 1620 gattataaat taacggtgaa caaaaacttg gtaatagacg agtttccttt gcccacaata 1680 gaggagcttt ttgctaacat ggctggggga gaaaaattct cgaaaataga tttagcgcag 1740 gcatatttac agatgacagt taaacctgaa catcaggaat ttttaacact gaacactcac 1800 atgggactat tcagaccaac acggttaatg tacggggttg cttctgcccc agccatattc 1860 caaagagaaa tcacacagat tctccaagga attccaggcg tttctgtatt tttagatgat 1920 gttaaaataa cagctcctga tgacaagacc catgtggaaa gactgcgcac tgttttaaaa 1980 cgatttcaag accataatat gagagtgaat gatagcaaat gtaatttcct tgcagattgc 2040 
atagaatact gtggatacag gatcgacaag tacggtattc acaaaatgaa ggagaaaatt 2100 actgccattc aactaatgcc aaaaccaagg aacaaagatg aggtaagagc ttttgtaggc 2160 cttgtgaatt attacgcgag atttatccca aatttaagtg aaaagatcta tcccataaat 2220 aatttactta aaaacgaaat tcctttcgaa tggcatgaag gttgccagga cgcatttgaa 2280 tggatcaaaa aggaaatgca atctgaacgg atcttggtac actacgatcc tagcttacca 2340 atagctctag cagtagatgc ttcgccctac ggagttggcg ccgttttaag tcacatctat 2400 ccagatggca aagaatatcc aattcaatac gcatctcaaa cactatcacc aacccaacaa 2460 agatatactc aggtggataa agaagcatat gcgattattt tcggagttaa aaagttttac 2520 cgatatcttt atggaagaaa atttatctta ataaccgata acaagccagt ttctcaaatt 2580 ctatcaccac agaaaggttt gccaacgcta tcagcaactc gcatgcaaca ttacgctgta 2640 tttctagaat cattcaattt tgagattaga tatcgaccat caaaagaaca cggaaatgct 2700 gacggaatgt cacgattgcc aatccgagac attcagctgg aggatacaga agaacccgat 2760 gaaattgaac taaatcaaat agaaaatctt cccgtatcag tagaagaact tagtaaagag 2820 actagcaaag acataaatgt tcaattacta atagacggat taaattcagg gagaacagta 2880 tccattaacg atcgttttgg aatcgaccag acacaatttt ctcttcaaaa gggatgtctt 2940 atgagaggcg ctagggtcta cgttccacct caattacgaa atcgagtctt agacgaactt 3000 catgaaggcc atttcggcat atcacgcatg aaatctcttg ctagatctta ttgttggtgg 3060 aaaaatatgg ataacgacat agaaagatta tcaaaaaact gtgtttcttg tgcaaaagta 3120 agaaaagatc ctcctaaagt accaactcat gtatggaagc gtccacaatc agtttttgaa 3180 agagtacatg cggattacgc tgggccattt atgggtatat attttcttat tctagtagat 3240 gcgtacagca aatggcctga agtaaaaata acaccggata tgaatacaga cactactata 3300 gacaaaatgc gagaaatttt tgccactttt ggcttgccat caattttagt gaccgataga 3360 ggtacacaat tcatgtcaga aaaattccaa acattcttaa aatctaacgg aataacccat 3420 aaaacaggag ctccttacca tcccgcaaca aatggtcagg cagaaagata tgtacaaaca 3480 attaaagata aaattaaaac tatgcaatgc cataaatctg aaattccaaa aaagttacaa 3540 aacatactgc tagcgtacag gaaaacaact catccaagca caggggaaag tccctcgcgg 3600 ttaatgttaa acagacagat acggtctcgg cttgatgtta tggtaccaaa aatcgaaaag 3660 aaatccaatc ccgaagtagt acacaaaacg acaagatcat ttgcagtaaa cgaaagagta 3720 gcagcaagag 
atttcctttc ccagaccgaa aagtggaaat ttggaacgat tacaaagaag 3780 ctaggaaaac tgcattacga aatacgatta gacgatggaa ggatgtggaa aaggcacata 3840 aaccaaatga gatctggtcc tgaagaaata tcaatccacc atagtagtaa ccaaaacata 3900 gaaccttgga tagatgaatc agtttatctg ccagatagga tggaagagtt tcaactccct 3960 catgattcag aaacggagat ggttagatct gataatatca atgaagattt agcttgtgga 4020 gaagaatctc aaactcaagt taggagatca acaagaactc gacaaccacc aattcgatac 4080 cgtgattagg gattattact ttaatttaga atataaatat gaattgtctt tcacacttta 4140 gcaagtaatc actttatatt gttaggagga gagagt 4176 // ID GYPSY2-I_AG repbase; DNA; ANG; 5178 BP. XX AC . XX DT 08-MAY-2003 (Rel. 8.04, Created) DT 20-SEP-2005 (Rel. 10.1, Last updated, Version 3) XX DE GYPSY2-I_AG is an internal portion of the GYPSY2_AG LTR DE retrotransposon - a consensus sequence. XX KW Gypsy; LTR Retrotransposon; Transposable Element; 4-bp TSD; KW AP protease; GYPSY2-I_AG; GYPSY2-LTR_AG; GYPSY2_AG; Gypsy clade; KW gag; integrase; reverse transcriptase; GYPSY51-I_AG; KW GYPSY51-LTR_AG. XX NM GYPSY2-I_AG. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-5178 RA Kapitonov V.V., Pavlicek A. and Jurka J.; RT "GYPSY2_AG, a family of LTR retrotransposons from African malaria RT mosquito."; RL Repbase Reports 3(4), 75-75 (2003). XX DR [1] (Consensus) XX CC GYPSY2_AG is a family of Gypsy-like LTR retrotransposons. CC GYPSY2-I_AG, an internal portion of GYPSY2_AG, is flanked by CC GYPSY2-LTR_AG LTRs. The GYPSY2-I_AG consensus sequence was CC reconstructed based on multiple alignment of 20 copies; they are CC ~1% divergent from the consensus sequence. CC The consensus sequence encodes the 1408-aa GYPSY2_AGp polyprotein CC (predicted by FGENESH), composed of the putative gag-like CC (pos. 1-300), AP protease (pos. 325-410), reverse transcriptase CC (pos. 514-681), and integrase (pos. 1030-1200) domains. 
XX FH Key Location/Qualifiers FT CDS join(846..1790,1875..5153) FT /product="GYPSY2_AGp" FT /note="composed of the putative gag-like (pos. FT 1-300), AP protease (pos. 325-410), reverse FT transcriptase (pos. 514-681), and integrase (pos. FT 1030-1200) domains." FT /translation="MLHSPPVRDVSTPDGVTPSADPAASGSKSPHVPTPPV FT PNTPRVPGPSACDAMFMPPESQIDTLNAMQLKPPEMDTTDIQTFFFALENW FT FDAWNITTNQHIRRFNILRTRIPLRVLPELRPLLENIRQYATDRYEVAKRA FT IIEHFEESQRSRLHRLLAEMNLGDRKPSQLLAEMRRAANGAMTDSMLVDLW FT IGRLPPYVQSAVIATNTDTNDRAKVADSVMDSFALYHRTGPYQTIHEVRNE FT DFERLSRHVTELGQRLDAVLSKLNERERARPRSRTRQRQPNQDAVTPSGHC FT YYHTQYGQAARNCRAPCSFNNRRYRLVITDPKTNIKFLIDTGADVSVIPRQ FT HSSVPSKPSTMKLFAANSTPIQVYGESLYTLDLGLRRSFLWNFIIADVGTA FT IIGADFLQHFHLLVDLRKKCLVDALTNVRSTGVPSQNPSEPTVKVCDSTSP FT IATLLKEFPGLTALSTPGTLLQSEVTHRIETTGQPTFARPRRLPPEKYAAA FT RKEFESLVQLGVCRPSNSSWASPLHMTKKADGTWRPCGDYRALNAKTVPDR FT YPLPFLQDFTMHLQDKIIFSKVDLHKAYHQIPIHPDDIAKTAITTPFGLYE FT FTTMPFGLRNAAQTFQRLIHDVLRGLEFVFPYIDDMIVASTSEAEHHEHLR FT QLFERLEKHQLAINPAKCEFYRNEISFLGHLVNASGIRPLPDRVQAISELP FT QPTTIMELKKFLAMINYYRRFLPHALETQGILLEMTPGNKKKDRTPLTWSL FT EASEAFAQCKEQLKRATLLAHPVKNAELSLWTDASDFAAGAVLHQRTNEDL FT QPLGFFSKRLEKAQQKYSTYDRELTAIYLAIRHFRYQLEGREFCIYTDHKP FT LTFAFRQTHDNASPRRARQLDFIGQFSTDIRHIAGKDNVTADLLSRIETVH FT ATPTIDYERLAEEQERDPELSDILSGKIQTDLFLQKTPIPGSPKSLYADCP FT GGIIRPYITRSFRTQLLHAVHDLSHPGARATARLITERFVWLNARKESQDF FT ARNCLACQRAKVGRHVKSPLIPYPATTARFSHINVDIIGPFPISNGNRYCL FT TIIDRFTRWPEAIPISDITASTVVSALLFHWIARFGVPAHVTTDQGRQFES FT SLFKELTKALGTKHIRTTAYHPQANGIIERWHRTLKAAITCKDTARWSEHL FT PLILLGLRTTFKNDINASPAELVYGTTLTIPAEFFIAKPQNALADQSDFAK FT TLEETMSSIRPQSTAWHTNRTPFVHSDLNKCTHVFIRDDTVRPALTTPYHG FT PYKVLTRNPKSFQILLRGQPTLVSIDRLKPAYGAEEEATPAPQCSWEGLTT FT NLLPPTTDHSETLPLPDVQANSDRRDATAASKPTSREQPVRNQTTPAPPSH FT PTTSRQTDRAAVDAPPPSILRRNDQTVSTGVTRSQRKVIIPLRYR" XX SQ Sequence 5178 BP; 1352 A; 1631 C; 1174 G; 1021 T; 0 other; actggtgacc ccgacgtgat cgcgtgcgcg agtgagtgag tggtaacctg acgaacaccg 60 tgtccagccg agaaaaaacg tgtttccatt gttccacggt ccggaccgac ggcaacgttc 
120 ccccccatca tcgaggagcg gccgaccacg aaggaggcac cacgcaagcg cagccagcga 180 aaaaaaaccc cgtgcacaaa ccccgaaccc acgtgagtgc aaatcgacac cgaaggtggc 240 cgacagtgag gaacactgtt caggaacatt tttacccgac ggagcgaccg atcctagcgg 300 aaaagtttcc tctcggtgct gagcgatcgc cgaacatttt gctgacacac cccgcgccgt 360 gtcgcacacc cgccgagcat tttggtaccc gtacgtgttt gcgcacccgc cgatcataac 420 ctcacacgta ccgccgagcg cgctccagac ccacgcggtt tttgtgtgtg caccgtgtgt 480 gtgtgtgtgt gtgtgggtga atgtgcgcag gccgacgccg agcggattgc gtcagaattt 540 tgctcgagct acgttcgtca tttttttcga ccgtgcaccg aagacgtcgt cagcgcacgc 600 agccatcgtt ctcttctcgc cgacaccacc gaccgaacgc caccgaagat catcgcccct 660 cgtttctcac accaccggcg tcatcgacga acgcagccaa cgagcgacta atcctaacac 720 gatcgaccgc gtgtgcggat ttttcgtcgc cgaaggatcg acctagccaa cctccagctg 780 gacttgcttg cgcccccgcc actaaggtaa gatccaccct tttttaacta accttagtcg 840 taaggatgtt gcacagtccg ccggtccgcg acgtatcgac tcccgatggc gtaaccccga 900 gtgccgatcc agccgcgagt ggatccaaat cgcctcacgt accaacaccg cccgttccga 960 ataccccgcg cgtaccaggg ccgtccgcct gcgacgccat gtttatgccg cccgaatcgc 1020 agattgacac tttgaatgcc atgcagctga aaccaccgga gatggacacc actgacattc 1080 aaaccttttt cttcgcattg gaaaactggt tcgatgcgtg gaatatcacc acgaaccaac 1140 atattcgccg ttttaacatt cttagaacgc gtataccgct tcgtgtcctt cctgagcttc 1200 gccccctgtt ggagaacatt cgacagtacg ctacggaccg ttacgaggta gcaaagcgtg 1260 caataattga gcactttgaa gagtcgcaac gaagccgctt gcatcgtctg cttgccgaaa 1320 tgaacctcgg ggaccgaaaa ccatcgcagc tattagcgga gatgcgccgc gccgcaaatg 1380 gagcaatgac ggactctatg ctggtagatt tgtggatcgg ccgtctcccg ccatacgtcc 1440 agtccgccgt tattgccact aacacggata ccaacgatcg agctaaagta gcagactctg 1500 ttatggattc gttcgcgtta taccaccgaa cgggcccgta ccaaaccatc cacgaagtac 1560 gcaacgagga cttcgaacgt ctttctcggc acgtaacgga attaggtcag cgcttggacg 1620 ccgtactgag caagctcaac gaacgagaac gcgcgcgacc acgctcacgt acccggcaac 1680 gtcaaccgaa ccaggatgcg gtaacaccca gcggacactg ctattaccac acgcagtacg 1740 ggcaagcagc gcggaactgt cgtgccccct gctccttcaa caatcggcgg cagggtagta 1800 actcggccac tgcttccgat 
tgacgcttaa ccagaggcca acctcaacag atacacgtac 1860 tttcgaccca tagctatcgt ctcgtaataa ccgatccaaa aactaacatc aaattcttaa 1920 tcgataccgg tgcagacgtt tcagtaatcc ctcgacaaca cagttccgtc ccgagtaaac 1980 cctccaccat gaagctgttc gccgctaatt ctacaccaat ccaggtttac ggagagtcgc 2040 tctatactct cgatttggga cttcgccgat ctttcctttg gaacttcatc atcgcagacg 2100 tggggacagc gattattgga gccgattttc tccaacattt ccatctgctc gtggacttgc 2160 gcaaaaaatg tcttgtcgac gccttaacga acgtacgttc taccggagtg ccgagccaaa 2220 acccgtcgga accaaccgta aaagtatgtg attccacctc accgatcgcc actctcctaa 2280 aggaatttcc cgggttaact gcactatcca ctcctggcac cttactgcag tccgaagtga 2340 cgcaccgaat cgaaacgacg gggcaaccaa cattcgcaag acctcgccga ttaccacccg 2400 aaaagtacgc agctgcccgc aaagagttcg aatcactcgt ccagctcgga gtgtgccgcc 2460 cctcgaatag cagctgggcc agcccgctac atatgacaaa aaaggccgac ggcacctggc 2520 gcccttgtgg tgattaccgc gccctaaatg caaaaaccgt acccgaccgt tatccactac 2580 cgtttttaca ggacttcacg atgcatttgc aagacaagat catattttcc aaggtcgatt 2640 tgcacaaagc ataccaccag ataccaattc atccggatga tatagcgaag acagccatca 2700 cgacaccctt tggactttac gagttcacta ccatgccttt cggattgagg aacgcagcgc 2760 aaacattcca acgccttatc catgatgtcc tacgaggact cgagtttgtt ttcccgtata 2820 tcgacgatat gatcgtagca tcaacgtccg aggcagaaca ccacgaacac ttacgccaac 2880 ttttcgaacg attggagaag caccaactag ccatcaatcc agccaagtgc gagttctacc 2940 ggaacgagat ttcctttctg ggccatctgg tcaacgcttc tggtattcgt cctctccccg 3000 atcgagtcca agccatcagc gagctgccac agccaacgac gattatggag ttgaagaagt 3060 tcctcgccat gataaactac taccgacgtt ttctgccgca cgccctggaa acgcaaggta 3120 tacttctcga gatgactcca ggtaacaaaa agaaggacag aacgccatta acctggtcgc 3180 tagaagcttc cgaagcattc gcccaatgca aagagcaact gaaacgtgca acgttattgg 3240 cacatcccgt gaagaacgcc gaactttctc tatggaccga cgcttcagat ttcgcagccg 3300 gagccgtact tcaccaacgc accaacgaag acctgcaacc actaggcttc ttctcgaaac 3360 gtctcgaaaa ggcacagcaa aagtactcga cctatgaccg agaacttacc gccatctatc 3420 tcgccatacg acacttccga taccagctag agggtcggga attctgtatt tatacagacc 3480 acaagcctct aaccttcgcc ttccgacaaa 
cgcacgacaa tgcctcacct cgacgagccc 3540 ggcagttaga cttcattggc cagttttcca ccgacatccg tcacatcgcc ggaaaagaca 3600 acgttacagc cgatctgctc tcccgcatag agacagtgca cgcgacaccg accatcgatt 3660 atgagcgatt agcagaagaa caagagcgcg accctgaact ttccgacatt ctcagtggga 3720 aaattcagac ggacttgttc ctgcagaaga caccaatacc gggaagcccc aagtcactct 3780 acgccgactg ccctggaggt atcatcagac cgtacatcac ccgatcgttt cgaacacaac 3840 ttctccacgc cgtacatgat ctcagtcatc ccggagcccg cgccacagct agactaataa 3900 cagagcgttt cgtgtggctc aatgcaagga aggaatccca ggacttcgct cggaactgct 3960 tagcctgcca gcgcgctaag gtaggaaggc acgtcaaaag ccccttgata ccgtaccctg 4020 caacaacagc gaggttcagt catatcaacg tagacatcat tggaccattt cccatcagta 4080 acggtaaccg atactgcctt acgataatcg accgatttac tcgctggcca gaagcaatac 4140 cgatctcgga tatcaccgca tctaccgtcg tatcagcact actattccac tggatcgccc 4200 gattcggagt tccggcgcac gtaacaacgg accaagggag acaattcgaa tcctccttgt 4260 tcaaagagtt gacgaaagcc ctaggaacga aacacatccg tacgacagcc tatcacccgc 4320 aggcaaatgg aataatcgag aggtggcacc gcactcttaa agcagcaatc acctgcaaag 4380 acaccgcaag atggagcgaa cacctaccgc taatactgct tgggctacga accacgttca 4440 aaaatgacat caacgcctcg ccagccgaac ttgtgtatgg aacgacgttg accatcccgg 4500 cagaattctt catcgcgaaa ccgcaaaatg ccctcgccga ccaatccgac ttcgccaaaa 4560 cgttagagga gacgatgagc agcattcgac cacagagcac cgcttggcat accaaccgca 4620 caccgttcgt gcattccgat ctgaacaagt gtactcacgt gttcatacgc gacgacaccg 4680 tccgacctgc actaactaca ccttaccacg gtccatataa ggttcttaca cgcaatccta 4740 agtcttttca gatactccta cgtggacagc caacgctggt ttcgatcgac cgcttaaaac 4800 cagcgtatgg cgcagaagag gaagccaccc cggccccgca gtgctcgtgg gaagggctaa 4860 cgacaaacct gctgccgcca acaaccgacc actcggaaac tctgccgtta ccggacgtcc 4920 aggcaaattc ggaccgcaga gacgccaccg cagcctccaa accgacgtcg cgcgaacaac 4980 cagtgcgtaa tcagacgaca cccgcaccac catcgcaccc gacgacatcg agacaaaccg 5040 accgagccgc cgtcgacgcc ccaccaccct ccatcctacg ccgcaacgac cagacggtat 5100 cgaccggcgt caccaggtct cagcggaagg tcatcatacc tctacgttac cggtgacacc 5160 gctctaggag gggagtac 5178 // ID 
GYPSY54-LTR_AG repbase; DNA; ANG; 311 BP. XX AC . XX DT 16-APR-2004 (Rel. 9.03, Created) DT 16-APR-2004 (Rel. 9.03, Last updated, Version 1) XX DE GYPSY54-LTR_AG is an LTR of retrotransposon GYPSY54_AG - a DE consensus sequence. XX KW Gypsy; LTR Retrotransposon; Transposable Element; KW 4-bp TSD GYPSY54_AG; GYPSY54-I_AG; GYPSY54-LTR_AG; Gypsy clade; KW mdg1 lineage. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-311 RA Tubio M.J., Costas C.J. and Naveira F.H.; RT "GYPSY54_AG, a member of the Mdg1 lineage of the Ty3/gypsy group RT of LTR retrotransposons in Anopheles gambiae."; RL Repbase Reports 4(3), 101-101 (2004). XX DR [1] (Consensus) XX CC GYPSY54-LTR is a long terminal repeat of GYPSY54_AG (its internal CC portion is deposited as GYPSY54-I_AG). XX SQ Sequence 311 BP; 98 A; 46 C; 84 G; 83 T; 0 other; tgtagcgacc ttagccgcaa caagttctct acacaaccgg ctacgtgtag atggagctga 60 aggcggcata catggttcgg gcggctgaca ctcagcgaga acaatttatg agaagagagt 120 gagaagcgat cgagacgagc gaagcgatgg cgaaaaaagg ccattctggt gtcggtgatc 180 gcgggatacg gtgttaaagt agcgggatta taaaaataaa ttaaattagt tatagttcaa 240 ataaatgaaa agttagtttt atttagtagt tagaaatagt ttttatcggt catttagtgc 300 tagttgtttc a 311 // ID PEGASUS repbase; DNA; ANG; 534 BP. XX AC U47019; XX DT 21-AUG-1997 (Rel. 2.07, Created) DT 21-AUG-1997 (Rel. 2.07, Last updated, Version 1) XX DE Putative HAT-like non-autonomous DNA transposon. XX KW hAT; DNA transposon; Transposable Element; Nonautonomous; KW hAT superfamily; Pegasus; TIR; nonautonomous DNA transposon. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-534 RA Besansky J.N., Bedell A.J., Benedict Q.M., Mukabayire O., RA Hilfiker D. 
and Collins H.F.; RT "Cloning and characterization of the white gene from Anopheles RT gambiae."; RL Insect Mol. Biol 4, 217-231 (1995). XX RN [2] RP 1-534 RA Besansky J.N., Mukabayire O., Bedell A.J. and Lusz H.; RT "Pegasus, a small inverted repeat transposable element found in RT the white gene of Anopheles gambiae."; RL Genetica 98, 119-129 (1996). XX RN [3] RP 1-534 RA Besansky J.N.; RT "PEGASUS."; RL Direct Submission to Genbank (24-JAN-1996)Nora J. Besansky, RL Centers for Disease Control, Division of Parasitic Diseases, 4770 RL Buford Highway, Chamblee, GA 30341, USA. XX DR GenBank; U47019; Positions 78 611. XX CC 8 bp target-site duplication; 8 bp TIRs. XX SQ Sequence 534 BP; 166 A; 98 C; 92 G; 178 T; 0 other; cagtgttgcg aacggtgata gtcgtcgaca taaaattttt caaaatgata ctttttgaag 60 catcgcattc tagatccacg cattcattcg acaatcacgg aaaaaaatat catttgacat 120 gatgatattt tttctgctac aatcatttga ctttttgatt ttgccgtacc tcaagtgaac 180 tagatcttca tcgaaaatgg gtcctttttt cgtgttttcg cgtgaatttc tatactcctg 240 taagagaata ccattttcct ggttcactga gtgaaaatat caagtggtct aatggcttag 300 tatcggtaaa cgcacgatta agcagttctg gattctcact gaaaaatgtc aaatggcatt 360 cattctacat tttttgcaat catgtcatcg caaaagcatt gattgttaat gcacaaatga 420 ttgtcgcttt tggaaagtcg tgtcatggta accatgaata tcatgatcaa aaaatgattt 480 tgacacccgc ttactgcgaa tatcataaaa aaatgtcgac aattaacaac actg 534 // ID Ag-CR1-7 repbase; DNA; ANG; 4332 BP. XX AC . XX DT 29-OCT-2010 (Rel. 15.1, Created) DT 29-OCT-2010 (Rel. 15.1, Last updated, Version 3) XX DE A CR1 clade non-LTR retrotransposon family from Anopheles DE gambiae. XX KW CR1; Non-LTR Retrotransposon; Transposable Element; Ag-CR1-7. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-4332 RA Biedler J. 
and Tu Z.; RT "Non-LTR Retrotransposons in the African Malaria Mosquito, RT Anopheles gambiae: Unprecedented Diversity and Evidence of Recent RT Activity."; RL Molecular Biology and Evolution 20(11), 1811-1825 (2003). XX RN [2] RP 1-4332 RA Kojima K.K. and Jurka J.; RT "CR1 clade non-LTR retrotransposons from African malaria RT mosquito."; RL Direct Submission to Repbase Update (29-OCT-2010). XX DR [2] (Consensus) XX CC [2] Consensus update. >99% identical to consensus. XX FH Key Location/Qualifiers FT CDS 233..1228 FT /product="Ag-CR1-7_1p" FT /translation="MADHCLTCARASSSEELTYTCDGFCKRWAHRSCLGVS FT SAAVKDLIKEDSQILWLCHNCSENRRNGHSVMIAELTAALTDLKSQLSAEF FT QTRMDAAAESLKRTMLSSLDQHADTRPASHTSTFTIQRKTSTSKAPVLSST FT PATELSNPAKRRLMDRSPPSVVATSPLLNGTAPIATSAHATLLLPPETPKF FT WLYLSRCRPTATILEVETFVKEQIGIEDVTVFKLVPLNRDVRTLTFSSFKV FT GLSPDYKEKALLCSSWPIGTVFKEFTDRRNAPISTLSTNPLTTTTTAPTNT FT HVPVTTNTDVSIHMETNNNITASTNNTAVSTSPPRSSQQQ" FT CDS 1135..4143 FT /product="Ag-CR1-7_2p" FT /note="apurinic-like endonuclease and reverse FT transcriptase." 
FT /translation="CIYSYGNKQQHYCLHQQHCSLDVTSSIITAAIATLLT FT HSHSKSPLKFYYQNTRGLKTKLSDLRISLEVASYDVIILTETWLDDSIPSS FT LFCDRKYTVYRCDRSSANSTCSRGGGVLIACSAKLLSLHIEVSQCSLELLW FT IQIKLNGHSLFVGGVYLPPNLSSNTDTLDSFFNTITLIQSRMKDRDLFVLC FT GDFNQPGLSWSQNEDYFAPLAVNPASSYFIDGLAEFNLRQLSGVVNIFRRQ FT LDLVFVNELVTYSCSPVSCSIDPIVKLDHYHPPLEFVVNVPLPHNGNNVRD FT NKSELNFRKMNLEKFREIITNSDWNFLEIRTDSPSTVDVAVEQFSNIVKAA FT LPLCCPRSKPSSRLPWYDATLRSLKADRTRALKKLGSTPSEYNRRVFKYAA FT SAFRIYNRASYRLYLGRLQCMLYRNPKLFWSYAKKRRNPSSLPTSMSLGSV FT TTDNAADTCSLFAEHFANGLSRPESQPLTESEGIWPGLNSVAWAEGIVNVN FT TVSEALKKLKPSFSPGPDGIPASVLKKSPFLFVPLLVKLFNLSLSSSSFPL FT LWKRAWLVPIHKKGNPSVISNYRGISIQCAITKVFEHIIHTYVFKLTSSTI FT IPQQHGFFPNRSTTTNLMSFTSFVSSQLESVKQVDTIYTDFKSAFDRIPHS FT LLLNKISQLDVNNLFVSWLRSFLCDRSYSVKFNDMFSTPFSCSAGVPQGSV FT LSPLLFIIFINDVRTTLPPECFLCYADDLKIFLPVSSPRDCISLQNILDRF FT SSWCSSNSLTLCPDKCNVISFSRSQSPITYSYVLNSVQVNRVCVVKDLGVL FT LDRGLTFSHHIDSVVNQARKTLGLLKKIACDFSDPMCLKTLFCSLVRSILE FT YCSVVWSPTARTHVERIERVQRSFTRFAICKILGRLSSPIPPYEDRCRQLG FT LELLENRRSHAQSSFIAALLLGNIDSPSLLSSIPFYAPNRVLRNRPPLVIP FT SRRTAVGRNDPLLRAIRRFNCVYSMFDFNLPLSSFRSRIRTLHNPVLP" XX SQ Sequence 4332 BP; 1077 A; 1098 C; 781 G; 1376 T; 0 other; tgaacattcc ttcgcttaca ctcacactga tagcacgtac tacatttcgc tccgctttat 60 ttgcgttttg tgcacttatt ttcgcaaatc ccgatctggt tctatcattg cactgtgttc 120 tgattatcag ttgtgcaagc ctcaataaac tgttcatata ctcaagtgct cgaagcctgt 180 ataagtgata taaagtgtat actccttggc gcggtgagcc tgttgaatcg tcatggctga 240 tcactgtttg acctgcgccc gtgcttcgtc atccgaggaa cttacttaca cttgcgacgg 300 cttctgcaag cgatgggctc atcgttcgtg tcttggtgtg agtagtgccg cagtcaagga 360 tctcatcaaa gaagattctc aaattctgtg gctttgccac aattgttcgg agaatcgccg 420 caacggacac tcagtgatga ttgctgagct aacggcggca ctcaccgatc tcaaatctca 480 gctgtcagct gagttccaaa cacgtatgga tgctgccgcg gaatcgctaa aacgcacgat 540 gctatcctca ctcgatcaac acgcagatac tcgtcctgct tcacacactt ctacattcac 600 cattcaacgc aagacatcga catccaaagc accagttctt tcgtcaacac cagccaccga 660 acttagcaac ccggcaaaac gtcgtttaat ggaccgttct ccgccgtccg ttgtcgccac 
720 ctctccgctt ctaaatggca ccgcacccat cgccacttca gctcatgcca cattgctcct 780 accaccagaa acgcctaaat tctggttgta cctgtcgcgc tgccgcccga cagccacaat 840 cttagaagtg gagactttcg tcaaggagca aatcggcatc gaggatgtca ccgtttttaa 900 gctggtgcca cttaaccgcg atgtccgcac tctgactttc tcttcgttta aagtcggttt 960 gtcgccggat tataaggaga aagcattatt atgctcatcg tggccgatcg gaactgtctt 1020 caaagagttc acggatcgta ggaatgctcc tatttctaca ttgtcgacaa atcccctgac 1080 cactaccact acagcaccaa caaatacaca cgtacctgtc accacaaaca ctgatgtatc 1140 tattcatatg gaaacaaaca acaacattac tgcctccacc aacaacactg cagtctcgac 1200 gtcacctcct cgatcatcac agcagcaata gcaacactac ttacacactc gcactctaaa 1260 tctccgttaa aattttacta tcaaaacact cgcggtctca aaaccaaact ttcggatctt 1320 cgcatctctc tcgaagttgc gagctatgat gtcataatac taactgagac atggctggac 1380 gattcgattc cttcatccct cttctgtgac cgaaagtata ccgtctatcg ttgtgatcgt 1440 tcaagcgcta acagtacctg ctctcgcggc ggtggtgtac tcattgcctg ttccgctaag 1500 ctcttgtcgc tgcacattga agtatcccaa tgctcgttag aacttttatg gattcaaatt 1560 aaactgaacg gtcactcact ctttgttggt ggtgtgtatt tacctccaaa cctcagctcg 1620 aatactgaca cgttagactc attcttcaat acaattacat taatacaatc tcgcatgaaa 1680 gatagggatt tgtttgtctt atgcggtgat ttcaatcaac ctggactatc gtggtctcaa 1740 aatgaggatt atttcgcacc gcttgccgtt aaccctgcct catcctactt tatagacgga 1800 ttagctgaat tcaatctgcg tcaactttca ggggtcgtaa atatttttcg gaggcaacta 1860 gatttagttt ttgttaatga gttagtgaca tattcatgtt caccagtttc ttgtagtatt 1920 gaccctattg ttaaattaga tcattatcac ccacctctgg aattcgtagt aaatgttcct 1980 ttaccgcata atggtaataa tgttagagat aacaaatctg agttgaattt tcggaaaatg 2040 aacttagaaa aatttaggga aatcataacc aactctgatt ggaatttttt agaaattaga 2100 accgattctc cctccactgt ggacgtagcg gtggagcaat ttagtaatat tgttaaggca 2160 gctctccctt tatgttgtcc tcgttctaaa ccctcctctc gccttccttg gtatgatgct 2220 acattgcggt cattaaaggc ggataggaca cgtgccctca aaaaactcgg ttcaactcct 2280 tctgagtata atcgtagagt tttcaaatat gccgcttctg cttttcgtat ttataatcgg 2340 gccagttata ggttgtatct tggaaggctt caatgcatgt tataccgtaa tcctaaactt 2400 ttttggtctt 
atgccaagaa gcgtcgtaat ccctcttccc ttcctacttc catgtcgctc 2460 ggttcggtca ctacggacaa tgctgctgac acctgttccc tattcgctga gcatttcgcc 2520 aatggtctgt ctagaccgga gagtcaacct ttgacggagt ctgagggcat atggcctggt 2580 ttaaactccg tcgcatgggc tgagggtatt gtcaatgtta atacagtctc tgaagctctt 2640 aaaaaactta agccatcctt ctcccctgga ccggatggaa taccagcctc ggttttaaaa 2700 aagtcacctt ttctattcgt ccctctattg gtaaaattgt ttaacctatc tttgtcctcg 2760 tcttcattcc ctttattgtg gaagcgagca tggctagttc ccattcacaa aaaaggtaac 2820 ccgtctgtca tttccaatta ccgtggcatc tcaatacaat gcgcaataac caaggtattt 2880 gagcatatta tccataccta tgtgtttaaa ctcacctctt ctactattat ccctcagcaa 2940 catggctttt ttcctaaccg ttctacaaca actaacctta tgtccttcac ctctttcgta 3000 tcatcccagt tagagtcagt taaacaagtt gatactattt atactgactt caaatctgca 3060 tttgatcgta ttcctcattc cttacttcta aataaaattt cacaacttga cgtcaataat 3120 ctttttgtat cttggttacg ttcttttcta tgtgatcgtt cctacagtgt taaatttaat 3180 gatatgtttt cgactccctt ttcatgtagt gcaggcgtac cgcaaggtag tgttctaagt 3240 cctttgctgt tcataatttt tattaatgat gtccgtacca ccctccctcc tgagtgtttc 3300 ctctgctacg cagacgacct caaaatcttt ctcccagttt cgtctcctag agattgcatt 3360 tccctacaaa atattcttga tcgtttttcg tcttggtgtt ctagtaattc cctcacatta 3420 tgtcctgata agtgtaacgt catttcgttt agtcgttcgc agtctccaat cacttactcg 3480 tacgtcctaa attctgtaca agtcaatcgt gtttgtgtcg taaaagacct cggtgtcttg 3540 cttgacagag gcctcacctt cagccaccac attgactcag ttgtcaatca ggcgcggaaa 3600 acattgggtc tcctaaagaa aattgcgtgc gatttctccg acccaatgtg cctgaagact 3660 ctgttctgct ctctggtcag atcgattttg gagtactgct cagtagtctg gtcccctaca 3720 gctcgaaccc acgtggaacg tatcgagcgg gtgcagcgat cgttcaccag gtttgctata 3780 tgtaaaatcc tgggtagatt gtcctccccc atacctcctt acgaggacag gtgtcggcag 3840 cttggcctgg aactattgga aaaccgccgt tcccatgccc aatcctcctt tattgccgca 3900 ttactccttg gtaatattga ttctccttcc cttctttctt ctatcccttt ctacgcccct 3960 aaccgtgttc ttcgaaaccg ccctccccta gtgatccctt cccgccgaac tgcagtgggc 4020 cgtaatgacc cccttcttcg tgcaattcgt cgatttaact gtgtatattc tatgtttgat 4080 ttcaaccttc ccttatcctc 
tttccgatcc cgcatcagaa ccctccataa tcccgtgctg 4140 ccttaatttt taatttttaa tttttaattt ttaaatttta gattctaatt ttctaaaaaa 4200 aaattttttt ctcaaatttt tgataatttg tgttaatttt tgttaggtta ggaacatagg 4260 taataagtat tatttaagca tttgtgtaac caaaaggtag acaaataaat atgaatatga 4320 atatgaatat ga 4332 // ID GYPSY63-LTR_AG repbase; DNA; ANG; 144 BP. XX AC . XX DT 09-JUL-2004 (Rel. 9.06, Created) DT 09-JUL-2004 (Rel. 9.06, Last updated, Version 1) XX DE GYPSY63-LTR_AG is an LTR of retrotransposon GYPSY63_AG - a DE consensus sequence. XX KW Gypsy; LTR Retrotransposon; Transposable Element; KW 5-bp TSD GYPSY63_AG; GYPSY63-I_AG; GYPSY63-LTR_AG; Gypsy clade; KW mag lineage. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-144 RA Tubio M.J., Costas C.J. and Naveira F.H.; RT "GYPSY63_AG, a member of the mag lineage of the Ty3/gypsy group RT of LTR retrotransposons in Anopheles gambiae."; RL Repbase Reports 4(6), 166-166 (2004). XX DR [1] (Consensus) XX CC GYPSY63-LTR is a long terminal repeat of GYPSY63_AG (its CC internal portion is deposited as GYPSY63-I_AG). XX SQ Sequence 144 BP; 54 A; 22 C; 23 G; 45 T; 0 other; tgttggatat cacagtacac atgtacaacc atatagaata gttattgtac gatcgtagta 60 tacatttata accttataga atagtcagtt ggaataaagc tattaatctg aacacatcgt 120 agtctggcaa ctaagtatat aaca 144 // ID GYPSY22-LTR_AG repbase; DNA; ANG; 172 BP. XX AC . XX DT 05-FEB-2004 (Rel. 9.01, Created) DT 05-FEB-2004 (Rel. 9.01, Last updated, Version 1) XX DE GYPSY22-LTR_AG is an LTR of retrotransposon GYPSY22_AG - a DE consensus sequence. XX KW Gypsy; LTR Retrotransposon; Transposable Element; 5-bp TSD; KW GYPSY22-I_AG; GYPSY22-LTR_AG; GYPSY22_AG; Gypsy clade; KW mag lineage. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. 
XX RN [1] RP 1-172 RA Tubio M.J., Costas C.J. and Naveira F.H.; RT "GYPSY22_AG, a member of the mag lineage of the Ty3/gypsy group RT of LTR retrotransposons in Anopheles gambiae."; RL Repbase Reports 4(1), 10-10 (2004). XX DR [1] (Consensus) XX CC GYPSY22-LTR_AG is a long terminal repeat of GYPSY22_AG (its CC internal CC portion is deposited as GYPSY22-I_AG). XX SQ Sequence 172 BP; 64 A; 36 C; 47 G; 25 T; 0 other; tgtcataacc atcgaaatgt caacatcacc aaatggacca acctggtagc ggttggctga 60 cagcggctgt caacaggggg aacgggaaaa aagtgcgcca ctggagcagc gtcgggagaa 120 gcgtacaaac ggaaaagata cagaagtaca aatatacgag tgaaatacaa ca 172 // ID BEL7-LTR_AG repbase; DNA; ANG; 503 BP. XX AC . XX DT 03-APR-2003 (Rel. 8.03, Created) DT 03-APR-2003 (Rel. 8.03, Last updated, Version 1) XX DE BEL7-LTR_AG is a long terminal repeat of the BEL7_AG LTR DE retrotransposon - a consensus sequence. XX KW BEL; LTR Retrotransposon; Transposable Element; 5-bp TSD; KW BEL7-I_AG; BEL7-LTR_AG; BEL7_AG; Bel clade; KW reverse transcriptase. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-503 RA Kapitonov V.V., Pavlicek A. and Jurka J.; RT "BEL7_AG, a nonautonomous family of Bel/Pao-like LTR RT retrotransposons from African malaria mosquito."; RL Repbase Reports 3(3), 44-44 (2003). XX DR [1] (Consensus) XX CC BEL7-LTR_AG flank an internal portion of BEL7_AG (deposited as CC BEL7-I_AG). 
XX SQ Sequence 503 BP; 143 A; 116 C; 109 G; 135 T; 0 other; tgttacgttc gcgattatta aattcgtaac ggacgtgtca aattgtaaat tcgcgtggtg 60 cgcgatcacc ttaatctaga aaccgggtac gctacgtgta aaagcggaat aaaaatgcgt 120 cacaaagact aacatgcggc actcacgcgt acgtacactt atgtgctaat tatcctcgga 180 aaggatacac taacacccta aaaattgtat gggcttccgg gggtataaaa gggaccgaac 240 ttttaataaa caaaccattc ggattttgca actctaagaa acctcgtctt ttttaatttt 300 gcatattcgg atcaagaaag aaatctctca tccctcttca cccccttctg cggcataggc 360 tcctcaaatt acttgagaga gcaactagcg ggaaggttta ttattggtgt cgaacggttg 420 agaagatcgc tgttacccga gcgggtttga tttacaaggc cgcatccttc gcgtggccca 480 cagatccgtt cgccgtcgta aca 503 // ID MARINERN11_AG repbase; DNA; ANG; 363 BP. XX AC . XX DT 03-APR-2003 (Rel. 8.03, Created) DT 03-APR-2003 (Rel. 8.03, Last updated, Version 1) XX DE MARINERN11_AG is a nonautonomous DNA transposon - a consensus DE sequence. XX KW Mariner/Tc1; DNA transposon; Transposable Element; Nonautonomous; KW MARINERN11_AG; nonautonomous DNA transposon; KW mariner/Tc1 superfamily. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-363 RA Kapitonov V.V. and Jurka J.; RT "MARINERN11_AG: a family of nonautonomous mariner/Tc1-like DNA RT transposons from African malaria mosquito."; RL Repbase Reports 3(3), 62-62 (2003). XX DR [1] (Consensus) XX CC There are ~100 copies of MARINERN11_AG in the genome, CC they are ~97% identical to the consensus sequence. CC MARINERN11_AG copies are flanked by 2-bp target site CC duplications. CC The consensus sequence has 13-bp terminal inverted repeats CC (4 mismatches) and a 3' terminal palindrome (pos. 194-241). CC Putative classification: a nonautonomous Mariner/Tc1-like DNA CC transposon. The genome harbors several subfamilies of CC MARINERN11_AG. 
CC The 30-bp and 81-bp 5'- and 3' termini of MARINERN11_AG are CC 100% and 87% identical to the corresponding termini of CC MARINERN10_AG. XX SQ Sequence 363 BP; 99 A; 80 C; 100 G; 84 T; 0 other; cccgctgcgc aaagcgatcg aaaacttctc agcctctcgc tcaatgtagc tagagagcaa 60 gaaccaataa tgataaagcg aggtgaaagg atggggaaga ggggaaggag gggagaaaga 120 gaacgtgggt gtgaactatt tttcggccat tatcattttt gcgccaacac ggtcgcgtac 180 cggagctttt ttgtcagaaa gagataaaca catacagcgc gctctctcga acgatcgtaa 240 acgcttcaat cgatcaagag cttcgatcgg ttcttcggac gtttctgctc gctttctcga 300 tcggtttcgg cggaaagcga gctagaaacc gagctcaaaa actgctcgat tttgttgtgc 360 ggg 363 // ID HARBINGERN1_AG repbase; DNA; ANG; 452 BP. XX AC . XX DT 11-FEB-2003 (Rel. 8.01, Created) DT 11-FEB-2003 (Rel. 8.01, Last updated, Version 1) XX DE HARBINGERN1_AG is a nonautonomous DNA transposon - a consensus DE sequence. XX KW Harbinger; DNA transposon; Transposable Element; Nonautonomous; KW HARBINGERN1_AG; Harbinger superfamily; KW nonautonomous DNA transposon. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-452 RA Kapitonov V.V. and Jurka J.; RT "HARBINGERN1_AG: a family of nonautonomous Harbinger-like DNA RT transposons from African malaria mosquito."; RL Repbase Reports 3(1), 3-3 (2003). XX DR [1] (Consensus) XX CC There are ~100 copies of HARBINGERN1_AG in the genome. CC They are ~97% identical to the consensus sequence. CC HARBINGERN1_AG copies are flanked by the TAA and TTA 3-bp target CC site duplications. This element has 48-bp terminal inverted CC repeats. CC Putative classification: a Harbinger-like nonautonomous CC DNA transposon. 
XX SQ Sequence 452 BP; 141 A; 95 C; 100 G; 116 T; 0 other; ggctggaaat agacgaaccg agatcatcgc ggtaccgcga aaatcgcgta tgacttgagc 60 tgtcaattat gttataaaat ctgacagata ggcgagataa tcgcggttcg tgtatggaac 120 catccgaggt ctggtgacga ttgaatgtgc caagaagaaa tatttttcta tcaatttgcg 180 aatcaaatta tttatttcac atagtttatg caaaataaaa tacaaaactc atttctcgac 240 acgagttttc acacaaatcc gccagaaaaa gtaaaaataa tcggttcgcg ctaccgcgaa 300 aatctcagca ataccgaagt tgggtatctc tgcgaaacag caggcgcgat tggaggcgcg 360 gaagatttga cagctctact gttctacaac tgttataaac aaggcgcgaa tttcgcggta 420 ccgcgacgat ctcggttcgt ctatttccag cc 452 // ID GYPSY45-LTR_AG repbase; DNA; ANG; 208 BP. XX AC . XX DT 16-APR-2004 (Rel. 9.03, Created) DT 16-APR-2004 (Rel. 9.03, Last updated, Version 1) XX DE GYPSY45-LTR_AG is an LTR of retrotransposon GYPSY45_AG - a DE consensus sequence. XX KW Gypsy; LTR Retrotransposon; Transposable Element; KW 4-bp TSD GYPSY45_AG; GYPSY lineage; GYPSY45-I_AG; GYPSY45-LTR_AG; KW Gypsy clade. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-208 RA Tubio M.J., Costas C.J. and Naveira F.H.; RT "GYPSY45_AG, a member of the Gypsy lineage of the Ty3/gypsy group RT of LTR retrotransposons in Anopheles gambiae."; RL Repbase Reports 4(3), 83-83 (2004). XX DR [1] (Consensus) XX CC GYPSY45-LTR is a long terminal repeat of GYPSY45_AG (its internal CC portion is deposited as GYPSY45-I_AG). XX SQ Sequence 208 BP; 81 A; 38 C; 34 G; 55 T; 0 other; agttaaccat cctatgtgta tgtgacacac cactatattt accaatagct atagccaagc 60 accatattca tataagataa gccgtacact catacataca agataaggtt aggattagag 120 ttgataggaa tcaaactagg agaaataaag acagttagaa tctggaactc aaagcgttcg 180 catcatttga tccgaaaata tattaact 208 // ID MARINERN6_AG repbase; DNA; ANG; 1170 BP. XX AC . XX DT 12-MAR-2003 (Rel. 8.02, Created) DT 12-MAR-2003 (Rel. 
8.02, Last updated, Version 1) XX DE MARINERN6_AG is a nonautonomous DNA transposon - a consensus DE sequence. XX KW Mariner/Tc1; DNA transposon; Transposable Element; Nonautonomous; KW MARINERN6_AG; nonautonomous DNA transposon; KW mariner/Tc1 superfamily. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-1170 RA Kapitonov V.V. and Jurka J.; RT "MARINERN6_AG: a family of nonautonomous mariner/Tc1-like DNA RT transposons from African malaria mosquito."; RL Repbase Reports 3(2), 23-23 (2003). XX DR [1] (Consensus) XX CC There are ~100 copies of MARINERN6_AG in the genome. CC They are ~99% identical to the consensus sequence. CC MARINERN6_AG copies are flanked by 2-bp target site duplications. CC This element has imperfect 12-bp terminal inverted repeats. CC Putative classification: a nonautonomous Mariner/Tc1-like CC DNA transposon. One copy of MARINERN6_AG has been inserted into CC a HARBINGERN2_AG element and multiplied in a few copies as a CC composite subfamily of HARBINGERN2_AG. 
XX SQ Sequence 1170 BP; 340 A; 244 C; 278 G; 307 T; 1 other; cccaaccagc aaaaccgaag ctattggaaa actttcctgt gtgccaaagc gagagagcat 60 gtcttctttc tctagatttt ggacggcaca acgccgatcg tgtggcctat gaacgaaaaa 120 cgggagaagg ggaaaccgat atccttcgag tgacacacac gcatgcatct gcatgcgtat 180 tccgctgaac cgagactaca catctctctc gatctcccat gatttttctg ctatcgcgct 240 catccgggtg ttaggcatcc tcagtctgtc cagaactttc actgtggaag gatgtgttgg 300 agtgatgtgt gatttcttgt gccccgtact ttgctcgttt acgttttgtg ccgtgttcct 360 tcttcttcaa ttatggaatg atttatcctt gtaaatttct aaggtaagtt tcttatcctc 420 gtgtgtgctt caatggrtac attcaactaa cgcttgtcgt ctaaaaagtt acagtgcacg 480 cgcatgttaa tcgacggttg ccaacatccc gcaaaacgat cgacggaagg aaaacggtgt 540 ccggttttaa aatataaagg taagcgtttc tataatccta tgtgcaaaat attgtttagt 600 ttgttataaa aaacatggat taattattgg ttaattttca tcaacagatt tcattacagc 660 tctttcgtag aagagcgcac ttcaaggacg cctaggatac ggctaaacat gcgcagcacc 720 agtccgtgac gtcatttcca gacaggcgcc ggctgaacac gcgcagcacc agtcgaggat 780 atcctagcgc agagggtaga aggaagaacg agaggaaggg taacagaaag caaccgcagc 840 atttggaagg ggtgaagcat cctgccgtat acagagtgtg agtgtgccaa aacgcatgtg 900 gaaagagtaa caccaataaa tagcatgcga aagtgtgcat aaagcaaaaa gaaaaagcgt 960 cactactaca accagcaatc cgaaaatata aggggaggga taaatacaaa tgtatgcgta 1020 agcaggacca gcaatccaag catttgaatg tgtgcgtcgg tttgtgccgt gctgaaagtt 1080 tttactttag ctgctgctag tgtaaaaact taaggaaagc ttaaagtaaa aattttttac 1140 acggggcgcc ttcggaaatt gttggctggg 1170 // ID GYPSY70-LTR_AG repbase; DNA; ANG; 355 BP. XX AC . XX DT 09-JUL-2004 (Rel. 9.06, Created) DT 09-JUL-2004 (Rel. 9.06, Last updated, Version 1) XX DE GYPSY70-LTR_AG is an LTR of retrotransposon GYPSY70_AG - a DE consensus sequence. XX KW Gypsy; LTR Retrotransposon; Transposable Element; KW 4-bp TSD GYPSY70_AG; CsRn1 lineage; GYPSY70-I_AG; GYPSY70-LTR_AG; KW Gypsy clade. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. 
XX RN [1] RP 1-355 RA Tubio M.J., Costas C.J. and Naveira F.H.; RT "GYPSY70_AG, a member of the CsRn1 lineage of the Ty3/gypsy group RT of LTR retrotransposons in Anopheles gambiae."; RL Repbase Reports 4(6), 180-180 (2004). XX DR [1] (Consensus) XX CC GYPSY70-LTR is a long terminal repeat of GYPSY70_AG (its CC internal portion is deposited as GYPSY70-I_AG). XX SQ Sequence 355 BP; 107 A; 96 C; 60 G; 92 T; 0 other; tgtggcgacc acgttatagc ctcaccacag tccggagctg acgtcaaccg tcaattcaaa 60 atatcgggca ctctaattcg aacaccctaa ttcaagcacc ctaattcgag caaccttatt 120 cgagcaccct aattcgagca atagttacca acttagccag actaaactat ttcccgttaa 180 aagattataa aagccgtcgt tcccaccacg cgttctcttc tcttttcgtt ctcatccgct 240 agtggacgtg ctcccgtaac tgcgtaaccc gtgataaaaa acattgtaaa aaattgtatt 300 ttgtgacatc ttgggaatcc aaataaaaaa gtggtctacc cacgaggtag ccaaa 355 // ID AgaP15 repbase; DNA; ANG; 5462 BP. XX AC DQ301491; XX DT 22-AUG-2006 (Rel. 13.07, Created) DT 31-JUL-2008 (Rel. 13.07, Last updated, Version 1) XX DE Anopheles gambiae str. PEST clone AgaP15 transposon P-like, DE complete sequence. XX KW P; DNA transposon; Transposable Element; AgaP15. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-5462 RA Quesneville H., Nouaud D. and Anxolabehere D.; RT "P elements and MITE relatives in the whole genome sequence of RT Anopheles gambiae."; RL BMC Genomics 7(1), 214-214 (2006). XX RN [2] RP 1-5462 RA Quesneville H., Nouaud D. and Anxolabehere D.; RT "Direct Submission."; RL Direct Submission to Genbank (30-NOV-2004)Dynamique du Genome et RL Evolution, Institut Jacques Monod - CNRS - Universites Paris 6 RL Paris 7, 2 place Jussieu, Paris 75252, France. XX DR EMBL/GenBank/DDBJ; DQ301491; Positions 1 5462. XX FH Key Location/Qualifiers FT CDS join(173..428,834..2353,2422..3267) FT /product="AgaP15_1p" FT /note="transposase." 
FT /translation="MSCSCSVIYCNNGPSKAKKLGLNIKFHKFPPGGDLRT FT KWIQFCGRESDWEPCNRAVICSLHFNEEDYQMLPRNSTETNPYRLYPHVVP FT SKTGISTTERELGENEQNTDVCVEESRNENVTQIEPVNNVESPMIVVRSSD FT LVDAGKSQESLQELRKANHVLKEKLNDCYKELNALKIENKTLKKELEKNKR FT RSIEPVEFVAKIKNIFKGTLSSNQIDLIMKEKKRVQWTIEEISSALTLRYF FT GKRAYRYIAIDLHYPLPALSTLQKYARKINLKQGVLVDVLVFIGNFSSGLP FT RQDRECILSFDEMKVQRVLEYDQSTDEILGPHDYIQVVMARGLFRQWKQPV FT FIGFDTKMTKEIMVDLIQKLNEKGVNVAAIVSDNCSANIGCWKQLGAHDYF FT NPHFEHPITKKNVYIIPDAPHLLKLIRNWLIDHGFRYKDKNITIEPLRLLV FT DSRLGSEYTPLFKLSSKHLNMSSHERQNVSLAAQLLSRTTATTLRRYSIND FT DAKHLAEFIEKVDLWFSVSNSYSPKASLDYKKSFICSEDQVKALDDMFEMV FT DNMRVIGKNNHQTFQKSILMQITSIKKIFGDMVQKHSIKYISTNKLNQDVL FT ENFFSQLRQRGGVYDHPSPLACLYRIRNIILGKSPSVLLNQTHIATKDNDD FT EQQIPETDSHTHRNDHRNEHFVSAIMYSRANIEPSLPDMNVVEKDTNLFYD FT MNDNSSTDSSTITEFNEQEEDGFEYVVGFLGRKFKDNFPDLYLGCYTFEND FT TEHSYSNPPSFVQHLSVGGLFQPSITFLDVCREMETIFLKTHKNGQFSNKK FT SIVRILTNKISRKLPELPLEIVKAFSKQRIIIRMRYINLKNKEELNKRQAL FT KRKAHDENRKGAKKMRKIVN" XX SQ Sequence 5462 BP; 1853 A; 948 C; 1041 G; 1620 T; 0 other; caaggtggat ttaagaagat caggtggtgg aatatagagc attttgacgt tttgcgagtt 60 gacgtaaggt ccccagattt ggggtttgtt tacgttgggg tttgtttaca tttgattttg 120 cagttgttcc agttttcatc gcaggaagca ggaagttaaa agtgttttcg tcatgtcttg 180 tagctgttca gtgatttatt gcaataatgg cccatctaag gccaaaaagc ttggattgaa 240 tatcaaattt cataaatttc ctcccggcgg agatttaaga acaaagtgga tacagttttg 300 cggacgcgaa tcggattggg aaccttgtaa ccgagctgtg atttgttcgc ttcattttaa 360 tgaggaagat taccaaatgt tacctcgcaa ttctacagaa acaaaccctt atagacttta 420 tccacatggt aggttcagag tagtaatagt agtaggttta gtttgattaa acaataaatg 480 tactacacat attataattg acataacatt cgattcgcaa tgtgtttcct gtgtaatcga 540 ttgtattact gttgtagagc tgagttaaaa tatgttcttt gtcggtagta aatgtatgtt 600 ttggggaaaa agaactgaac aaaacgaaac atatggtttg gcctgttcaa aatagtaaat 660 gttttatttg tagtctaaaa tgtaatcaac tttaacgaat ctggacggca cgaaggcgtc 720 ggaggatacc gcaatgaaaa aaggacgttg attgttctat acaatttttt cattaattgt 780 gataccgcta aaataacaaa aacatgtttg atttatttaa acttatattt cagtggttcc 
840 gtccaaaaca ggcatttcaa ctactgaaag agaattaggg gaaaatgaac aaaatacaga 900 tgtttgtgtt gaagaaagta ggaatgaaaa tgttacccaa atagaaccag tgaataatgt 960 agaaagtccc atgatagttg ttagatcgag cgatttagta gatgcaggaa aatcgcagga 1020 atcattacag gagttgcgta aggcaaacca tgtactgaaa gaaaaactga acgattgtta 1080 taaagaactc aatgcgttga aaatcgaaaa caaaacatta aagaaggagc ttgaaaaaaa 1140 caaaagaagg agtatagaac cggtagagtt cgttgccaaa ataaaaaata ttttcaaagg 1200 gactctttcc agcaatcaaa tagacctgat catgaaggaa aagaagcgtg tgcaatggac 1260 aattgaagaa ataagctctg cactgacgtt gcgttatttc ggaaaacgag cctatcggta 1320 catagcaata gacttacact atcccttgcc agctttatcc acacttcaaa aatatgccag 1380 aaaaataaac ctcaagcaag gcgtattggt cgatgttctt gtttttattg gcaatttttc 1440 aagtggattg cccagacaag atcgtgaatg tattctaagt tttgatgaaa tgaaggtgca 1500 aagggtttta gaatatgatc aatcaacgga tgaaatttta ggaccacatg attatataca 1560 agtggtaatg gccagaggct tatttagaca gtggaaacag cctgttttca tcggattcga 1620 cacgaaaatg actaaagaaa taatggtaga tttaattcaa aaacttaacg agaaaggagt 1680 aaatgtcgca gcgattgtta gtgataactg ttctgcgaat attggttgtt ggaaacagct 1740 gggcgcccat gactatttca atccccattt cgagcatcca ataacgaaaa aaaatgttta 1800 tataatacct gatgctccac atctcttaaa attaataaga aactggttga ttgatcatgg 1860 tttcagatat aaagataaaa atattacgat tgaaccattg cgcctcttag tagattctag 1920 gctaggatca gaatatactc cattatttaa actaagttct aaacacttga atatgtctag 1980 tcacgaacgg cagaatgtaa gcttagcagc gcaattatta tcgaggacta cagcaacaac 2040 tttaaggcgt tattcaatta acgatgatgc gaaacattta gctgagttta tagaaaaagt 2100 agacctttgg tttagtgttt caaactcata ttcgccaaaa gctagtttgg attataagaa 2160 atcctttatt tgttctgaag atcaggtgaa agcattagat gatatgtttg aaatggtcga 2220 taacatgaga gtgataggta aaaataacca tcaaacattt caaaaatcaa tcctcatgca 2280 aataacttca attaaaaaaa tattcggaga tatggtacag aagcattcaa tcaagtacat 2340 ttcaacgaac aaggtaatat tatcgtcaaa aatgttaaat ggtatcaatg ttttactgtt 2400 atatttattt ttctctttca gctaaaccaa gatgtattag aaaacttttt ttctcaatta 2460 agacaacgtg ggggtgtcta tgaccatcct tcaccattgg cttgcctata tagaataaga 2520 
aatattatat taggcaagtc tccatccgtc ctattaaacc aaactcacat agcaaccaaa 2580 gacaatgacg atgagcaaca gatacctgaa acagactcac atactcacag aaatgaccac 2640 cgcaatgaac acttcgtgtc tgctatcatg tattcaagag caaatattga accatctttg 2700 ccagacatga acgttgtaga aaaggataca aatttgtttt atgatatgaa tgataatagt 2760 agtaccgata gcagcacaat aacagagttt aatgaacaag aggaagatgg atttgagtat 2820 gttgttggat tccttggtag aaaatttaaa gataattttc cagatctata cttaggatgt 2880 tatacatttg aaaacgatac cgagcatagt tattctaatc ctccttcttt tgttcagcat 2940 ctgtctgttg gaggtttatt ccaaccatct ataacatttt tggatgtatg ccgtgaaatg 3000 gaaaccattt ttttaaaaac gcataagaat ggacagtttt caaacaaaaa aagtatagta 3060 agaatattaa ccaataagat aagtaggaaa ttgcctgaac tgccattaga aattgtaaaa 3120 gccttctcaa aacagcgaat tattatacgt atgcgctaca ttaatttaaa aaataaagag 3180 gagttaaaca aaaggcaggc attgaaaagg aaagcacacg atgagaatag aaagggagct 3240 aaaaaaatga ggaaaattgt aaattagggt atttatgtta atcatatagt gcaatttgta 3300 aataagcctg gcgttcgacc gaatttttat atcaaattta attatatcta ttgccaatta 3360 aacatgtttt tcaaacgtaa taaaaatccc taaaacttaa tataaacgaa aaaaatccca 3420 gtatgcgggt caaaaagttt gcaatatgga atatcaataa ttataatcaa agatatcgcc 3480 aatctatgtt gagatttaac actcttttcc actaattttg ttcactacat tgtcaaattc 3540 aatcaagtga aatgcaactg cactgattat aacatatgtt cttggacaaa cgacttaaac 3600 gatcatagta ggatattaaa aaaaaaaaca ttttaaaata aatcaaccta gagtaaacac 3660 aagcgaaaac ggctggacca agcgacaagc gaccctcaat gaacatatta tgtccgaaag 3720 taaaacaata cgttcgatgc tacatgttcc taaacataaa cataatgctc tctattgtta 3780 ttgaatattc taattaataa aattatacca aacatgttca acaattaatg gcaaaacata 3840 ctgcccaaat gtaattgtta attgttagca atcaatttac caatattaaa tcaatcatac 3900 tgagctggct tgaatatcat ccaagcctcg ttcacccggt taagatggcc tccacctcag 3960 gttcgtagaa atggttatat aatgtccggt tccgcctgtc gtttctttct aactagtctt 4020 tctagcaata ggtctgcgga tttatcttat tattttgtat gcttagtatg tcttggattg 4080 gcttttagac gaccctgtgc cctagttatg ggtgtttcgg agcgcactaa cggctccgga 4140 gccggctcca catccacggc ttcgcttcaa cggctcctga gccttaggtg aggagccggt 4200 tccggagcta 
ttggtgacga gccagctccg gagccgttga agcggggccg ttggtgcaga 4260 gccggctcag aaattgtttg gagccgacct gagccgtctc cgacttcgga gccgttttac 4320 ccatcactac tgtgcacaat actccattgg ctccgctatc ctcattgatg cccttaatgc 4380 tcttatcctt ggcactttga ttgtatgttc tattatgata ctaccgctat tctcatctgt 4440 tttttttgca ttacatcagc gcaagtatac gatccaacac gctattagca acttcattcg 4500 tcatttcata tctctctttt ggtcattttt tcatatcgag ggtagttgca caactacttg 4560 ctcgaaatct tctggggccg aggggagccg tcggaatcgt cgaaaagttc cagccgtcca 4620 gagtcggcac cggctatcgt tggttttata tagctggaat cgaagtcaga gtaggaatgt 4680 ttggaagcgg ccgtgaatgc aattctggca acccattaca attccaaata cgtaatgtat 4740 gaacatagtt cgcgaacatt tctcagggct acaatacact gttctaacct aaaaatacaa 4800 aaaaaagctt gtagccgatc gcaaatctat cagtaaaaga ttcatttttc ttttcatctg 4860 tgtgcgaagt tcattgcttt cccggatggg aaataatgtt gaagtgcagg agtagaatca 4920 ggttacggaa gtatccgtcg ttttcactac agtatgcatg acgaaaacac agcctgtata 4980 cgtatgaact atcctaaaaa atacagaaat ctagaggaga aatgcaaaaa gattattaga 5040 tgataataat gtgctcatct atgaaatgaa atacggttag gaaatacagc gcttcaatgg 5100 tgttttgttg aacaatgggg atgtccattt tgattgcagc cggcaactga aggtcatata 5160 tcctacataa taaacaagaa tccattttga ttctggccga aaatgagcac tgtgccacaa 5220 cagaacaaat gttaatgctt ttaaataaca cacataaaga caaatacata atatatatat 5280 gtatatatat acgtaaaata atatgaacgc agctcgtgta tctatcactg tttatggacg 5340 atgatttttg cagccggcac gagcgttgtt tacattcatg tgtaaaatga cataaggtct 5400 ctccaagatg gcgaccccct aaaaccgcca atccaccacc tgatcttctt aaatccacct 5460 tg 5462 // ID Clu-87A_AG repbase; DNA; ANG; 902 BP. XX AC . XX DT 03-SEP-2010 (Rel. 15.09, Created) DT 03-SEP-2010 (Rel. 15.09, Last updated, Version -1) XX DE Putative non-autonomous DNA transposon: consensus. XX KW DNA transposon; Transposable Element; Clu-87A_AG. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-902 RA Fernandez Medina R.D., Struchiner C.J. 
and Ribeiro J.M.C.; RT "Transposable elements from Anopheles gambiae."; RL Repbase Reports 10(9), 1447-1447 (2010). XX DR [1] (Consensus) XX CC 8bp TSD. ~92% identical to consensus. XX SQ Sequence 902 BP; 330 A; 135 C; 121 G; 314 T; 2 other; ggtaaaagta aggctcagcc ctctacgtta cgctagcgtt acgcgtaacg cctgtttatc 60 acaaacccaa gcagcatttt tgaacgcctg ttttaaccgc cgtggatgag caaattaata 120 gattatcatg tcttaatatt gcttattttt acatgataat tagaaaagca acaatttttg 180 aacaaacata aaaaagtacg gcaatcgcta ataaagataa aactcattga taattcatga 240 gtataaataa attataatat tactattagg ctatcatttt taacagcaat cagagttgta 300 acaacttcaa gtattataat taataacaat cttatacata aaaaagggtt acttataaaa 360 aatgattatt tatttatttn cccagctatt gataaagcag gataatgaaa tacatacaag 420 tatttattaa cttcattaac ctaactttna aaaagagccc tcgcacctaa agtaatgttt 480 attcaaaatg gccggaaaca tagattgaat aggtaataat tatatttcat atatataatt 540 tatttataaa tatttttaat attttatatt ttctaatcat aggaaatgtt actccaattg 600 aaagcttttg gggtgtttat gaacttcctt ctttcagact atttttcata aatcatagtg 660 caacaatgaa aaatattaaa cctgttattt acttgttatt taaaaattaa atctcgtaga 720 aatcgattca tcttaaagca attaactttt tttcatcgca atttaaaata ttttttattc 780 aaataatgta cactggtcgg tcgcctggat gacaaaactg taataaaaat gctgcttggg 840 aatattttaa gggacgtaac gcttctccaa tgtaacgtag agggctgagc cttactttta 900 cc 902 // ID Clu-186_AG repbase; DNA; ANG; 739 BP. XX AC . XX DT 03-SEP-2010 (Rel. 15.09, Created) DT 03-SEP-2010 (Rel. 15.09, Last updated, Version -1) XX DE Putative non-autonomous DNA transposon: consensus. XX KW DNA transposon; Transposable Element; nonautonomous; Clu-186_AG. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-739 RA Fernandez Medina R.D., Struchiner C.J. and Ribeiro J.M.C.; RT "Transposable elements from Anopheles gambiae."; RL Repbase Reports 10(9), 1454-1454 (2010). XX DR [1] (Consensus) XX CC 3bp TSD. 
>96% identical to consensus. XX SQ Sequence 739 BP; 245 A; 119 C; 153 G; 221 T; 1 other; agggggcgtc cataaattac gtaacgctca aagggggggg agggggttcc ctgtagcgtt 60 acggtttgtt acatagggga gagggggggt tcacattctg ttacgtaacg taatacaatc 120 ctatcactgg tcactgacaa atgatcgccc acctgggtaa gttttgcact tttatttctt 180 tatatctttt gatagggaat acgttttgct aaacgagcat ttgagaatgc atgctagcca 240 aggagaaggg aaaaaaacct tacttcagaa caaaagaatg atgataggtt taaagaagta 300 atacacatac ataatgaaga tgccatatgg aatgaagaag atgccaatgt aatcgaaaca 360 gatagtgttc ccatcataaa cattactgaa caattgaagc caatctggca agatgtctaa 420 aaatgctcaa catatatatt atgcgtttct ttaattataa attttttatc taactttcct 480 ttcattaatt aaacattaat tttaattaaa cattaattaa aacaactctg catactttat 540 aaaagcaatc aaataaaaaa atataaaccc anattttttt gtatttgttg atgcaaaacc 600 aaattaaggg ggggtgttcg atcatgcgtt acgtaatttt ggaagggggg tctcctccaa 660 cgttacgttt tgttacgata gggggggagg gggtcgaaaa atcccaattt cagcgttacg 720 taatttatgg acggcccct 739 // ID SINEX-2_AG repbase; DNA; ANG; 195 BP. XX AC . XX DT 28-FEB-2009 (Rel. 14.02, Created) DT 01-MAR-2009 (Rel. 14.02, Last updated, Version 1) XX DE A nonautonomous non-LTR retrotransposon - a consensus sequence. XX KW SINE; Non-LTR Retrotransposon; Transposable Element; KW Nonautonomous; SINEX-2_AG. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-195 RA Jurka J.; RT "A family of nonautonomous non-LTR retrotransposons from African RT malaria mosquito."; RL Repbase Reports 9(2), 632-632 (2009). XX DR [1] (Consensus) XX SQ Sequence 195 BP; 55 A; 48 C; 54 G; 38 T; 0 other; ggggcggtcc cgtggtacag tcgtcaactc gaacgactca ataacatgcc cgtcatgggt 60 tcaagcctag aatggaccgt ccccccgtag caaggattga ctatccggct gcgtggtaat 120 gaattaagtc tcgaaagcct gtataggccg gcatgtccgc gtaggacgtt acgccaaata 180 gaagaagaag aaaaa 195 // ID COPIA3-I_AG repbase; DNA; ANG; 1607 BP. XX AC . XX DT 03-APR-2003 (Rel. 
8.03, Created) DT 03-APR-2003 (Rel. 8.03, Last updated, Version 1) XX DE COPIA3-I_AG is an internal portion of the COPIA3_AG LTR DE retrotransposon - a consensus sequence. XX KW Copia; LTR Retrotransposon; Transposable Element; 5-bp TSD; KW COPIA3-I_AG; COPIA3-LTR_AG; COPIA3_AG; Copia clade; KW reverse transcriptase. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-1607 RA Pavlicek A., Kapitonov V.V. and Jurka J.; RT "COPIA3_AG, a family of nonautonomous, copia-like LTR RT retrotransposons from African malaria mosquito."; RL Repbase Reports 3(3), 51-51 (2003). XX DR [1] (Consensus) XX CC COPIA3_AG is a young family of Copia-like LTR retrotransposons. CC COPIA3-I_AG, an internal portion of COPIA3_AG, is flanked by CC 99-100% CC identical COPIA3-LTR_AG LTRs. The A. gambiae genome contains CC about 14, CC >98% identical, copies of the COPIA3_AG internal sequence. CC The consensus sequence encodes the 500-aa COPIA3_AGp protein CC (positions 41-1540). The protein seems to be truncated, so the CC COPIA3_AG family is probably nonautonomous. 
XX FH Key Location/Qualifiers FT CDS 41..1540 FT /product="COPIA3_AGp" FT /translation="PKERFRKMERLGIAKLNGGNYSVWKTKVEFLLIREEL FT WQYVISDGPGNTTASPTTEAIGAVWKSGDQKARATIGLLLEDNQLNLIKDC FT KTAKATWEKLRGHYEKATLTSKVSILKNICEKRFSDGEDIEQHIFEMEELF FT DRLTLTGEELSKSLQVAMVLRSLPQSFSVLTTALESRSDDELTLDLVKTKV FT VDEVAKRGNRGCCDSVLKTIVKKNQMLCHFCQQPGHKRKDCLILMEKRSRD FT LKQSEGQHGMKRSQLCVQTTDEETLEQEYSFTMRGFTANSWIVDSGATSHM FT CIDRSCFVELDERYQQDVILADGTTARVEGIGSCRITTLSPEGKTSRVTLN FT DVLFVPKLETNLVSVKKLTAKGAVILFDMSGCRIVKDQKVIALATISNGLY FT SLKTRMQNVLTRKNARYRSIAKCLLMPMDLEYRNVQQKEEGLMLRKGELRK FT RKLRKGESCSFVHCKDRTMSSGSRIWVGCHGKDMFGRCEKSIPRNNEFQSV FT CCRS" XX SQ Sequence 1607 BP; 497 A; 286 C; 440 G; 384 T; 0 other; ggttatgggc ccaggattag tggcgattaa aacgttttaa ccaaaagaaa gatttcgcaa 60 aatggaacga ttgggaattg caaaactcaa cggaggtaat tacagtgtct ggaagacaaa 120 ggtcgaattc ctcctcatcc gagaagaatt gtggcagtat gtgatcagcg atggaccggg 180 taacacgacg gccagtccca ctaccgaagc catcggagca gtatggaaga gtggtgatca 240 gaaggcgcga gcgacaatcg gccttttact ggaagataac caactcaatc taattaagga 300 ctgtaagacg gctaaagcga catgggaaaa gctgcgagga cactacgaaa aggccacact 360 aacttcgaag gtgtcgattt taaaaaatat atgtgagaag cgtttttccg acggtgagga 420 tatcgagcag catatcttcg agatggaaga attgttcgat aggctaacat tgaccggtga 480 ggagctaagc aagagcctgc aagtggcaat ggtgctccga agtctcccgc aatctttctc 540 ggttctgacc acagcgttgg aaagcaggtc tgacgacgag cttacgctgg atttggtgaa 600 aaccaaggtg gtagacgaag tagccaaaag gggaaacaga ggttgctgtg attctgtact 660 aaagacaatt gttaagaaga atcaaatgtt atgtcatttc tgtcaacaac cagggcataa 720 aagaaaggat tgcctgattc taatggagaa gcggtccaga gatttgaaac agtcggaagg 780 ccagcatggt atgaaacgga gccagttatg tgtgcaaaca actgatgaag aaactctgga 840 gcaggaatat tcgttcacga tgcgtggatt tacagccaat tcgtggatcg ttgattcagg 900 agcaacatcg catatgtgta tcgatcggtc atgtttcgtg gagctggacg agcgttatca 960 gcaggacgta attttggccg atggcacaac ggcgagggtt gaaggaatcg gttcgtgtcg 1020 aataaccacg ctgtctccgg agggtaaaac atcaagagtg actcttaacg atgttttgtt 1080 cgtcccgaag ttagagacaa acttagtgtc tgtgaaaaaa 
ctgacggcga aaggtgctgt 1140 gattcttttt gacatgagcg gttgtcgaat cgtaaaggat cagaaggtta ttgcactagc 1200 aacaatttca aatggattat attcactgaa gaccagaatg cagaatgtct tgacccgaaa 1260 gaatgcccga tatcgatcaa ttgcaaagtg tttattgatg ccaatggatc ttgaatatcg 1320 gaatgtacag cagaaggagg agggactgat gcttaggaaa ggcgagctta gaaaacgcaa 1380 gttacgaaaa ggcgagtcat gttcttttgt acattgcaaa gatcggacta tgtcatcagg 1440 gtctcgaatc tgggttggat gtcatggaaa agatatgttt ggaaggtgtg agaagtccat 1500 acctagaaat aatgaatttc agagcgtgtg ttgtagatca tagcaacctc gatagttaga 1560 atctgcgcct acaacttggt ttagagccag ccgagattga ggaggag 1607 // ID RETRO18_AG_LTR repbase; DNA; ANG; 269 BP. XX AC . XX DT 06-FEB-2003 (Rel. 8.01, Created) DT 06-FEB-2003 (Rel. 8.01, Last updated, Version 1) XX DE Anopheles gambiae long terminal repeat from RETRO18_AG DE retrotransposon - a consensus. XX KW BEL; LTR Retrotransposon; Transposable Element; DIVER; KW Long terminal repeat; RETRO18_AG_I; RETRO18_AG_LTR; ROO; KW retrotransposon. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-269 RA Jurka J. and Drazkiewicz A.; RT "RETRO18_AG_LTR: LTR retrotransposon from Anopheles gambiae."; RL Repbase Reports 2(12), 3-3 (2002). XX DR [1] (Consensus) XX CC Related to ROO and DIVER from Drosophila melanogaster. 5 bp CC target site duplication. XX SQ Sequence 269 BP; 70 A; 59 C; 66 G; 74 T; 0 other; tgttgtgcga gttgcgcatt gcgtttgctg cttgggcaaa ctgacatatg tcacagggat 60 tgggtttgtc gcttgggttt gcattttttt gtgacagtag aattcggagt gcgacacgca 120 cacattgaag tgataataaa tgaactttag tcttgcatgc gacatcaacg aagaaagatc 180 gtcttgtccg tgagggctcc cgatttatgg tgaaagaaaa cctactaccc caccccctca 240 agttacagtc caccgctaga gacacaaca 269 // ID GYPSY66-I_AG repbase; DNA; ANG; 4352 BP. XX AC . XX DT 09-JUL-2004 (Rel. 9.06, Created) DT 09-JUL-2004 (Rel. 
9.06, Last updated, Version 1) XX DE GYPSY66-I_AG is an internal portion of retrotransposon GYPSY66_AG DE - a consensus sequence. XX KW Gypsy; LTR Retrotransposon; Transposable Element; 5-bp TSD gag; KW AP protease; GYPSY66-I_AG; GYPSY66-LTR_AG; Gypsy clade; RNase-H; KW integrase GYPSY66_AG; mag lineage; reverse transcriptase. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-4352 RA Tubio M.J., Costas C.J. and Naveira F.H.; RT "GYPSY66_AG, a member of the mag lineage of the Ty3/gypsy group RT of LTR retrotransposons in Anopheles gambiae."; RL Repbase Reports 4(6), 171-171 (2004). XX DR [1] (Consensus) XX CC GYPSY66_AG is a family of Gypsy-like LTR retrotransposons that, CC according to the amino acid sequence of its reverse CC transcriptase, RNase and integrase domains is phylogenetically CC grouped with representatives of the mag lineage of other CC organisms. GYPSY19_AG, GYPSY20_AG, GYPSY21_AG, GYPSY22_AG, CC GYPSY23_AG, GYPSY24_AG, GYPSY25_AG, GYPSY26_AG, GYPSY27_AG, CC GYPSY28_AG, GYPSY55_AG, GYPSY56_AG, GYPSY57_AG, GYPSY58_AG, CC GYPSY59_AG, GYPSY60_AG, GYPSY61_AG, GYPSY62_AG, GYPSY63_AG, CC GYPSY64_AG, GYPSY65_AG, GYPSY67_AG, GYPSY68_AG and GYPSY69_AG, CC are other members of this same lineage in Anopheles gambiae. CC The GYPSY66-I_AG consensus was reconstructed after multiple CC alignment of 4 copies. The consensus encodes the 1409-aa CC GYPSY66_AGP gag-pol like polyprotein (pos. 95-4321). The CC sequence of the LTRs flanking GYPSY66-I is deposited as CC GYPSY66-LTR_AG. 
CC GYPSY66_AGP: CC MTTLDPSSMAALSHMIAEALKTAIGSAAGEARQDATPSTSRAPSFSHPAYRTTEGTVCDYFDRM CC EWALQLNGIPTEKYADYARVHMGAELNNALKFLIAPKLPQEILYKDLRKTLETHFDKQRNKFVE CC SIKFRQITQQQGESIAQFTLRLKQGAAHCDYGEFLDRMLIEQLLHGLMERGICDEIIAKNPVTF CC KDACEIASTLEATHNTAQEVNAGMPSVESTNKLGYEQPKTKKTPASRATSKPSKSTRRETSQGK CC PKQGTTNPGMCNGCGGNHLRNQCRFRDARCNTCNKKGHISKVCRSRKTNHDEDSINPVNTRPEA CC SIDYVQTLNALQTTSPSGKKMVEVTIDGHPLHMELDTGAPCGIISEAKLKTIKKQFRLLPSERQ CC FSSYSGHSITCLGRLPVNVAMGSANRKLNVYVVSGTSEPLFGREWIAQFSDQIDINSMFAPNVS CC VQSVTDTEPSAIQKSKLNDLLENFREVFSDTPGKLIGPPAKVHLKPGATPIFSKAREVPLSLRE CC RYAAEINKKIAAGFYERVDYSEWASPTHIVVKKNGNIRITGNYKPTVNPRMIVDEHPIPKVEDI CC FNKLKGAAFFCHLDLTDAYTHLPIDDDFKHVLTLNTVTHGLIRPTRAVYGAANIPAIWQRRMES CC VLQDLNNVVNFYDDIIVAAANFDNLLETLKGTLQKLKDNGLRLNRAKCVFAVPSLECLGHKIDR CC HGLHKSDKHIAAIRDAPRPNTPEELQLFLGKATYYSSFIPNLSTRARSLRDLLTAQTFEWNDVA CC DEAYQDIKEALISPQVLVQYDPTLPVILATDASKTGLGAVLSHRLSDGTERPIAYASCTMSKTK CC QKYPQIDKEALAIVWAVKKFFHYLYARKFTLVTDHKPLTQILHPEKSLPTLCISRMANYADYLA CC HFNFDVVFKPTNLNTNADYCSRIPNNSNVNQFKDEGRSSGDDFEQFAHVQIEQLPVRAEHIARE CC TRKDPHLGKILQHLEQGQNLGQYGYKAPESKYTIAANCLLFEHRVVIPSTLRQAILNDLHVAHF CC GIVRMKALARSYVYWPGIDADIEKTAKECHACARYGPTPPRFDSHHWEYPSAPWERIHIDFAGP CC VCNTMLLIVVDAFSKWVEVKPTNTITSAATIKILDELFATYGVPITVVSDNGTQFTAKEFKEFL CC QKSGVKYHKLTAPYHPATNGQAERYVQTVKNGLKAMGTTSANLTRNLNTFLQHFRKTPHAETKD CC SPAKLFLGRNIRTRLDLVRPQAVHASVSERQKAQFRPAFRTLPPGQSVYVLSGNTRMDKWIPAT CC IFARLGDLHYDVLYNGQHLKRHIDQIRRFEMKTNDEEESSSAAISPPVATTTTSTPLAVSSSTA CC TRRLRFYGHAAPSAPAEPEQVVTADDDLEDETFSTPPSSPQPPGNVSTTLRRSTRLRQPPKKFS CC P. 
XX SQ Sequence 4352 BP; 1329 A; 1127 C; 919 G; 977 T; 0 other; tttggtgtca gaagttccgg atttacggtc cacagtctgt aacgtacatt acccttgtga 60 gagattcact gtgagaagaa aaacttactt taaaatgacg acactagatc catcgtctat 120 ggctgcgcta tcccacatga tcgcggaagc cctaaaaaca gccatcggat cagcggcagg 180 ggaagcacgc caagatgcaa caccctcgac gtcacgggca ccgtcatttt ctcatccagc 240 ttatcgcacc accgaaggga ccgtctgcga ctatttcgac cgaatggaat gggccctgca 300 gctgaacggt atccctaccg aaaagtatgc cgattacgcg agagtccaca tgggagccga 360 gttaaacaac gcgctaaagt tcctgattgc accgaagcta ccgcaagaga ttctgtacaa 420 ggatctccgg aaaacgctcg aaacgcattt tgacaaacag cgcaataagt ttgtcgaaag 480 catcaagttc cggcaaatta cgcagcagca aggggaaagc atcgcacaat tcacactacg 540 cctgaagcag ggagctgctc attgtgatta cggagagttc ttggaccgca tgctcatcga 600 acaattgcta cacggactaa tggaacgtgg tatttgcgac gagattatcg caaagaatcc 660 agtaaccttt aaagatgcat gtgagatcgc cagcacgctg gaagcaacac acaacaccgc 720 gcaggaggtt aacgcaggga tgccgtccgt cgaatcaacg aataagttgg gttacgagca 780 gccgaagaca aagaagacac ctgcatctcg tgctacgtcg aagcccagca agtcaacgcg 840 gcgagaaacg tcccaaggta aaccaaaaca aggtacaaca aatccaggta tgtgcaatgg 900 ttgtggtgga aaccacctcc gaaatcaatg tcgttttcgt gatgctcgtt gcaacacgtg 960 caacaaaaag ggacacattt ctaaagtgtg tagatctcgg aaaactaacc acgatgaaga 1020 ttctattaat cccgtaaaca cccgcccaga agcctccatc gattacgtac aaacgctaaa 1080 tgccctacaa acaacttcgc catccgggaa gaaaatggta gaagttacga tcgatggaca 1140 tcctttacac atggagcttg acacaggagc cccatgcggt attatttcgg aagcaaagtt 1200 aaaaacaatt aaaaaacaat tccgcttact tccatccgaa cgacagtttt ctagctattc 1260 aggtcacagc atcacctgtt taggacgttt gccagttaac gtagccatgg ggtctgcaaa 1320 tagaaaacta aacgtttatg ttgtttcggg aacatccgag ccgttattcg gtagagaatg 1380 gatcgcgcag ttttcagatc agatcgatat aaattccatg tttgctccta atgtttctgt 1440 ccaatcagtc actgacacag aaccatcagc cattcaaaaa tcgaaactca atgatctttt 1500 ggaaaatttt cgagaagttt ttagcgatac ccctggaaag ctaattggtc ctccggcaaa 1560 agtacatcta aagcctggcg ccacacctat tttttctaag gctagagagg tgccactttc 1620 tctgcgagag cgttacgcag cagaaatcaa 
caaaaagatt gcagcaggct tttacgaacg 1680 agttgactac tccgaatggg cttcacccac acacatcgtt gttaaaaaga acgggaacat 1740 cagaattacg ggtaattaca aaccgactgt taatcctcgt atgatagtcg atgaacaccc 1800 catcccgaag gtcgaggaca tttttaacaa actcaaggga gctgccttct tttgtcatct 1860 cgatttaacc gatgcataca cgcaccttcc tatagacgat gatttcaagc acgtccttac 1920 gctaaacact gttacacatg gtctcatccg accgaccaga gcagtttatg gagccgccaa 1980 tatcccagca atatggcaaa ggcgtatgga aagcgttctg caagacctaa acaatgtggt 2040 caatttttac gacgacataa tcgtagccgc agccaatttc gacaacctgc tcgaaaccct 2100 caaaggtaca ctacagaaac taaaagataa cggactccgg cttaaccgag cgaaatgtgt 2160 atttgctgtt ccttcgttag agtgtcttgg tcacaaaatc gatcgccatg ggttgcacaa 2220 atccgacaaa cacattgccg caatccgtga tgctccacgt ccaaacaccc cggaagagct 2280 gcagctattt ttaggcaaag ctacctatta cagttcgttc attccgaacc tttcaacacg 2340 agctcgtagc ttacgtgatt tgctcacagc gcagacattc gaatggaatg acgtcgcgga 2400 tgaagcctac caagatatta aggaagcgct aatctcaccg caagtcttgg tacaatacga 2460 tcctacactt cccgtcatct tagccacgga tgccagcaaa accggtttgg gagccgtact 2520 atcacaccga ttgtctgatg gaacggaacg tcccatagct tatgctagct gcacgatgtc 2580 caaaacaaag caaaagtatc cgcagataga caaagaggcg ttagcaatcg tgtgggcagt 2640 gaaaaagttc tttcactatt tgtatgcacg taaatttact ttagtgacgg accataaacc 2700 gctaacacaa attctacacc ccgagaaatc ccttccaacg ttgtgtatta gcagaatggc 2760 gaattatgct gactacttgg cacacttcaa cttcgatgtt gtattcaagc ctacaaatct 2820 caacacgaac gctgactact gttcgagaat tccgaacaat tctaatgtca accaattcaa 2880 agacgaggga agaagctcag gcgacgattt tgaacaattt gcgcatgtcc aaatcgaaca 2940 actgccagta agagcagagc acattgctcg agaaacacgc aaagatcctc acctaggaaa 3000 aatccttcaa caccttgagc aaggtcaaaa cctcggacaa tatggataca aagcacccga 3060 atcgaagtac acaatcgcag ccaactgttt actctttgag caccgagtag tcataccaag 3120 cacactacgt caagccattc ttaacgattt gcatgtggca cacttcggca tcgttaggat 3180 gaaagcctta gcccgatcat atgtctactg gcccggcatc gacgccgata tagagaaaac 3240 agctaaggag tgtcacgcat gtgctcgtta tggaccaaca cctcccaggt tcgacagcca 3300 tcactgggag tatccgagtg ctccctggga gcgcatccat 
attgacttcg ctggtcccgt 3360 atgcaacact atgctgttaa ttgttgtcga cgcttttagt aaatgggtcg aagtcaagcc 3420 cacaaacact ataacttcag cagcaacaat taaaattcta gacgaactct ttgcaacgta 3480 tggagtaccg atcaccgtag tatccgataa cggtactcag ttcaccgcaa aggagtttaa 3540 agagttcctg caaaagagtg gagtcaaata tcacaaactc accgcacctt accacccagc 3600 aaccaatgga caagccgaac gttatgtcca aacggtgaaa aatggtctca aagctatggg 3660 aacaacaagc gctaacctca cacgcaactt aaacacattc ctacagcatt tccgtaaaac 3720 tccacacgcc gagacgaaag actctccagc caaactgttt cttggacgaa acattcgtac 3780 tcgactcgac ctcgtgagac cacaagctgt tcacgccagt gtctccgaaa gacaaaaagc 3840 ccagttccgg cctgcgtttc gaactttgcc acctggacag tccgtctacg tactatcagg 3900 caacacgcgc atggacaagt ggattccggc taccatcttc gcccgattgg gagacctaca 3960 ttacgatgtg ttgtacaacg gtcaacattt gaagcggcac attgatcaaa ttcgacggtt 4020 tgaaatgaag accaacgatg aagaagaaag ttcttcagct gcaatctcac caccggttgc 4080 aacgactacc acatcgacac cactagctgt atcttcaagt accgctacta gaaggcttcg 4140 tttctatggc cacgcagcac cgtcagcccc agcagaacca gagcaggtcg taacagctga 4200 tgacgatttg gaggatgaga ccttctcaac accaccttcc agcccacaac caccaggaaa 4260 cgtgtcaaca actttgcgtc gttcgaccag actccggcag cctcccaaga aattttctcc 4320 ctagaagcat ctttattaaa gcagggagga aa 4352 // ID GYPSY63-I_AG repbase; DNA; ANG; 4342 BP. XX AC . XX DT 09-JUL-2004 (Rel. 9.06, Created) DT 29-JUL-2005 (Rel. 10.08, Last updated, Version 2) XX DE GYPSY63-I_AG is an internal portion of retrotransposon GYPSY63_AG DE - a consensus sequence. XX KW Gypsy; LTR Retrotransposon; Transposable Element; 5-bp TSD gag; KW AP protease; GYPSY63-I_AG; GYPSY63-LTR_AG; Gypsy clade; RNase-H; KW integrase GYPSY63_AG; mag lineage; reverse transcriptase. XX NM GYPSY63-I_AG. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-4342 RA Tubio M.J., Costas C.J. 
and Naveira F.H.; RT "GYPSY63_AG, a member of the mag lineage of the Ty3/gypsy group RT of LTR retrotransposons in Anopheles gambiae."; RL Repbase Reports 4(6), 165-165 (2004). XX DR [1] (Consensus) XX CC GYPSY63_AG is a family of Gypsy-like LTR retrotransposons that, CC according to the amino acid sequence of its reverse CC transcriptase, RNase and integrase domains is phylogenetically CC grouped with representatives of the mag lineage of other CC organisms. GYPSY19_AG, GYPSY20_AG, GYPSY21_AG, GYPSY22_AG, CC GYPSY23_AG, GYPSY24_AG, GYPSY25_AG, GYPSY26_AG, GYPSY27_AG, CC GYPSY28_AG, GYPSY55_AG, GYPSY56_AG, GYPSY57_AG, GYPSY58_AG, CC GYPSY59_AG, GYPSY60_AG, GYPSY61_AG, GYPSY62_AG, GYPSY64_AG, CC GYPSY65_AG, GYPSY66_AG, GYPSY67_AG, GYPSY68_AG and GYPSY69_AG, CC are other members of this same lineage in Anopheles gambiae. CC The GYPSY63-I_AG consensus was reconstructed after multiple CC alignment of 4 copies. The consensus encodes the 1432-aa CC GYPSY63_AGP gag-pol like polyprotein (pos. 35-4330). The CC sequence of the LTRs flanking GYPSY63-I is deposited as CC GYPSY63-LTR_AG. 
XX FH Key Location/Qualifiers FT CDS 35..4330 FT /product="GYPSY63_AGp" FT /note="gag-pol like polyprotein" FT /translation="MSLENQNIQLEILKALQKLSETSTGTNNTERFVAMNM FT TEFTFDPENGGTFQKWFRRYEDLFESDAKELEDVAKVRLLLRKLDAQAHNQ FT YTNYILPKLPKELTFKETVQTLSKIFGSQSSLFSRRYRCLQLVKTEADDII FT SYAAKVNRACEDSEFHNMKADHFKCLVFICGLKGQTYADIRARLLSRIDAE FT TADAPITLQNLVDDFQKLVNLKADTSIVEQQPNSSTTVNALHEKTEHHHHE FT QYRQRYQESKTSEQPRRPCWRCGQMHFVRDCQYSTHQCRKCNRVGHKEGYC FT GCFSKFKPAGEEKTNTKPSTSDQGKLNARGVYIVNHITQHSSKRKFVPATI FT NGVTINLQLDTASDITVISQQTWQKLGSPNIQPVTIQAINASGKPLHLSGE FT FQCTININGQTQQGRCFVTTAVNLNLLGIEWIELFELWSIPIDTICNQLTT FT ESIDQQMREIQAKHADVFKDTLGHCKKTKVKLYLKSNAKPVFCQKRPVPFN FT TIPLVDAELTRLQNLGIIETVDFSEWAAPIVAVRKPNGRVRICADYSTGLN FT AALEANHYPLPTPEEIFSQLNGSTIFSIIDLSDAYLQLEVDDDSKHLLTIN FT THRGLFRFNRLAPGVKSAPGAFQRLVDGMIADIPGVRSFIDDVIVFGKDMK FT SHKDSLNTLFARLKEYGFHVKAEKCHFCKTQLVYLGHVVDKHGIRPDSEKI FT KTIASIPPPSNVSELRSYLGAVNFYGRFVRNLHELRYPMDQLLKKESKWKW FT TPECQEAFVKFKEALQSHLLLTHYDPKLPIIVAADASNTGIGAVIFHQFTD FT GKMKAIQHASRTLTPAEQNYGQPEKEALALVYAVCKFHKYLLGRHFTLLTD FT HKPLLSIFGSKKGIPLHTANRLQRWALTMLNYDFEIQYVSTQDFGCADLLS FT RLIDRNKQPEEEYVIATLTLEDDLSSILSDTSQKVPISFQALRKATASSST FT LQAVCKFIREGWPNCSTNLPTAIQPYYARRESLSIVQGCVMFGDRVVVPNI FT FQKKILQQFHRGHPGIVRMKSIARSYVYWPGIDKEIEDFVKCCSPCAITAK FT TPTKTTLEFWPIPSKPWSRVHIDYAGPVDGFYFLVIVDPHSKWPEVYATRS FT ITARTTIRILKQIFATFGVPEVLVSDNGTQFTSYEFKEFCVSQGINHLRIA FT PYHPQSNGLAERFVDTLKRSIQKIRKGGESLEDALTTFLQVYRTTPSGDLD FT GKAPADIMFSRPLRTISSLLKPSEHGNVEPRNRMKEAEFFNKKHGAVKRCY FT QQGDAVYVKIYRRNSWQWEAATVIDKIGNVNYNVFLKEKQQLVRSHTNQLK FT SRLANGQNMAEFSTPLSVLLDDFGLKTPLPSDQQSTSSQFVSCDEPVTTSS FT DTELASQTLCSTPDHSELGNISSENDNAEESEGEEPVVQQQEQQSSILERS FT RRVIKLPERFKSYWMPNP" XX SQ Sequence 4342 BP; 1395 A; 968 C; 894 G; 1085 T; 0 other; attggcgacg aggatagttg aactaaaagc aagaatgtct ctcgaaaatc agaacattca 60 attagagatt ttgaaggctc tgcagaagct atcagaaaca tcaacgggta caaataatac 120 ggaacgattc gtggcaatga acatgaccga gtttactttc gatccagaga acggaggaac 180 ttttcaaaaa tggtttcgac gttatgaaga tttgttcgag tcggatgcca 
aagaactcga 240 agatgtcgct aaggtcaggc ttttactgag gaagttggat gcacaagcac acaaccagta 300 caccaattac atcctcccga aacttccgaa ggagctaact ttcaaggaaa ctgtgcaaac 360 attatcgaaa attttcgggt cgcagagttc actttttagc agacgttacc gttgtcttca 420 gctcgtcaaa accgaagcag atgacattat tagttacgcg gcgaaagtga accgggcttg 480 tgaagattca gaattccaca acatgaaggc tgaccatttc aagtgtttag tatttatttg 540 tggattgaaa ggtcaaacct acgcagacat acgagccaga ctactctctc gcattgatgc 600 cgaaactgca gatgcgccca ttacactgca aaatttggtt gacgatttcc aaaaactcgt 660 caatttgaaa gcagatactt ccatagttga gcaacagcca aactcatcta caacggtaaa 720 cgctctacat gagaagacag aacatcacca tcacgaacag tatcgacagc ggtatcaaga 780 atcaaaaaca tcagagcaac ctcgcagacc atgctggcgt tgcggtcaaa tgcactttgt 840 tcgagattgc caatactcga cacaccagtg tagaaaatgc aatcgtgttg gtcacaaaga 900 aggctactgt ggatgttttt caaaattcaa acctgccggg gaggagaaga ctaacacgaa 960 accatcaaca agtgatcaag gaaaactgaa tgccagaggt gtctatattg tcaatcacat 1020 cactcaacac tccagcaaaa gaaaatttgt tcctgcaacc atcaacggcg ttacaatcaa 1080 tctgcagctc gacacagcaa gtgatattac ggtgatttca caacaaacat ggcaaaagtt 1140 gggctcaccg aacatccaac cagtgacgat tcaggccatc aatgcatctg gtaagccact 1200 ccatttatcg ggcgaattcc agtgcaccat caacatcaac ggccagaccc aacaaggcag 1260 gtgtttcgtc actacagcag ttaacctaaa cttgctaggg atagaatgga ttgaactatt 1320 cgagctttgg tccattccaa ttgatacgat ttgtaatcaa ctaacaacag aatcgatcga 1380 ccagcagatg cgagaaattc aagcgaagca tgcggatgtt tttaaggata cattggggca 1440 ttgcaagaaa actaaggtta agctttacct caaatcaaac gcaaagcctg ttttctgcca 1500 aaagcgtcct gtacctttta acacaatacc tttggttgat gccgaactta ctcgattgca 1560 aaacttgggc ataattgaaa ctgtcgattt ctccgagtgg gcagctccaa ttgtggcggt 1620 gaggaaaccg aatggacgtg ttcgaatatg tgccgattat tcaacaggat tgaatgctgc 1680 gttggaggca aaccattatc cattgccaac accagaagaa attttctcgc aacttaacgg 1740 cagcaccatc tttagcatca tagacctgtc cgatgcctat cttcagctcg aagttgacga 1800 cgattcaaag catttactaa ccatcaatac acatcgtgga ttattccgat tcaaccgtct 1860 cgcaccaggg gtaaaatcag caccaggagc attccaacgc ctcgtagatg gaatgatagc 1920 
tgatattcct ggggtgcgat cattcattga tgatgttatt gttttcggca aggatatgaa 1980 atcacacaag gattcactca acaccttgtt cgcacgtctt aaggagtacg gatttcacgt 2040 aaaagccgaa aaatgccatt tttgcaagac tcaacttgtg tacttgggac acgttgtaga 2100 taagcatggt attcgtccag attccgaaaa gatcaagaca attgcttcga ttccaccacc 2160 aagcaatgtg tccgagctac gatcttatct tggagcagtg aatttttacg gaagattcgt 2220 tcgtaacctg cacgaattac gttaccctat ggatcagctg cttaagaagg aatcgaaatg 2280 gaaatggacg ccagagtgtc aggaagcttt cgtcaagttt aaggaagcac ttcagtcaca 2340 tttgctccta acgcactacg atccaaaact tccgatcatc gttgctgcgg acgcatcaaa 2400 cacaggaatt ggtgcagtca tttttcatca atttactgat ggaaaaatga aagcaattca 2460 acacgcgtca cgaacactta cacccgctga acagaactat ggacaaccag aaaaagaagc 2520 tctcgcatta gtttacgcag tatgcaagtt tcacaaatac ttgcttggac gtcatttcac 2580 tttgctcacg gatcacaaac cattactttc aatttttggt tcaaaaaaag gtataccact 2640 tcataccgct aaccgtttgc aaaggtgggc acttaccatg ttgaattacg atttcgaaat 2700 tcaatacgtg tccacacaag atttcggatg cgcagatctt ttatcacgat tgattgaccg 2760 aaacaagcag ccggaagaag agtacgtaat tgcaacactg actttagaag atgacctttc 2820 gagcattctg tccgatacat cacagaaggt tccgatttca tttcaagcac tccgtaaagc 2880 aaccgcttca agctcaacac tacaagcagt ctgcaaattc attcgtgaag gttggccgaa 2940 ttgttccact aatcttccaa ctgcaatcca accttactac gcgaggcgcg aatcattatc 3000 aatagtccaa ggatgcgtta tgtttggtga cagagttgtt gtaccaaata tatttcaaaa 3060 aaagattcta caacaatttc atcgaggaca cccagggata gttcgaatga agtcaatcgc 3120 tcgaagctat gtttactggc ctggtattga taaagaaatc gaggattttg ttaaatgttg 3180 tagtccgtgt gcaattacag cgaaaacacc gacaaagaca actttggaat tttggcccat 3240 accatcaaaa ccatggtcca gagtacacat tgactatgca ggcccagtag acggattcta 3300 cttccttgta atcgtggatc cacactcgaa atggccggaa gtttacgcta ccagatcaat 3360 aactgcgaga acaacaataa gaattttgaa acaaattttc gcaactttcg gagtgccaga 3420 agttctcgtg tctgataacg gtactcaatt taccagttac gagtttaagg agttttgcgt 3480 tagtcaaggc atcaaccact tgcgcattgc tccatatcat ccgcaatcca acgggttagc 3540 tgaacgattt gtggatacac tgaaacgaag tattcaaaaa attcgcaagg gaggggaatc 3600 tctcgaagat 
gcactaacca ctttccttca agtatatcga accacaccat ctggagattt 3660 ggatggaaaa gctcctgctg acattatgtt ctctagacca ttacgaacta tatcgtcgct 3720 cctcaaacca agcgagcacg gaaatgttga gccgaggaac agaatgaagg aagccgaatt 3780 tttcaacaaa aagcacgggg cagtgaaacg atgttatcaa cagggcgatg ctgtttatgt 3840 caagatatat cgtagaaact cctggcagtg ggaagcagca accgtaatcg acaaaatcgg 3900 caacgttaat tataacgttt tccttaaaga aaaacagcag ttagtacgat cacacaccaa 3960 ccagctgaaa tctcgattgg caaacgggca aaacatggca gaattttcaa caccactatc 4020 tgtactgctt gatgattttg gtttgaaaac acctttaccg tcagaccaac aatcaacgtc 4080 gtcacagttt gtttcctgtg atgaacctgt aacaacatca agcgatactg aactagcatc 4140 tcagacactc tgttcaactc cagatcattc tgaattgggt aacattagtt ccgaaaatga 4200 caacgcagaa gaatccgagg gagaagaacc agttgttcaa cagcaagagc agcaatcatc 4260 aattttggaa agaagtcgaa gagttatcaa attaccagaa cggttcaaat cttactggat 4320 gccgaatcca taagggggga ga 4342 // ID GYPSY29-LTR_AG repbase; DNA; ANG; 161 BP. XX AC . XX DT 16-APR-2004 (Rel. 9.03, Created) DT 16-APR-2004 (Rel. 9.03, Last updated, Version 1) XX DE GYPSY29-LTR_AG is an LTR of retrotransposon GYPSY29_AG - a DE consensus sequence. XX KW Gypsy; LTR Retrotransposon; Transposable Element; KW 4-bp TSD GYPSY29_AG; GYPSY29-I_AG; GYPSY29-LTR_AG; Gypsy clade; KW MDG3 lineage. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-161 RA Tubio M.J., Costas C.J. and Naveira F.H.; RT "GYPSY29_AG, a member of the MDG3 lineage of the Ty3/gypsy group RT of LTR retrotransposons in Anopheles gambiae."; RL Repbase Reports 4(3), 51-51 (2004). XX DR [1] (Consensus) XX CC GYPSY29-LTR is a long terminal repeat of GYPSY29_AG (its internal CC portion is deposited as GYPSY29-I_AG). 
XX SQ Sequence 161 BP; 58 A; 24 C; 32 G; 47 T; 0 other; tgtaaaagtt tggaaaatgt atatacgtgt aaacgctact tgatgtcacg gtacgtgtat 60 attagcatta ggttattttt gacagctgcc attagaacag aataaaaagc gaacgaatca 120 ggacgccaaa tcgaagtgac tacaatttaa tcgattttac a 161 // ID BEL13-I_AG repbase; DNA; ANG; 5752 BP. XX AC . XX DT 03-APR-2003 (Rel. 8.03, Created) DT 18-JUL-2005 (Rel. 10.08, Last updated, Version 2) XX DE BEL13-I_AG is an internal portion of the BEL13_AG LTR DE retrotransposon - a consensus sequence. XX KW BEL; LTR Retrotransposon; Transposable Element; 5-bp TSD; KW BEL13-I_AG; BEL13-LTR_AG; BEL13_AG; Bel clade; RING Zn-finger; KW integrase; peptidase; reverse transcriptase. XX NM BEL13-I_AG. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-5752 RA Kapitonov V.V., Pavlicek A. and Jurka J.; RT "BEL13_AG, a family of Bel/Pao-like LTR retrotransposons from RT African malaria mosquito."; RL Repbase Reports 3(3), 33-33 (2003). XX DR [1] (Consensus) XX CC BEL13_AG is a young family of Bel/Pao-like LTR retrotransposons. CC BEL13-I_AG, an internal portion of BEL13_AG, is flanked by CC BEL13-LTR_AG CC LTRs. The BEL13-I_AG consensus sequence was reconstructed based CC on CC multiple alignment of 6 copies; they are less than ~1% divergent CC from CC the consensus sequence. CC The consensus sequence encodes one 1762-aa BEL13_AGp Bel-like CC protein CC (pos. 429-5714). CC BEL13_AGp is composed of the peptidase A16 (pos. 150-290), RING CC Zn-finger (pos. 385-422), reverse transcriptase (pos. 800-930) and CC integrase (pos. 1480-1620) domains. 
XX FH Key Location/Qualifiers FT CDS 429..5714 FT /product="BEL13_AGp" FT /translation="MSVRRSDVHKTPAESEEWKIAPIDMEEAGPSSEIPAG FT LDEAKEVHRPIDKNQCIRKHFLNKLQRIEEALSGTSLGDTTFLKGCANRLT FT SLASEYEKWHQTVLETADMENFEDGEEEYARFEKRHFNLLLRIERGMSITT FT NVSQSRVKLPELRLPTFDGSLEAWLPFRDSFSSLIDANASLSDVDKLRYLK FT GALTKEANKLIADIEITSANYIVAWELLKARYENKKLAVKRHIDALFLIPV FT MKKDSYESLIHILDSFERSVNITKQLGVATEGWSVLLAHMLHSRLDSATQM FT HWEAHHRSTDVPEYYELLTFLKSHALVLQAMLSPGQKKEQYTSSWKQRSKS FT EVHVVNSSMEICSFCKKGSHSPFKCDMFSGWTVQERYDKVKEKKLCINCLL FT PGHIMKNCTSSVCRVCNKKHHTMLHKPVQTSNSAEASPNDREIVTQPDPPA FT DQNVVTYCGNALLTNSIETPSTILLPTALVKIELPDGSLHWARALLDGGSQ FT INLVTERLCQRLQVIKKRENHPIGGVGQSKHVSSHSTQLTIKSHCTSFKAN FT WKFHVMRYITWNLPAEKVNKTRYCIPNTCTLADPKFYEPSSIDLLIGRESY FT DELMLEGILKLVPEKVMLQNTELGWIVSGRVELERRPTSSIVNLVCTNQDL FT ENQLTKFWEIESCNTNSTMSIEETSCEKVFSETTTRDDQGRFIVTLPTKKD FT IVPQLGNSFEIAKRRLNSLNRRLASNKDLKAAYIAFLEEYVQLGHMEEITE FT QHTNIDTPIYYLPHHCILRPDSLTTKLRVVFDASCATDSGLSLNDALMVGP FT VVQDDLVAIMIRFRLPKFAIVADIEKMYRQVWIKKEDRSLQRILWQNCPEN FT KLRIYELKTITYGTASAPYLATKCLQMLSVHGTSTHPEASRVLANEFYMDD FT LLTGVETQTEGEELCHQLTDLLSSAGFTLRKWASNSSQILQSIPVDQRDTS FT GLCSLDINSSIKTLGLKWIPATDELGFCVPIWTEDEQITKRIALSDASRLY FT DPLGLIGPTIMIAKCFMQNLWALQKAWDEPLEKELHKQWNQFRQQLSIVKD FT MRIPRRVVGSTHRIEIHGFSDASMKAYGACLYMKSVSEDGKVSVNLLCSKS FT RVAPLANSKRQKNVTLPRLELSAALLLCHLWQKVKDSLKHEYSCFYWVDST FT IVLHWINSSPSRWKPFVANRVSEIQHLTEPRHWNHVPGDQNPADIISRGMM FT PSQLQESCLWWHGPEWLSQPSNTWKLHHPILDCPPSEFEERKTVLIINKQS FT NIHHPIFSLKSTFSGLVRLMAYMQRFSYNCKPVNRNNRRQGYLQTFELHAA FT RENLVRIAQNESFADDIRSLETAGEVKTSSSLRSLTPMLVNGVLRIGGRLR FT NAPVAYDRKHPMILPYKHPLTRLVMDFYHLKTLHAGQQLLIASVREKYWPL FT RVRNLARQVVHECIQCFRCKPSTMEQIMGDLPAERVTPTFPFLNTGVDFCG FT PLFYRSASRKSAPVKCYVAVFVCLATKAIHLELVADLSSDAFISTLKRFVA FT RRGKPSLLQCDNAKNFRGAERKLKVFHQQLQQQQFQQSISSYCGPEGIEFR FT FIPPRSPHFGGIWEAAVKSFKHHFRATIGTSILRRDDLETIIAQVESCLNS FT RPLTPISTEPEDLEVLTPGHFLIHRPLVAVPEPSYEEVPSNRLDRYQQNQE FT FVRRIWNRWSTDYLSGLQPRTKWTKQRDNIHIGTLVLMKEDNLPPLKWSYG FT RVTQIYRGDDGNIRVVTVKTKDGEYKRAITKICVLPIHSNTE" XX SQ 
Sequence 5752 BP; 1667 A; 1200 C; 1313 G; 1572 T; 0 other; ttttggtcct tcaaatccgg atatcgtttg aaaatcgtga tctatggtga tctttcgtga 60 acattcgtac gcctttgtga acagtcgagc attatcgtat gcgttgttca taggtgttcc 120 tttcgttgta tgcgccttgt tcgtgtggag ttttatcttg ttgttattat ctatggtgat 180 aatttgaaga cagttcctgt tagaggatac tttgttgaat atcatacaat tgagtgtgat 240 attgtgcaaa ttcagcttgc tatcgggtat gctgtattag acgcggacaa tttttaggaa 300 aagatacatt ggtgtgcgtg tacgtgtcgt caggctgtac ctgtttggaa gtgttggtga 360 gatcgcttcg cagccatttt gaagagcagc atcttaaaaa gtattgtcgg cgcaaatagc 420 tcaacaaaat gtcggtcagg agaagtgatg tgcacaaaac gccagcggaa tccgaggaat 480 ggaaaattgc acccattgac atggaagaag ctggaccgtc atcggaaatc cctgcgggtt 540 tggatgaagc aaaggaagtg catcggccta ttgataaaaa ccaatgtata cgcaagcatt 600 ttcttaacaa attacagcga atcgaagaag cattgagtgg tacatcgttg ggtgacacga 660 cgtttctgaa gggatgtgca aatcggctta cctcactggc gtcggagtat gaaaaatggc 720 atcaaaccgt cttggaaacg gccgatatgg agaattttga ggacggtgag gaggaatatg 780 cacgattcga aaaacgtcac ttcaacttat tgctacggat tgaacgtggt atgtcgatca 840 ctacaaatgt ttcgcaatct cgtgtaaagc ttcccgaatt acggctgcca acgtttgatg 900 gctctctcga agcttggctg ccttttcgcg attcttttag tagcctgatt gatgctaatg 960 caagcctgtc agatgtggac aaattgcggt atttgaaggg agcattgaca aaggaagcaa 1020 acaagctaat tgcggacatt gagattactt cggctaatta catagtggca tgggagcttc 1080 ttaaggctcg ttatgaaaac aaaaagctag cggtgaagcg acatattgat gctttgtttt 1140 tgataccggt aatgaagaag gattcgtacg agtcgctcat tcacatttta gatagtttcg 1200 agagaagcgt caacataacg aagcagcttg gtgtggcaac agagggttgg agtgtgcttc 1260 tggcgcacat gctacattct cggcttgatt cggctactca aatgcattgg gaggctcacc 1320 atcgcagtac tgacgttcca gaatactatg aacttctaac gttcttgaag agtcatgcgt 1380 tagtgttgca ggcaatgctg tctccaggcc aaaagaaaga acaatataca tcgtcgtgga 1440 agcagcgatc gaaaagtgag gtgcacgtgg taaattcttc tatggagata tgctcgtttt 1500 gtaaaaaagg ttcacattcg ccctttaagt gtgacatgtt tagtggctgg acagtacaag 1560 agcgctatga caaggtcaaa gaaaagaagc tttgtatcaa ctgtttgttg cctgggcata 1620 ttatgaagaa ctgtacatct agtgtatgcc 
gagtttgcaa caaaaaacat cacactatgt 1680 tgcacaaacc agtacaaacc agtaattcag ctgaagcatc tcctaatgac cgggagatag 1740 tgacgcaacc ggatccacct gcggatcaga acgtggtcac atactgtggc aatgctttgc 1800 tgacaaattc gatagaaaca ccatctacta tcctattacc cacagctttg gtgaaaatag 1860 agttaccaga cggatcgttg cattgggcac gagctctttt agatggagga tcgcagatta 1920 atcttgtaac tgagcgtttg tgtcagcgat tgcaggtcat taaaaagaga gagaaccatc 1980 caattggcgg agttggacaa agcaagcatg tatcatcgca ctcgacacag cttaccataa 2040 aatctcattg cactagcttc aaggcaaatt ggaaatttca tgtaatgcgt tacattactt 2100 ggaatttacc tgcagagaaa gtgaacaaaa cacgttactg cattcccaac acttgtactc 2160 tagcagatcc caaattctat gaaccatcat ccatcgattt attgatcggc agagagagct 2220 atgatgagct gatgctagaa ggtattctta aactggtacc cgagaaagta atgctacaga 2280 acaccgagtt aggctggatc gtttctggca gggttgagct cgaacgtcgt cctacatcct 2340 ctatagtaaa tctggtatgt actaatcaag acctagaaaa tcaactaact aaattttggg 2400 aaatcgaatc ttgtaataca aatagcacca tgtctattga agaaacatca tgtgagaagg 2460 ttttctctga aacaaccacg cgtgatgatc aaggaaggtt tattgtaacg cttcctacga 2520 aaaaggatat cgttccacaa ctcggaaact cgtttgaaat cgccaaacgt agattgaact 2580 cgctaaatcg tcgtcttgca tcaaacaaag accttaaggc tgcttatata gcctttttgg 2640 aggaatatgt tcaattggga catatggaag aaatcacgga acaacatacg aacattgaca 2700 cacccattta ttacttaccg catcactgta ttttacgccc tgacagttta actacaaagc 2760 ttagagtcgt gttcgatgcg tcttgtgcta ccgactccgg cctctcgttg aatgatgcgt 2820 taatggtggg tccagttgtt caggatgatt tggtcgcaat tatgattcgt tttcgtctgc 2880 ccaaatttgc aatcgtagcg gacatcgaaa agatgtaccg acaagtatgg attaaaaaag 2940 aggatcgctc gttacaaaga atcctttggc agaattgtcc cgaaaataag cttcgaatat 3000 atgagctaaa aacaattacg tatggcactg catctgctcc ttacttagca accaaatgtt 3060 tgcaaatgct ttccgttcat ggaacttcta ctcatccgga agcctctaga gtactggcaa 3120 atgaatttta tatggacgac ttgcttaccg gagtagaaac tcaaacagaa ggagaagaat 3180 tatgccatca actaactgac ttgttgtcga gtgctggttt tactttgaga aaatgggcgt 3240 ctaactcttc ccaaattctg caaagtattc ccgtagacca acgcgatact tctggattgt 3300 gcagccttga cataaatagt tccattaaga cattgggtct 
taaatggatc ccagcgaccg 3360 atgagttggg attctgcgta ccgatttgga cagaagatga acaaataaca aagcggatag 3420 ccttatcaga tgcatcgcgt ctatacgatc cattgggact cataggtcca acaataatga 3480 tagccaaatg ctttatgcaa aatctttggg cactacaaaa ggcatgggat gagccgctag 3540 aaaaagaatt gcataaacaa tggaatcagt ttcgccaaca gctctcaatt gtgaaagaca 3600 tgcgtatccc acgtagagtg gtaggaagta cccatcgcat cgaaatccat gggtttagtg 3660 atgcgtctat gaaggcttat ggagcctgtt tatatatgaa atcggtctct gaggatggaa 3720 aagtttctgt taaccttttg tgctctaaat ccagagttgc tccactcgca aatagcaaac 3780 gacagaagaa cgtcacttta cctcggctgg aattatctgc agctttattg ttatgccatc 3840 tctggcaaaa ggttaaagat agtcttaaac acgagtattc gtgtttctac tgggtggatt 3900 cgaccatcgt gcttcactgg ataaacagta gcccgtcacg ctggaagcca tttgttgcta 3960 atcgggtgtc tgaaatccag catctgacgg aacctagaca ttggaatcat gttcctgggg 4020 atcaaaatcc agctgatatc atctctagag gaatgatgcc gagtcaattg caagaatcat 4080 gcctatggtg gcatggacca gagtggttaa gccaaccatc caatacttgg aagctacatc 4140 atccaatact cgattgccca ccctctgagt ttgaagaacg gaagactgtt ctgatcatta 4200 acaaacagtc taatattcat catccaatat ttagcttaaa atcgacattc tccggactcg 4260 ttagactgat ggcatatatg caacggttta gctacaactg taaacctgtt aatagaaaca 4320 atcgtcgtca aggctacctt caaacatttg aactacatgc agcaagagag aatttggtac 4380 gcattgcaca gaacgaatct tttgctgatg atattcggtc tctcgaaact gctggagaag 4440 tcaaaacatc gtcatcttta agatcattga caccaatgct tgtaaacggt gtactacgca 4500 ttgggggacg tcttcgaaat gctcctgttg cttacgatcg gaaacaccca atgatactgc 4560 cctacaaaca tccattgaca cgtcttgtca tggatttcta tcatcttaaa accttacatg 4620 ctggacaaca actgttgatt gcttctgttc gagagaaata ctggccttta cgcgttcgaa 4680 accttgctcg gcaagtagta cacgagtgta tccagtgctt ccgttgtaaa ccatcgacaa 4740 tggaacagat aatgggagat ttacccgcag aacgagttac tccaactttt ccgttcctga 4800 acactggtgt cgatttctgt ggaccgctgt tttatcgttc ggcgtccagg aaatctgctc 4860 cggtgaagtg ctacgttgca gtatttgtgt gcctggccac gaaggctatc catctagaat 4920 tggtagctga tttgtcgtcg gatgcgttca tatcaacact gaaacgattt gtcgctcgtc 4980 ggggaaaacc atctcttctt cagtgtgata acgctaaaaa ttttcgtgga 
gcggaacgaa 5040 aattgaaagt gtttcatcaa caactgcagc aacaacaatt tcaacaatca atttcgtcgt 5100 attgcggtcc agaaggtata gagtttcgtt ttatccctcc taggtctccc cactttggtg 5160 gaatctggga ggccgccgtc aagtctttta agcatcattt tagagctact attggaactt 5220 cgatcctgcg tcgagacgac ttagaaacga tcatcgccca ggtggaaagc tgcttgaatt 5280 cgcgcccttt gaccccaatc agcacggaac ccgaggattt ggaggtgctt actccagggc 5340 atttcctgat ccatcgtcct ctggttgctg ttcctgaacc ttcatacgag gaagtgccat 5400 ctaatcgcct ggatagatat caacagaatc aggaattcgt gagacgcatt tggaaccgat 5460 ggagtacaga ctacctgtct ggcttgcagc cccgcacgaa gtggacgaaa caacgggaca 5520 acattcatat cggaactctc gtgttgatga aggaagacaa cttgccaccg ttgaaatgga 5580 gttatggacg agtaactcaa atctaccgag gagacgacgg caacatccgc gtggtcacgg 5640 tgaagacaaa ggacggcgaa tacaaacgag caattacgaa gatctgcgtt ttgcccatcc 5700 attccaacac ggaataagtg gtggaaattt caatttccac gggggccggc ta 5752 // ID BEL14-LTR_AG repbase; DNA; ANG; 400 BP. XX AC . XX DT 03-APR-2003 (Rel. 8.03, Created) DT 03-APR-2003 (Rel. 8.03, Last updated, Version 1) XX DE BEL14-LTR_AG is a long terminal repeat of the BEL14_AG LTR DE retrotransposon - a consensus sequence. XX KW BEL; LTR Retrotransposon; Transposable Element; 5-bp TSD; KW BEL14-I_AG; BEL14-LTR_AG; BEL14_AG; Bel clade; KW reverse transcriptase. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-400 RA Kapitonov V.V., Pavlicek A. and Jurka J.; RT "BEL14_AG, a nonautonomous family of Bel/Pao-like LTR RT retrotransposons from African malaria mosquito."; RL Repbase Reports 3(3), 36-36 (2003). XX DR [1] (Consensus) XX CC BEL14-LTR_AG flank an internal portion of BEL14_AG (deposited as CC BEL14-I_AG). 
XX SQ Sequence 400 BP; 106 A; 80 C; 109 G; 105 T; 0 other; tgttacgaaa agaacttagg cgagccctgc aattatctac caaaccgggc gaaatttgca 60 gcgagttaga aatgatcggg cgaacccgat cgtggccgga gcgcgatgaa caccacacgt 120 gtgtgtgggt gagtgctcaa gaggccgaaa aggaagaggt tcgggttcgg gatatcgcag 180 ggcattcggc aagcgtacga gcagggaacg ggttggcaga gagcgatagg cgaacctttt 240 ttgtagttct tgatttttca ttgctttttt ctttgctcta actggataga tttgcttaag 300 tacatacagt tcaactttag ttaatctgtt gggtgaaaat aaagacatta ttgtagagct 360 gaactcacaa acgcctacta cgcgcctcct tgtgcttaca 400 // ID ISEAG1 repbase; DNA; ANG; 4879 BP. XX AC AB097148; XX DT 17-NOV-2004 (Rel. 9.1, Created) DT 29-OCT-2010 (Rel. 15.11, Last updated, Version 2) XX DE Anopheles gambiae retrotransposon IseAg1, complete sequence. XX KW L1; Non-LTR Retrotransposon; Transposable Element; KW gag-like domain; ISEAG1. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-4879 RA Kojima K.K. and Fujiwara H.; RT "Cross-Genome Screening of Novel Sequence-Specific Non-LTR RT Retrotransposons: Various Multicopy RNA Genes and Microsatellites RT Are Selected as Targets."; RL Mol. Biol. Evol 21(2), 207-217 (2004). XX DR Genbank; AB097148; Positions 1 4879. 
XX SQ Sequence 4879 BP; 1201 A; 1278 C; 1217 G; 1183 T; 0 other; cagtacgaat caaaacgtca acgagtgcag tcgtgttttg attcattttt gcaaaccccc 60 gcgatggcga ataagcgatc acaatatgcg gacggaacgt tccgcatcga tttctcgatg 120 gtgccaaagc gaccgtctcc ttcggagacc gtcgatttca tgtgcgagcg gctgggaatc 180 ggcggaaaaa aagtgcaagt cgcccagcta aatagtgcga agagctgtgt gtttgtgacg 240 atggaaaggg tggaggatgc ggaggccatc gtcaaggaac acagaggaaa gcactccatc 300 acgcacgagg aaaagcactt tcccatcaat attgagatgg tagatggggc ggtagacgtg 360 tccgtgcgcg atctgcctca tctaatcccg gacgaggaaa ttgttgccga gatggcacaa 420 tacggtgaag tcctatccgt caccaagggc gtatgggccc ccgattcacc gtaagctggt 480 gcgctcaacg gtgtgcgtgt gttgaggatg aagctgttaa agcccatcct ctcctacgta 540 acactctgtg gggaggttac aggtgtgtcc tatcgtggtc aggcacaaac ttgccgcaac 600 tgtgctgcac cggtgcacca cggtctgaat tgtgtacaga accggcaaaa tcggtttgcg 660 aacgttgtac aagtaaaggc cacgtacgcc aacaccgtta gtgcaaaaac agttgcaccc 720 acaccacaag cagcgtcaca cactgcagca ggcaatagtg gccagaagaa aaagaagaag 780 agccgctcaa ggttcctatc ggcggcgaca cctgcaccta cacacgcaaa cgtggaagat 840 cagcatccac gcgacgcatc gaagatcgcc gcacgactga tcatcccgat cgtgccggtg 900 accgttcggc cttccagccc gaagccgaac tctcgcaagg ggaagataag caatgcgaac 960 gataagcatg tcgctccagc tctatctatc gctcccacga cggatgctgt cgtctccgcg 1020 catgatcggc gcttcgatga tgcgtcatca cctgccgttc ctgccgctcc tgtggccacc 1080 gccgctcttg cggctaccgc ctttgctgcg actaatgctg cttctgtggc taccgccgct 1140 cctgcagcta tcaccgctcc tgcagctaac gccgcatcca cggctgctgc tcctgctgct 1200 gccaccgctc atgcggctac cgcctctcct gtggctaccg ccgcactcgc tgccggcgct 1260 cctgcgactg tctccactcc tatggataag gatgatcctg ctgctgccgc cgctcctgct 1320 actgccgaag tccctggtgc tgtcgtcgct aacccagcgg caacaagtgc tccactggcc 1380 tttaaggtac cgctcgatgt ccttcctgcc ccctttccag gcccttccac ggatgaaccg 1440 cgaacggtaa cacgtaagcg aaccactgag tcagatgtcg agagtgacgc ctcaagttca 1500 tccatgaaca gctttacgat ggtcgccaag cgtaagcctg gtcgaccatc caaaaaggcc 1560 accaccacta acaacctctg aatttcgttg cgatggatac ggtgaccaat aatataagaa 1620 caattgttaa tggtagctgt aatatagctt 
ctattaacat caacaccata tccagcgcaa 1680 cgaaattaga agctcttaag acatttatca ggacgatgga tctggatgtt atatttctgc 1740 aggaagtata tcatacggat ctagcgcttc caggatacaa tgttttgtct aatgttgacg 1800 cgtcgaggag gggtacggcg atcgctttac gggaccattt aaaattttct cacgtcgagc 1860 gcagcttgga ctcacgtcta atctgcgttc ggttggagaa tatagccacc ctatgtaacg 1920 tttatgctcc ttcgggaagc cagcgtgggg cggagcggga ggaatttttc aaccgtaccg 1980 tggcgttcta tctgcggaat gcatgtcctc atgtcattct tgcgggcgac ttcaactgtg 2040 tgttgaagtc gaaggatgtc acgggtgggg ggaatttcag ccttgctctg cgaaacgccg 2100 taaacagcat gggtatgagt gatagctggg aggctctcag aggcaactct gtggagtttt 2160 cctacatcac tagtggttcg gggtcacgca tcgatcgttg ttatgtctcc tcttcgctgg 2220 aaacacagtt aaggactacc gatatgcatg tgctctcctt ttcggatcac aaggcactca 2280 cagtgcgcct ctgcctccca accccgccca atcgtttgac caacaacggt tattggcagc 2340 tgaggccaca cgtattaact gaaagaaacc tggaggagtt cagatgcaag tggaataact 2400 ggactaaaca gcgtcgaaac tacggcacgt ggatctcgtg gtgggtggag ttcgcgaagc 2460 ctaaaataaa atcatttttc cgctggaaaa caaatgagag attccgaagt ttccacttgc 2520 accacgaact cctctacaga gggttgaagt cctcatacga gcgatatctg tctgacccta 2580 acgagttgac gaatatcaat cgtataaaag gtaaaatgct tctccttcag agacgttttt 2640 ccgaggactt taaacgtatc aatgaaaccc gcgtatgtgg cgaaaacatc tcgaccttcc 2700 agcttgagga aaggaggagg aggcgaactg tgatcgaaaa actaaatatt gaagacggga 2760 catccttaac tgataaagac gcgataaatg ttcacattag gagcttcttt tcagagctat 2820 acaccacaga caacacagac aacacacaaa cagacgacac atttacatgc acacgcgtca 2880 ttcctgagga ttgtgagatc aataatggtt gcatggagga aatcacttct ggtgagatcc 2940 tctatgcgat caaaaactca cagtctcgaa agtcccctgg cccagatgga atacccaagg 3000 agttttatct ccgcgcgttc gacgtcatcg agcgcgagct tgggctcgtg ctcaatgagg 3060 cactccgcgg agagattcct cgaagcttcg tcgacggggt catagtgctg gtgaggaaaa 3120 aggggggtgg ggatgccatg tcttcaattc gacctatatc actccttaac acagactaca 3180 agcttttgtc gaaggttatt aaaccccgtc tcgattgtat aatcgggaaa tgggggataa 3240 tctcgacagc gcaaaagtgt tgcaacaaac cttgcaacat ttttcaagct gtactctctg 3300 ttaaggagag ggtggttgat ttgaagatga aacgcaagtg 
tggtaaacta gtctcctttg 3360 atctctcgca ggcgttcgat cgcgtcgata gaggcttcct ctttaataac atggtttcta 3420 tgggtttcaa tcctgggctg gtggggcttt tgagaaggtt tggtgatcag tcctcctccc 3480 gaatcctagt caacggatcc ctatctcccc ctatttccat ccgacgatcc gttcgtcagg 3540 gggatccact ttccatgcac cttttcatcc tataccttca tcccctcatc accagactag 3600 aaggcatatg ctgcgaccag gatgacctgg tcaatgcgta cgccgacgac atctcggtgg 3660 tgactacctc atcccaaaaa atcgagttgg tgcgtgaggc ttttgaagcg tttggcagag 3720 tttccggagc ccgcctcaac gtcgacaaaa ccatcgcgct cgacgtggga tacaccacat 3780 caaatgccat tcaggtccct tggcttcgta ccgtggagag gctaaggctc ctcggtatat 3840 tgtttaccaa caatgtacga gaagctatga gcctcaactg ggacctgttg atccaccatt 3900 tccgacagct ggtgtggctt catcgcgtga gggacttaaa cgtggtgcag aaggttgttc 3960 cgctgaacac cttcctactc cccaagctgt ggtttgttgc atcggtttgt ggcgcccgtg 4020 caatggacat agcgaaggtg acctgcaccg tgaacacgtt cctctgggat ggatccggag 4080 gcttccgagt cccactgcag caactggcac tgcctcgcaa ccgtggtggg cttaaccttc 4140 accttcctgc catcatggcc aaggcgctgt tgacgaaccg gtatgttacg gaacaggatt 4200 gtctacgagt cggaggtcag cacatttctt gtgctgggaa tcctccgaac atcgccgctg 4260 tttcgtcaac gtacccgtgt ctgcgtatcg tcatacagca acttggctac caaccaccat 4320 cagccaccat cacgacacgc tggattcgcc aaacaatgac ggaagtgctt cttgagccga 4380 aggtatcctt ggaaaatccg tcagtgaatt ggcggctact ctggcgcaat attcatcgtt 4440 cctgtctatc gtcccttcaa cgcagcacgc tcttcttgct ggtaaacggc aagatcagtc 4500 atggagagct gctgcatcga atgaaccgcg tcccatctcc gtcatgttcg ttttgttctg 4560 caatagacac cctggaacac aaattcgctg gctgcagaag agttgctgat gcttggcaga 4620 ttctgcacca gcgaattagc gttgtaattt cagggtgtcc gtcttcgtca acgttattgt 4680 tcggtttgtt gcgtttgcct gtactaaacg gcattaatgc aggtaaacgc agccatatcc 4740 taagtttgtt tgcgaactat gtcatccaca tcattggaaa caatgatgct gtgattgaca 4800 tagaagttct gcgttgttca ttgtaaaaat taatgttaaa tagaacctaa gcacgttttt 4860 aataaaacat taaaaaaaa 4879 // ID GYPSY56-LTR_AG repbase; DNA; ANG; 226 BP. XX AC . XX DT 09-JUL-2004 (Rel. 9.06, Created) DT 09-JUL-2004 (Rel. 
9.06, Last updated, Version 1) XX DE GYPSY56-LTR_AG is an LTR of retrotransposon GYPSY56_AG - a DE consensus sequence. XX KW Gypsy; LTR Retrotransposon; Transposable Element; KW 5-bp TSD GYPSY56_AG; GYPSY56-I_AG; GYPSY56-LTR_AG; Gypsy clade; KW mag lineage. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-226 RA Tubio M.J., Costas C.J. and Naveira F.H.; RT "GYPSY56_AG, a member of the mag lineage of the Ty3/gypsy group RT of LTR retrotransposons in Anopheles gambiae."; RL Repbase Reports 4(6), 152-152 (2004). XX DR [1] (Consensus) XX CC GYPSY56-LTR is a long terminal repeat of GYPSY56_AG (its CC internal portion is deposited as GYPSY56-I_AG). XX SQ Sequence 226 BP; 66 A; 43 C; 64 G; 53 T; 0 other; tgttgtgacg agtagaaaac gcccttaggc actcaggagc tgcgcgcctc ggaacgtgag 60 tgtgttcggg tttagccagg gatcgtgagg acactctcgg ataagcggtc aagcaagtag 120 gaacgcgatt aggaagtaag cggaataaaa aagtatcctg ttagctgtat gagtaaacgc 180 gtgttgtttt aacgttttat attacgaacc acccgaaaac gtaaca 226 // ID Ag-Jen-11 repbase; DNA; ANG; 4178 BP. XX AC . XX DT 29-OCT-2010 (Rel. 15.1, Created) DT 29-OCT-2010 (Rel. 15.1, Last updated, Version 2) XX DE A Jockey clade non-LTR retrotransposon family from Anopheles DE gambiae. XX KW Jockey; Non-LTR Retrotransposon; Transposable Element; Ag-Jen-11. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-4178 RA Biedler J. and Tu Z.; RT "Non-LTR Retrotransposons in the African Malaria Mosquito, RT Anopheles gambiae: Unprecedented Diversity and Evidence of Recent RT Activity."; RL Molecular Biology and Evolution 20(11), 1811-1825 (2003). XX RN [2] RP 1-4178 RA Kojima K.K.
and Jurka J.; RT "Jockey clade non-LTR retrotransposons from Anopheles gambiae."; RL Direct Submission to Repbase Update (24-SEP-2010). XX DR [2] (Consensus) XX CC [2] Consensus update. This consensus is generated from 4 CC sequences with >96% identity. XX FH Key Location/Qualifiers FT CDS 144..1277 FT /product="Ag-Jen-11_1p" FT /translation="MGCSKKGSAPRDAVPSCSYITIPNVTTSNDYDILSED FT EEAPSIIAASATAKQQKSKIPPITVTGTQVGDVHKKITSVGVVNYTTNCTY FT KGIVVNTTTSKDFKLVVDVLKRHNNSFYTHQLPEEKTTKVVIFGLPEQDTE FT YIKTVLAEVNIAPCVIKKLELKKKRYDDQCMYLLHFPRGSITLSELKQVRA FT VDHHRVYFEYYSNKYGPMQCTNCQRYGHGSANCYLPPVCVKCADKHTSKTC FT PLSQTTTCPDGKIAPAKLKCANCSEHHTANFSGCPARLRLKKVTPTPSRSR FT EFTYNETSFPRLPNTPTRSTPSNHVNFSDLFHQQQTRSNNNLDFDLNEIGQ FT IVKDVFSNLSKCSSKQDLIAVIVQISAKYCFPNLK" FT CDS join(1228..2331,2420..3883) FT /product="Ag-Jen-11_2p" FT /note="apurinic-like endonuclease and reverse FT transcriptase." FT /translation="SQLLFKYQQNIVFLTSNNVLKIIHWNANGISSKKDEF FT FNYLLKKDIHIAAISETFLKNGTRFSHPNYHTYRLDRTQGEKGGVALVISK FT AINHEVNSDYKLKVLEAIGITIKTQNSKVTIISIYNPGSNNDHSSFRKDIT FT KLSKISNHFIICGDFNARHRFWNCSKANQMGKTLFEENQRGKFSIVHPEEH FT THYPNDPNRRPSTIDLLLTNKPEKICNILTIQKFTSDHLPVYFEYNIGLIQ FT NKHISGIPNYSEANWTLFKEILNNRLNTNNLNLQNINTSNQIDQMIKYFSK FT LINYAHTKSVPFIIPDKFKITLPPNILQKIKHRNATRKQWQKDRNNTILKT FT TYNTLKKRNNIRYKKLTKPELDRQPKANGKCYITNNEKCEILRDHFEKAYR FT TTFPNQSPIEHHIQETNSRFFSYHQAQPSNNFDPINFIKPKHIKNIIYKLK FT SKKSQGPDQINNHSLKQLPKKAIVYLAFILNACTKIHYFPLEWKIGKVIPI FT PKLKKDHKDPKSYRPISLLSCLGKIFEKILHKKIIKHLNQHTIIPSYQYGF FT QPGHSTTKQIDRLTTNIKTQRNIKKSTGFVMLDLEKAFDTVWHQGLLYKLI FT SFNFPPNIIQLIQSFLSDRSCFVQITSSKSNDIHPPAGLPQGSVLSPVLFN FT IFTSDLPKTKSVKKYCFADDFAMSSSDSNPKSIIKNLNNGIKKYVTFCKTW FT KLKINEEKSEAIFFTRCTSAYKLPDRNLKISDWDIAWKDNVKYLGLFLDKR FT LTFRKHIEEKVISGSRLIKXLYSFINKNSKLNLKNKLLLYKTVIRPTILYS FT SKIWANCANVHINKLQVFQNKFLKRVLNLPPWHSTFDIHRITGFELVKNFV FT RNT" XX SQ Sequence 4178 BP; 1598 A; 865 C; 622 G; 1091 T; 2 other; gagttcttcg tcagtctccg acccgatcgg acgcatctaa tcgttatctg cgatcactga 60 tatagaacac acacactagt aaaacagaca 
actacaacaa caaaatagcg aaggcgttaa 120 aaaggctagc tcggcccata gccatgggct gcagcaagaa aggatcagca ccgagggacg 180 ccgtgccttc gtgctcatac atcacgatac caaacgttac cacaagtaac gattacgaca 240 ttcttagcga agatgaagaa gctccatcga taattgccgc atcggcaacg gcaaaacaac 300 aaaaatcaaa aataccaccg atcacagtca ccggcacaca agtaggagat gtgcacaaaa 360 aaatcacctc cgttggtgta gtgaactaca caactaactg cacatataaa ggtatagtgg 420 ttaacactac cacatctaaa gattttaagc ttgtagtgga cgtgttgaag cgccacaaca 480 attctttcta cacgcatcaa ctaccggaag aaaaaacaac gaaagtggta atctttgggt 540 tacccgagca agatacagaa tacattaaaa ccgttcttgc tgaagtcaat atcgcaccat 600 gtgtaatcaa aaaactcgaa ctcaagaaaa aacgctacga cgaccagtgt atgtatctcc 660 tacacttccc tagagggtca ataacactga gcgagctgaa acaggttaga gcagtagatc 720 atcaccgagt atactttgag tactactcta acaaatacgg tccaatgcag tgtacaaact 780 gccagcgcta cggacatggt agcgcaaact gttatcttcc tccagtgtgc gtaaagtgtg 840 ccgataaaca cacatccaaa acgtgtccac taagccaaac aacaacctgt ccagacggta 900 aaatagctcc agccaaattg aagtgcgcaa attgcagtga acatcatacg gcaaatttct 960 ccggctgccc agctaggcta cggcttaaaa aggtaacacc aacaccttct cgctccagag 1020 aatttacgta caacgagaca tcgttccctc gtctccccaa cacaccaaca cgttccactc 1080 cttcaaacca cgtgaatttc tctgacctgt tccaccaaca acaaacacgt tcaaacaata 1140 atcttgactt tgacctcaac gagatagggc aaatcgttaa agatgtattt tcaaatctga 1200 gcaaatgtag ctcaaaacaa gacctaatcg cagttattgt tcaaatatca gcaaaatatt 1260 gttttcctaa cctcaaataa tgtacttaaa atcattcact ggaatgccaa cggtatttca 1320 agcaaaaaag acgaattttt taactacctt ttgaaaaaag atatacacat cgcagccata 1380 agtgagacat ttcttaagaa tggtacccgt ttttcgcacc ccaattatca tacttacaga 1440 ctagacagaa ctcagggaga aaaaggtgga gtagctttag taatatctaa agctatcaac 1500 cacgaagtaa atagtgacta caaattaaaa gtacttgaag caataggaat taccatcaag 1560 acccaaaaca gtaaagtgac cattatttcc atctataatc caggttcaaa caatgatcac 1620 agtagtttca ggaaggacat cacaaaacta agcaaaatat caaatcactt tattatctgt 1680 ggtgatttta acgcacgtca ccgtttctgg aactgtagca aagcaaatca gatgggaaaa 1740 acccttttcg aggaaaatca aagaggaaaa ttttctattg ttcatcctga 
agagcacacc 1800 cattatccaa atgacccaaa tcgtcgcccc tcaaccatag atctactatt aactaacaaa 1860 cccgaaaaaa tttgcaacat tcttaccata caaaaattta cctcagatca cttacctgtt 1920 tattttgaat ataatattgg cctaatacaa aacaaacata tttctggaat accgaactac 1980 tcagaagcaa actggacttt attcaaagaa attttaaata atcgcctcaa tacaaataat 2040 ctaaatctcc aaaatattaa cacatcaaac caaatagatc aaatgataaa atatttctca 2100 aaattaatta actacgccca cactaagtct gtcccattca taataccaga taaatttaaa 2160 ataactctac cccctaatat acttcaaaaa attaaacatc gaaatgcaac aaggaagcaa 2220 tggcagaaag atagaaataa cactatactt aaaacaactt ataacacatt aaaaaaaaga 2280 aataatatta gatataaaaa acttacgaaa ccagagttgg acagacaacc ttgaaaaaat 2340 taataacgac ccaagcaaat ctagactatg gaaatttgtt agaattataa aaggtaattc 2400 ttcatttatt cctccctgaa aagctaatgg aaagtgttat attacaaata atgaaaaatg 2460 tgaaattctg agagaccatt ttgaaaaagc ttacagaaca acgtttccaa atcagagccc 2520 aattgaacat cacatccaag agacaaatag tagattcttt tcctaccatc aagcacaacc 2580 cagcaacaac ttcgatccca taaactttat taaaccaaaa cacattaaaa acatcattta 2640 taagttgaaa tcaaaaaaat ctcaagggcc agatcaaatc aataaccatt ccctaaaaca 2700 gttgcctaaa aaagccatcg tatatttagc gtttatttta aacgcttgta ccaaaatcca 2760 ttattttcct cttgaatgga aaattggcaa agtcatccca attccaaaac tcaaaaaaga 2820 ccataaagac cctaaaagct atagaccaat aagtttactc agttgtttag gtaaaatttt 2880 tgaaaaaatt ttacacaaaa aaattattaa gcatctcaat caacacacca tcataccttc 2940 ttaccaatac ggttttcaac cagggcattc aactactaaa caaattgata gattaaccac 3000 aaacatcaaa acacagcgaa atattaaaaa atccactggt tttgtgatgc tagatctaga 3060 aaaagcgttt gatacggtat ggcaccaagg gcttttgtac aaactaattt ctttcaattt 3120 tcctccgaac atcattcaac tgattcagtc atttttatcc gatagaagct gtttcgtgca 3180 aataacctct tcgaaatcaa atgacattca cccacctgcc ggattacccc aaggcagcgt 3240 gctgtcaccc gttcttttca acatatttac aagcgacttg cctaaaacaa aatctgtaaa 3300 aaaatattgt tttgcagatg actttgccat gagtagttcg gattccaatc caaaatctat 3360 tattaaaaac ctcaacaatg ggattaaaaa atatgtgaca ttttgtaaaa cttggaaatt 3420 aaaaatcaac gaggaaaaat ctgaggcmat attttttaca cgttgcacta gcgcatataa 
3480 attaccagat cgtaatctaa aaataagtga ctgggatatt gcttggaaag acaatgtaaa 3540 gtatttggga ttgtttttgg ataaaaggct caccttcagg aaacatatag aggaaaaagt 3600 tataagtgga tcaagactta taaaaawttt gtatagcttc ataaacaaga attctaaatt 3660 gaacttaaaa aacaaactgt tactttataa aaccgtaatt aggccaacaa ttctttactc 3720 atctaaaata tgggcaaatt gtgcaaatgt tcatataaat aagttacaag tatttcagaa 3780 taagttcttg aaaagagtgc taaaccttcc cccatggcat agcacatttg atatacatag 3840 aattactggt tttgaactag taaaaaattt tgttcgtaac acataagttt tgtaatgcca 3900 gtgaaggcac aattgtaaat agctagattg taatgtagat gtaaaaatag gagtaaggac 3960 tagtacgtag gattatctta aggtaaaaca atcagagggt tttttttcta atttttcccg 4020 atctacttca attcaggtaa agatgtacaa tccaaatacg caaagcgtta taacaatcag 4080 gttattgagt agaggtcaat cgatcggaac caatgtataa ttcgtggaat attgttacaa 4140 aacaaatcaa taaattactt acttacttac ttacttac 4178 // ID GYPSY72-I_AG repbase; DNA; ANG; 4102 BP. XX AC . XX DT 09-JUL-2004 (Rel. 9.06, Created) DT 09-JUL-2004 (Rel. 9.06, Last updated, Version 1) XX DE GYPSY72-I_AG is an internal portion of retrotransposon GYPSY72_AG DE - a consensus sequence. XX KW Gypsy; LTR Retrotransposon; Transposable Element; 4-bp TSD gag; KW AP protease; GYPSY72-I_AG; GYPSY72-LTR_AG; Gypsy clade; KW MDG3 lineage; RNase-H; integrase GYPSY72_AG; KW reverse transcriptase. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-4102 RA Tubio M.J., Costas C.J. and Naveira F.H.; RT "GYPSY72_AG, a member of the MDG3 lineage of the Ty3/gypsy group RT of LTR retrotransposons in Anopheles gambiae."; RL Repbase Reports 4(6), 183-183 (2004). XX DR [1] (Consensus) XX CC GYPSY72_AG is a family of Gypsy-like LTR retrotransposons that, CC according to the amino acid sequence of its reverse CC transcriptase, RNase and integrase is phylogenetically grouped CC with representatives of the MDG3 lineage of other organisms. 
CC GYPSY30_AG, GYPSY31_AG, GYPSY32_AG, GYPSY33_AG, GYPSY34_AG, CC GYPSY35_AG, GYPSY36_AG, GYPSY37_AG, GYPSY38_AG and GYPSY71_AG CC are other members of this same lineage in Anopheles gambiae. CC The GYPSY72-I_AG consensus was reconstructed after multiple CC alignment of 4 copies. The consensus encodes the 1341-aa CC GYPSY72_AGP gag-pol like polyprotein (pos.35-4057). The CC sequence of the LTRs flanking GYPSY72-I_AG is deposited as CC GYPSY72-LTR_AG. CC GYPSY72_AGP: CC MDIPASIEELMEQFTRVGLVKRCEALGLDTSGSKEELAGRIVEHSEGGNQDTTRVGDYLDAVEG CC SSTSSRESVDREARGVSIHFKDVQDYLPTFSGKEDVCAWVIEFETSSKLFKWHDLHKLIYAKRV CC LQGAAKAFVRSIPEVSSWIDLKHALVEEFEEKVSSATVHESLRRRKKKSDESCSEYIYAMCEIG CC KKGHVDEQSLCEYIVQGIDDDPMNKTCLIAAKTIKQLKSQMRTYEKLAEQIREQRSKRQIPAKN CC PDSNKTKYAPPASMKCRSCGKTGHIVKDCPNKAAGPVCFACGKPGHRARECSERPHFSGNANLI CC SSDKGGMITMKIGEKELNTLFDTGSQFNLLSEVAQNTVSCLNVTPTTMCFNGFGGKRTAALGKI CC ETNVTIDQHIYQGIQFYIVPASSMGYDAVIGRDALKDMEATVTQQGIKVRPIKNNAEESECVSE CC VLLCEPDKITVPPKFRETIASIVSEYEEARCEGKSVQSPVKLTIVPNERIVPFRHTPGRLAYPE CC EKAVDDQVEEWLTKGIVRPSTSDFASRVVVVKKKDGSNRVCVDFRKLNAMILKDGFPIPVIDEV CC LQKLQNAKWFSVMDLENGFFHVPIDEASKKYTAFVTKKGFYEFNRAPFGLCNSPAVFMRFVNHV CC FLNLINEGVLEIYMDDLIVYAESEEECLLKTKRVLLTAAAYGLAVKWNKCKFMQSSVFFLGHYV CC ENGTISPSPEKISAVTKFSVPRNVKAVQAFLGLTGFFRKFVKNYAQIARPLTNLLRNDAVFVMG CC HEEMHAFNLLKEILASEPVLRLYERKAETEVHTDASKEGYGAVLLQRFDGKLHPVAFWSKKTTD CC AEAKKHSYVLEAKAVYLAVKKFRHYLLGMRFKLVTDCIAFKQTLQKSEVPREVLSWVVYLQDFT CC FDTEHRPGTRLKHVDCLSRYPVRVMIVNDDVTAKIGEAQKKDETIKAIREILASRPYESYKLKG CC GLVFKTIDGIDLLVIPKVMEKEIISIEHNNGHFATQKTVHGLKQKYWIPHLEQKVKQVIDNCVR CC CIMYNKKLGNKEGYLHPIEKGDRPLQTLHVDHVGPMDATGKQYKYILTVVDGFSKFVWLYPTKT CC TNAQETIRKLENWTAIFGSPERIVSDRGAAFTSLVFADFIKENGIEHVVCTTGVPRGNGQAERV CC NRTVIAALAKLSAEDSTKWFKWVDQVQQALNSHFNASTKKTPFKVMFGVEMRHIVTGELVKLIQ CC EEMYERFEEERQEIRLEAKEAIGRAQLEYKKQYDKKRKPEPGYSVGDLVAIKRTQFVAGKKLAS CC EFLGPYEITQVNRNGRFKVKKAAECEGPNITTTSADNMKLWTYTVNNEDLEAYDSDESSEI. 
XX SQ Sequence 4102 BP; 1335 A; 709 C; 1089 G; 969 T; 0 other; tttggtgtca gaagtgggat aacgcaattt tgctatggat atcccggcga gtatcgaaga 60 gttgatggag caatttaccc gtgtcggctt agtaaaaagg tgtgaggcct tagggctaga 120 cacaagcggt tcgaaggagg aattagctgg cagaatagtg gagcacagtg aaggtggaaa 180 tcaggacacc acgagggtgg gtgattattt ggatgcggta gaagggtcaa gcacgtcgtc 240 acgagagagc gtagatcgtg aggcccgggg cgtgtccatt cacttcaaag atgtgcagga 300 ttatttgcct acattctccg gaaaggaaga cgtatgcgca tgggtgatcg agtttgagac 360 aagtagcaaa ctgttcaaat ggcatgatct tcacaagctc atttacgcaa aacgagttct 420 ccagggcgca gccaaggctt ttgttagaag cattccggaa gtatcgtcgt ggatcgattt 480 gaagcacgcg cttgtggaag aattcgagga gaaagtgtcc agtgcaacgg tacacgaatc 540 acttcgacgt agaaaaaaga agtcggatga atcatgttca gaatatatct acgctatgtg 600 cgaaatcggg aaaaagggac atgtcgatga acaatccctt tgtgagtaca tagtacaagg 660 aatagacgat gatccaatga ataaaacctg tttgatcgct gcaaaaacga ttaaacaact 720 gaaaagccaa atgcgaactt atgagaaatt ggcagagcaa attcgcgagc aacgatctaa 780 gcgtcaaatt cctgctaaga accctgacag taataaaaca aaatatgcac ctccagccag 840 tatgaaatgc cgtagctgcg gtaaaacggg acatatcgtc aaagattgcc caaacaaagc 900 agccgggcca gtttgttttg cgtgtggcaa gccagggcac cgagccaggg agtgcagtga 960 aagaccccat ttttcgggga acgcaaattt gatcagcagc gacaaaggag gaatgatcac 1020 gatgaagatc ggtgaaaaag aattgaacac tttgttcgac acgggtagcc agtttaacct 1080 tttgtccgag gtagcacaaa acacagtaag ttgcttgaac gtcacaccaa caacaatgtg 1140 tttcaacggg ttcggtggaa aacgaaccgc agcgctgggc aaaattgaaa cgaacgtgac 1200 aatcgatcaa cacatttatc aggggataca gttttatatc gtgcccgcca gcagcatggg 1260 atacgacgca gtcattgggc gagacgcgtt gaaagacatg gaagcaacgg taacacaaca 1320 gggtataaaa gtacgcccca ttaagaataa cgcggaagag agtgagtgtg ttagtgaagt 1380 gttgttgtgc gagccggata aaataactgt gccacccaag tttcgtgaaa caattgcgag 1440 tattgtgagt gaatatgaag aagcacgttg tgaaggtaag agtgttcaaa gtccagtgaa 1500 gttaacaata gtaccgaacg agcgcattgt tccttttcga cacacgccag ggcgattggc 1560 atacccagaa gaaaaagcgg ttgacgatca ggtagaagaa tggttaacaa aaggaattgt 1620 gcgaccgtcg acgtccgatt ttgccagtcg 
cgtagtggtg gtaaaaaaaa aggacggcag 1680 caatcgcgtg tgcgtggatt ttaggaaact aaacgcgatg atattaaaag atgggtttcc 1740 aatcccggtg atcgacgaag tgttgcagaa gctccagaat gcaaagtggt tttcagtgat 1800 ggacttggaa aacggttttt tccatgttcc gatcgatgaa gcgagcaaaa aatacactgc 1860 gtttgtaacg aaaaagggtt tttacgagtt caatcgagcg ccttttgggt tgtgcaactc 1920 gcctgcagtt tttatgcgtt ttgtaaacca tgtgtttcta aatcttatca acgaaggagt 1980 gttagaaatc tatatggacg atttgatcgt gtacgcagaa tcggaggaag aatgtttgtt 2040 aaaaacgaaa cgagtgcttt taacggcagc ggcgtacggg ctcgctgtga agtggaataa 2100 gtgtaagttt atgcagtcgt cagtgttttt cttaggacac tacgttgaaa acggtacaat 2160 ttcgccaagt ccggaaaaga tcagtgcggt gaccaagttt agcgtgccaa gaaacgtgaa 2220 agctgtgcag gcatttcttg ggttgaccgg attcttcagg aagttcgtga aaaattacgc 2280 gcagattgct cgacctctta ccaatctgtt gcgtaacgat gcggtgttcg tgatgggtca 2340 tgaagagatg cacgctttta atctgctgaa ggagattttg gcgagtgaac cggtgttgcg 2400 gctgtacgag aggaaagcgg agacggaagt gcatacagac gcgtcgaagg aagggtatgg 2460 tgcggtgttg ctccagcgtt ttgacggaaa gctgcacccg gtagcgtttt ggagtaagaa 2520 gacgacggat gcggaagcga aaaagcatag ttacgttttg gaagcgaaag cggtttactt 2580 agcagtaaaa aagtttcgtc attatttgtt ggggatgcga tttaaattag tgactgattg 2640 tattgcattt aagcaaacgt tgcaaaaatc ggaagtgcct cgggaagtgt tgtcgtgggt 2700 agtatattta caagacttta cttttgatac ggaacatcgg ccaggaacca gattgaaaca 2760 tgtagattgt ttaagccgtt atcctgtgcg tgtaatgata gtgaacgatg atgtaacagc 2820 gaaaataggt gaagcgcaaa agaaagatga aactatcaaa gctatacgag agatacttgc 2880 tagcaggccc tacgaatctt ataaattaaa aggcggatta gtttttaaaa ctattgacgg 2940 tattgattta ttagttatac ctaaagtgat ggaaaaggaa ataattagta ttgagcataa 3000 taacggacat ttcgccactc agaaaactgt tcatgggttg aagcaaaagt actggatacc 3060 tcatttggaa caaaaagtga agcaagtaat tgataactgc gtacggtgca ttatgtacaa 3120 taaaaagtta gggaataaag aaggatattt gcatcccatt gagaaaggag atagaccttt 3180 gcaaacattg catgtagatc atgtcggacc catggatgca acaggaaaac aatacaaata 3240 tattctcaca gtggtggacg gattttcaaa atttgtatgg ctgtacccca ctaaaacaac 3300 taatgctcag gaaacgatta ggaagctaga aaattggaca 
gcgatttttg gaagtccaga 3360 acgtatagtg agtgatagag gagcagcctt tacttcacta gtttttgccg attttataaa 3420 agaaaatgga atagaacatg tagtgtgtac caccggtgtg ccaaggggaa acggtcaggc 3480 agaaagggta aaccgaacag taattgcagc tttagcaaaa ctttcagcgg aagattcaac 3540 gaaatggttc aaatgggttg accaagtcca gcaagctttg aattctcact ttaatgcatc 3600 aactaagaaa acacctttta aggtcatgtt tggagtagaa atgagacata ttgtaacagg 3660 ggaattggtg aaactaatac aggaggaaat gtacgaaaga tttgaggaag aacgacagga 3720 aataagatta gaagcgaaag aagcaatcgg aagggctcag ttagaatata aaaagcagta 3780 cgataaaaaa cgaaagccag agccagggta tagtgtaggc gatctagtag cgataaaacg 3840 aacacagttt gtggcaggca aaaaattggc aagcgaattt ctagggccat acgaaattac 3900 acaggtgaac cgaaacggac gtttcaaggt caagaaagct gcagaatgcg aagggccaaa 3960 tatcaccaca accagtgcag acaatatgaa gctttggact tatacggtca acaatgagga 4020 tttagaagca tacgatagcg atgaaagcag tgaaatatag tgcctggcag gggctgcaaa 4080 tcaggcgtca ggaaaggccg at 4102 // ID GYPSY65-LTR_AG repbase; DNA; ANG; 140 BP. XX AC . XX DT 09-JUL-2004 (Rel. 9.06, Created) DT 09-JUL-2004 (Rel. 9.06, Last updated, Version 1) XX DE GYPSY65-LTR_AG is an LTR of retrotransposon GYPSY65_AG - a DE consensus sequence. XX KW Gypsy; LTR Retrotransposon; Transposable Element; KW 5-bp TSD GYPSY65_AG; GYPSY65-I_AG; GYPSY65-LTR_AG; Gypsy clade; KW mag lineage. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-140 RA Tubio M.J., Costas C.J. and Naveira F.H.; RT "GYPSY65_AG, a member of the mag lineage of the Ty3/gypsy group RT of LTR retrotransposons in Anopheles gambiae."; RL Repbase Reports 4(6), 170-170 (2004). XX DR [1] (Consensus) XX CC GYPSY65-LTR is a long terminal repeat of GYPSY65_AG (its CC internal portion is deposited as GYPSY65-I_AG). 
XX SQ Sequence 140 BP; 41 A; 32 C; 34 G; 33 T; 0 other; tgtggtatgt gttgtaagag ctcggcttta ttccgctcct tgctagagcg cggtataatc 60 tgacaacgat gacagatagt cagaacacat acagaacgga gtgaagccag atcgtgtatt 120 acagcaagct acccaccaca 140 // ID RTE-2_AG repbase; DNA; ANG; 3172 BP. XX AC . XX DT 28-FEB-2009 (Rel. 14.02, Created) DT 29-OCT-2010 (Rel. 15.11, Last updated, Version 2) XX DE RTE-like non-LTR retrotransposon - a consensus sequence. XX KW RTE; Non-LTR Retrotransposon; Transposable Element; RTE-2_AG; KW Ag-JAMMIN-1. XX NM RTE-2_AG. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-3172 RA Jurka J.; RT "RTE-like non-LTR retrotransposons from African malaria RT mosquito."; RL Repbase Reports 9(2), 647-647 (2009). XX RN [2] RP 1-3172 RA Biedler J. and Tu Z.; RT "Non-LTR Retrotransposons in the African Malaria Mosquito, RT Anopheles gambiae: Unprecedented Diversity and Evidence of Recent RT Activity."; RL Molecular Biology and Evolution 20(11), 1811-1825 (2003). XX DR [1] (Consensus) XX CC This is the same element as Ag-JAMMIN-1 in [2]. 
XX FH Key Location/Qualifiers FT CDS 14..1654 FT /product="RTE-2_AG_1p" FT /translation="MLGAACPFCPGILRDFNQGCVGPTTRLEVGGEALQAT FT PGKTYATXGNQRISNRTNRYRPTQPNKANDWKLGTWNCRSLTAPGSTRILS FT DEVRARGFGIVALQEMRWKGVTERPYRSDCMIYQSGGEKHELGTAFLVIGE FT MRKRVIGWWPINERMCRLRIRGRFFNLSIINVHSPHLGSTDDDKDNYYTQL FT EREYDRCPQHDVKIVIGDFNAQVGREEAFKPTIGSFSAHRLTNDNGLRLVN FT FASSKHMNIRSTFFQHAPRFSYTWRSPQQTLSQIDHVLIDGRHFSDIIDVR FT TYRGANVDSDHFLVMVKLRQKLCVANKLRYQPTPRLNTDRLKQAVVARDFA FT IALGEALPEDNTTEAMSLNDHWRMVEQAISSTAERTIGRVTHNQRKEWFDD FT ECRRALSEKNAARTRMLQRETRQNVEDYRRLRRQQTLLFQDKKRXFEESDE FT QLMQQLSQSGETRKFYRMLNAARSGFTPMTAICRNEEGDILSDEREVIDRW FT KCYFDGHLNGADTGEADAGSRGEQPYDSSSMTMTKCPRHLWTKSSAPSNS* FT " FT CDS 1705..3150 FT /product="RTE-2_AG_2p" FT /translation="MGPERLAVIMHRLIVRIWDQEELPDEWKLGVIHPVYK FT KGDRLDCANFRAITVLNAAYKILSRILFCRLSPLATDFVGSYQAGFVGGKS FT TTDQIFTLRQILQKCREHQIPTHHLFIDFKAAYDTIDRNELWNTMQQYGFP FT GKLIRLLRATMDGVQCKVRVTNMLSESFESHRGLRQGDGLSCLLFNIALEG FT VMRSAGFDIRGTIFTRTLQFLGFADDIDIIGRTTAAVCEAYTRLKREAARI FT GLRINATKTKYLLAGGSDRDRARLGSRVSVDGDDLEVVEEFCYLGTIVTSD FT NNVSSEIRRRIVQGNRAYYGLHKLLRSRRLQQHTKCAIYRTLIRPVVLYGY FT ESWTILTEDANALAIFERRVLRTIFGGVCEHGVWRRRMNHELAELFGGADI FT LTVIKAGRIRWLGHVMRMPDSCPTRKVLVSDPFGTRRRGAQRARWLDQVES FT HLSEIGCSRGWRTAAQDRVSWKRIGDLAMSTRRAHT*" XX SQ Sequence 3172 BP; 813 A; 824 C; 930 G; 593 T; 12 other; gcgtctgttc tccatgttag gggcggcttg ccccttctgt cctggcatcc tacgggactt 60 taatcaagga tgcgtaggcc ctacgacgag actggaggtt gggggagagg ccttgcaagc 120 cacccctgga aaaacatacg caacgaawgg aaaccaaagg atttcgaacc ggaccaaccg 180 ataccgaccc acgcaaccga acaaggcaaa cgattggaaa ctcgggacat ggaactgcag 240 atctctcaca gcaccyggaa gtacccgcat tctttcggac gaggtgaggg cccgtggctt 300 cggaatagta gcacttcagg agatgcgctg gaaaggagtg acggagcgcc cstatcgtag 360 cgattgcatg atctaccaga gcggtggtga aaagcatgaa ctcggtacgg cgttcctggt 420 cataggtgag atgcgaaagc gagtgatcgg gtggtggccg atcaaygaac ggatgtgcag 480 gttgaggatt cgtggcaggt tcttcaacct gagcattata aatgtgcata gtccgcacct 540 tggragcacc gatgacgata aagacaatta ttatacgcag 
ttggagaggg agtacgaccg 600 ctgcccacaa cacgacgtta aaattgtcat cggggatttt aatgctcagg tcggacggga 660 ggaggcattt aaaccgacga taggaagttt cagtgcccac cggctgacca acgacaacgg 720 gcttcggctc gtaaacttcg cctcctccaa gcacatgaac atycgcagca ccttcttcca 780 gcacgcacct cgcttcagtt acacctggag atcaccgcag caaacacttt cccagatcga 840 ccacgttctc atcgatggaa ggcacttctc ggatataatc gacgtaagga cctatagagg 900 agcaaacgtc gactcagacc atttcctggt tatggttaag ttacgccaga aactgtgcgt 960 ggcyaacaaa ctgcgctatc agcccacccc aaggctcaac acagaccggc tgaaacaagc 1020 tgttgtggcg agggacttcg caatcgcgct tggggaagcg ctgccggagg acaacactac 1080 cgaggcgatg tctctcaatg accactggcg tatggtggag caagccatca gcagcacggc 1140 cgagcgaaca attggccgcg tgacccataa ccagaggaag gaatggtttg atgatgagtg 1200 cagacgagca ctctccgaga agaacgcmgc gcggacccgc atgctccagc gcgagacccg 1260 kcagaacgtg gaagactaca gacgactgag gaggcagcag accctrctct tccaggacaa 1320 gaagcgcsgc ttcgaagagt cggacgarca actcatgcag cagctatccc agtcggggga 1380 aactcgcaag ttctacagga tgctgaatgc ggcacggagc ggttttactc ccatgaccgc 1440 tatatgccgc aatgaggagg gagatatcct gtcggacgag cgagaggtga tcgacaggtg 1500 gaagtgctac ttcgacggac acctgaatgg agcagatacc ggagaggcag acgcaggaag 1560 cagaggagag caaccctacg acagcagcag catgacaatg acgaagtgcc cccgccatct 1620 ttggacgaag tcatcagcgc catcaaacag ctgaagtgta acaagtcagc tggcagcgat 1680 ggtctggtgg ccgaactgtt caagatgggt ccggagaggc ttgccgtcat catgcatcgg 1740 ctgattgtga ggatttggga tcaggaagaa ctaccggacg agtggaaact gggtgtcata 1800 cacccagtgt acaaaaaggg cgacaggctg gactgtgcta acttccgagc catcacagtc 1860 ctgaatgctg cctacaagat cctgtcccga atactcttct gcagactttc gccccttgct 1920 acagatttcg tcggcagcta ccaagctgga tttgttggag gcaaatcaac taccgaccaa 1980 atctttactc tacggcagat cctccagaaa tgccgagagc accagatccc tacgcaccac 2040 ctgttcatcg acttcaaggc ggcctacgat accatagatc ggaacgagct atggaacacc 2100 atgcagcagt acggattccc tgggaagctg atacggctgt tgcgggccac tatggacggg 2160 gtgcagtgca aggtgagagt gacgaacatg ctgtcggaat cgttcgaatc tcaccggggt 2220 ctgaggcaag gggacggact ctcctgtttg ctgttcaaca tcgctctgga 
aggtgtcatg 2280 cgaagcgcgg gcttcgacat ccggggcacg attttcaccc gaactctcca attccttggc 2340 ttcgcggatg acatcgacat catcgggcga acaactgcgg cggtgtgcga ggcgtacacc 2400 cgactgaaac gcgaagccgc aaggattgga ttgaggatca atgcgacgaa gacaaagtac 2460 ctgcttgccg ggggctctga ccgtgataga gcccgactcg gaagcagagt atcagttgac 2520 ggcgacgatc tcgaggtggt agaggagttc tgctaccttg gcacgatcgt aacttcggac 2580 aacaacgtaa gcagcgaaat ccgaaggcgc attgttcagg gaaatcgtgc ctactacggg 2640 ctccacaaac tcctgcgatc cagaagactc caacaacaca cgaaatgcgc gatatatcgc 2700 acactgattc gtccggtggt cctctacggg tacgagtcct ggactatact gacggaggac 2760 gccaatgcac tcgccatttt cgaacggcgg gtgctaagga ctatctttgg cggtgtgtgc 2820 gagcacggcg tgtggagaag gaggatgaac cacgagctag ctgagctgtt tggcggtgca 2880 gacatcctga cggtcatcaa agccggaagg atacgatggc tggggcacgt gatgaggatg 2940 ccggactcat gccccaccag gaaggtgctc gtcagcgacc cgttcggcac gaggcgtaga 3000 ggagcacagc gagctcgctg gctggatcag gtggagtcgc acctgtcgga gatcggatgc 3060 agccgtggtt ggaggactgc agcccaggac cgagtttcct ggaaacgaat tggcgacctg 3120 gccatgtcta cgagacgtgc tcatacatga gcaggccaag aagaagaaga ag 3172 // ID GYPSY43-I_AG repbase; DNA; ANG; 5274 BP. XX AC . XX DT 16-APR-2004 (Rel. 9.03, Created) DT 16-APR-2004 (Rel. 9.03, Last updated, Version 1) XX DE GYPSY43-I_AG is an internal portion of retrotransposon GYPSY43_AG DE - a consensus sequence. XX KW Gypsy; LTR Retrotransposon; Transposable Element; 4-bp TSD gag; KW AP protease; GYPSY lineage; GYPSY43-I_AG; GYPSY43-LTR_AG; KW Gypsy clade; RNase-H; reverse transcriptase; KW integrase GYPSY43_AG. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-5274 RA Tubio M.J., Costas C.J. and Naveira F.H.; RT "GYPSY43_AG, a member of the Gypsy lineage of the Ty3/gypsy group RT of LTR retrotransposons in Anopheles gambiae."; RL Repbase Reports 4(3), 78-78 (2004). 
XX DR [1] (Consensus) XX CC GYPSY43_AG is a family of Gypsy-like LTR retrotransposons CC that, according to the amino acid sequence of its Reverse CC Transcriptase, CC RNase and Integrase is CC phylogenetically grouped with representatives of the GYPSY CC lineage of other organisms. CC GYPSY39_AG, GYPSY40_AG, GYPSY41_AG, GYPSY42_AG, GYPSY44_AG, CC GYPSY45_AG, CC GYPSY46_AG and GYPSY47_AG are other CC members of this same lineage in Anopheles gambiae. CC The GYPSY43-I_AG consensus was reconstructed after multiple CC alignment of 3 copies. CC The consensus encodes the 402-aa GYPSY43_AG1p gag-like CC polyprotein (pos. 425-1630) and the 1150-aa GYPSY43_AG2p CC pol-like polyprotein (pos. 1585-5034). CC The sequence of the LTRs flanking GYPSY43-I_AG is deposited as CC GYPSY43-LTR_AG. XX FH Key Location/Qualifiers FT CDS 1585..5034 FT /product="GYPSY43_AG2p" FT /translation="YFGKFLGSCRNNFTDLDFLPYITGTHPKTSQRFKLLL FT DSGANKNILKPGIIQSLQPVNTVIKNASGCHNVSQKGSINLLGPELPNQMY FT YEYDFHKFFDGIIGSQYLARCNSIINYSNETVTIAGKIIPFVKYFPTNKFF FT HHIVSIDTLENGDWFVPCHQLLCNNLIIEPGLYKSENNKSIVNIISQSHIA FT PTITTSFHLNVNNFETINPIPLKVDEPLNKKAVEILIRSDHLTPYEKDNLF FT DVILNNQQVLLKRNEKLSSTTVVKHKIITKDDEPTYSKTYRFPNHFKKDVE FT EQILEMLEDGVIIPSNSPYSSPIWVVPKKPDASQKRKIRVVIDFRKLNEKT FT INDKYPIPQIEEILDSLGKSTYFSTLDLKSGFHQIEMDPAHREKTAFSTSQ FT GHFEFTRMPFGLKNAPATFQRAMNHVLRGYIGSICFVYLDDIIIIGNNLTS FT HLENLDKVLKRLAACNLKIQVNKCEFLKRETEFLGHLITQDGIRPNPDKIK FT KILDWSLPMNQKEIKQFLGLTGYYRKFVKDYARLTRPLSKCLKQGAKVAYD FT EEDYKKAFEELKHIIASDQVLAYPDFELPFILTTDASNFALGAVLSQVQDN FT CERPIAFSSRTLNSTESNYPATEKEALAIIWAVKKFKPYLYGKKFTLITDH FT KPLTFIKTSFKNSKILNWRLELENFDYEVKYKEGKTNVVADALSRRTDPDK FT SIRDINASSISDSPRSGTNRSSETIHSGDISGDYFVHYTTRPVNLYRNQLI FT FKLGDTNAIVKETPFVNFQRTVITQNNFDSSNVSWFLSMYHNGKQTAIFAA FT ENLMQTIQETFSSHNFTKGHFVVTPNMVEDVTSIERQNFLVTSEHDRAHRG FT ICEVEYQLRRSYFFPCMLKMIRSYVNSCEICSSHKYERKPYNIKISPRPIT FT HKPLDRVHMDIYIINKCSFLSIIDSFTKHLQMMYLKNKNIVQVQKKLATYF FT SIIGLPKEIITDHETTFMSVQLKSFLSSLGVLLQYASCSESNGQIEKTHCT FT 
ITEIINTNKYKYEGADTRSLTKIAVTLYNNSVHTATKFTPNELLFNNSNSV FT NPPEISGKAQILFATAYKNMTKAAKRQTNNNDSKCAPPTLEEREAVFIMPN FT IRTKMQPRATKLIVRNVLDKTFENARGVKRHKQKIKRLKKYN" FT CDS 425..1630 FT /product="GYPSY43_AG1p" FT /translation="MWRNSTIEAFCSDTDSISNDSGSLSYVPTNVDIPLER FT LSLTDNMEQIQAQLEQLTHLVQTLAVSHEQQSQKILTLESNTQQASSSVAS FT SVAQGANLSVNLDAFYKIPDPIKAVPVYDGNKKQLFAWLKTAENALNIFKG FT NVHYAVYQMYEDAISNKIQGRAKDALCLEGNPTDFQDIKGILTKTFSDRYD FT LSTHMCQLWHNKMNNSTNLKGYYMQTKELIQKIKQIARQNEKYNESWSAIN FT HFIDETSLAAFIAGLHEPYFGYVQAARPEDLEGAYAFLCKLKSNESTAQNS FT NVRKVTETKHYTSHNEGLQKRFDCNANRPRMTNQASTSSFMNRNTRNVHSS FT NRDSPVPMETESSRTRLTLNQKNLNTTETFNSNCPRNFENEEDDILVNFWE FT AVETTLPI" XX SQ Sequence 5274 BP; 1907 A; 963 C; 935 G; 1469 T; 0 other; ggcgcagccg gtagtcgttg attagaaaaa aaaaacagtg ctagtgagcc gtaatacgct 60 tacgtagttt gtccggattt gtagctacta caagataggg ttacgtattc atctccgtgc 120 aattttaaaa gtgcggacga tttgtggcaa cgccgttttc tgttcatcga caggaccatc 180 ctcttctccc tttttcatca ccaaacgtcg gattgaggtt aggatcctca cgtgttcttc 240 aacaccaaca cacaggtaag caaacttttg gtttttaatt tctcttgttt gtctgagtga 300 cagagataga aaaacttttg ctactgtgtt tcggatagtc actataaaat agaactattc 360 gatttcttaa atttgaaaaa ttaaaaacgt gttcttagag agacgtttat aaccgaagca 420 ttttatgtgg agaaattcaa ccattgaggc tttctgttcg gatacagatt caatatcgaa 480 cgattccggc agcctctcat acgtgcccac caacgttgat attcctctcg agcgcttatc 540 tttaaccgat aacatggaac aaatacaggc gcaacttgag cagcttacgc atctcgtaca 600 gactctcgct gtctctcacg agcaacagtc tcaaaaaata ctcacacttg agagtaatac 660 tcagcaggct agtagtagtg tagcgagtag cgtagcgcag ggcgctaacc ttagcgttaa 720 tttggatgcc ttttataaaa tccctgaccc gataaaggca gtccctgtgt acgatggaaa 780 caaaaaacag ctttttgcat ggcttaaaac agcagagaac gcactgaaca tttttaaggg 840 aaacgtgcat tatgccgttt accaaatgta tgaggatgct ataagcaaca aaatacaggg 900 gcgtgcaaaa gatgcattgt gcttagaagg taaccctact gattttcaag atatcaaagg 960 gattttgaca aaaactttca gcgatagata tgatctatca acgcacatgt gccaattatg 1020 gcataataaa atgaacaata gtaccaattt aaaaggctat tacatgcaga caaaagaatt 1080 aatacaaaaa attaagcaaa 
tagcaaggca aaacgaaaaa tataatgaga gctggtcagc 1140 tataaaccac ttcattgacg aaacaagcct agctgcattt attgcagggt tacatgaacc 1200 atattttggt tatgtacaag cagcacgacc tgaggatttg gaaggggcat atgcgttttt 1260 atgcaagctg aaatccaacg agtccacagc acaaaattca aatgttagaa aagtgaccga 1320 aacaaagcat tacacttcac acaatgaagg tttgcagaag cgatttgatt gcaacgcaaa 1380 tagaccaagg atgacaaatc aagcttcgac atcgagtttc atgaaccgta acacgagaaa 1440 cgtgcactct tctaataggg attctccagt tccaatggaa actgagtcta gtagaacaag 1500 attaaccttg aatcaaaaaa atttgaatac aaccgaaaca ttcaattcaa attgtccacg 1560 aaattttgaa aatgaggagg atgatatttt ggtaaatttt tgggaagctg tagaaacaac 1620 tttaccgatt tagattttct tccatacata acagggacac atcctaaaac ttctcaacgg 1680 tttaaattac ttttggatag cggtgcaaat aagaatattt tgaaaccggg aataattcag 1740 tcattacaac ctgttaatac cgtaataaaa aatgcatcag gttgccataa tgtttcacaa 1800 aaagggtcaa taaatttgct tggaccagaa ttgccaaatc aaatgtacta tgaatatgat 1860 ttccataaat tttttgatgg aattattgga tcgcaatatt tagctagatg caattctata 1920 attaactata gcaacgaaac agttactatc gctggtaaaa ttattccatt tgttaaatat 1980 tttccaacca acaaattttt ccatcacata gtttcaatcg acactttaga aaatggtgat 2040 tggtttgttc catgtcatca attactctgt aataatctta taatagaacc tgggctctac 2100 aaatcagaaa ataataagtc catcgtaaat attataagcc aatcccatat tgcaccaaca 2160 attacaacca gttttcatct aaatgtaaat aattttgaaa caattaaccc cattccatta 2220 aaggtcgatg agccactcaa taaaaaagca gtagaaattt tgataaggtc ggaccatttg 2280 acaccttatg aaaaagataa tctttttgat gtaatattga ataaccaaca agtactcctt 2340 aagagaaatg aaaaactttc ttcaacaacc gtagttaaac ataaaattat aactaaagac 2400 gatgagccaa cgtacagcaa aacttataga tttccaaatc attttaaaaa agacgtagaa 2460 gagcaaatct tagaaatgct cgaagatgga gtaattattc cctcgaatag tccatattcg 2520 tctcccattt gggtagttcc caaaaaaccg gatgcttcac aaaaaaggaa aataagagtc 2580 gtaatagatt ttcgcaaact gaatgaaaaa actattaatg ataaataccc tattccccag 2640 atagaggaaa tcttggacag tttgggaaaa tctacatact tctcgacatt agatctgaaa 2700 tctggcttcc accagataga aatggaccct gcacatcgcg aaaaaacggc cttctcgaca 2760 tctcagggac acttcgagtt cacgaggatg 
ccgttcgggt taaaaaatgc cccagcgacg 2820 tttcaacgtg caatgaacca tgtactaaga ggatacatag gatcgatttg ttttgtctac 2880 ttagacgata ttattataat tggtaataat ttaacatcac atttagaaaa cctggataaa 2940 gtgttgaaaa gattagcagc ttgtaattta aaaattcaag taaacaaatg tgaatttctg 3000 aaaagagaaa cagagttcct aggtcattta ataactcaag atggaataag acctaatccc 3060 gataaaatta aaaaaatcct cgattggagt ctacctatga atcaaaagga aatcaaacaa 3120 ttcctagggc ttactggtta ttaccgtaag tttgtgaaag actatgctag attaacaaga 3180 cccctttcaa aatgtttaaa gcaaggtgct aaagtagcgt atgacgagga agattataaa 3240 aaggcatttg aagagctaaa acatattatt gcttctgatc aagtccttgc ttatccggac 3300 tttgaactgc ccttcattct cactacagac gcaagcaact ttgcattggg agcggttttg 3360 tctcaagttc aagataattg cgaacggcct atagcatttt ctagtagaac actgaatagt 3420 acagaatcca actatccagc aacagagaaa gaggccctag ctataatttg ggcggttaaa 3480 aagtttaagc cctaccttta cgggaaaaag tttacgttaa tcacggacca taaacctctt 3540 acgtttataa aaacatcttt caaaaattcg aaaatactga attggcgttt agagctagaa 3600 aacttcgact atgaagttaa atacaaagaa ggaaaaacta atgtggtagc agatgcgttg 3660 agcaggagga ctgaccctga caaatctata cgagatatta atgcttcctc catttcagac 3720 tcgccaagaa gtggaacaaa ccgaagtagc gaaacgattc attctgggga tatatccggc 3780 gattactttg ttcactacac aacaagacct gtaaatttgt atcgcaatca gcttatattc 3840 aaattaggag acacaaatgc gattgttaaa gaaacaccat ttgtaaattt tcaaagaact 3900 gttatcactc agaacaattt cgattcaagt aatgtcagtt ggtttttaag catgtaccac 3960 aacggaaagc agaccgctat ttttgctgca gaaaatctta tgcaaacgat tcaggaaaca 4020 tttagtagcc ataatttcac aaaaggtcat tttgtagtta caccaaatat ggtagaagat 4080 gtaacaagca ttgagagaca aaattttctc gttacatcgg agcatgacag agcccataga 4140 ggtatatgtg aagtagaata tcagttacgg cgatcatatt tctttccctg catgctaaaa 4200 atgattagaa gttatgttaa cagctgtgaa atttgtagta gtcataagta cgaacgcaag 4260 ccttacaata taaaaatttc acccagaccg attacccata aaccattaga tcgcgttcat 4320 atggatattt atatcattaa taagtgtagt ttcttatcaa taattgattc atttacaaaa 4380 catctacaaa tgatgtacct taaaaacaag aacatagtac aagttcaaaa gaagttagct 4440 acttattttt caatcatagg attgcctaaa gaaatcataa 
cagaccatga gaccacgttt 4500 atgtctgtcc aactaaaaag ttttttgtcg tcgttaggag tcttgttaca atatgcatct 4560 tgttcagaat cgaatgggca aattgaaaaa acacactgca caataacaga aatcattaat 4620 acaaataagt ataagtatga aggagcagat acaagatctt taacaaagat tgctgttacc 4680 ctttacaaca atagcgtgca tacggcaacg aaatttacac ccaacgaatt gttgtttaac 4740 aattctaata gtgtgaatcc accagaaata tctggaaaag cacaaatatt atttgctacg 4800 gcatataaga atatgacaaa agctgccaag cgccaaacca ataataatga ttcaaaatgt 4860 gctcctccaa cattggagga aagggaagca gtgttcatca tgcctaatat aagaacaaag 4920 atgcaaccaa gagcaacaaa actcattgta agaaacgtac tggacaaaac atttgaaaat 4980 gctagaggag ttaaaagaca taagcaaaaa atcaagaggt tgaagaagta caattagcga 5040 attcttttga acaaaatttg aatattgata aacataccat acttttctag tgaacgtaat 5100 ttaaaggtac ttaatcactt gaaactaata aaaagcgaat gaaattctgt tgtaaggctg 5160 cgtaaggtta ggtaatacta agttgacttt attataacgt ttttatactg tcctattatc 5220 tttgtactgc ctatcctaat cggacgagga cgcccgagct ttaccccccg gaga 5274 // ID GYPSY52-LTR_AG repbase; DNA; ANG; 393 BP. XX AC . XX DT 16-APR-2004 (Rel. 9.03, Created) DT 16-APR-2004 (Rel. 9.03, Last updated, Version 1) XX DE GYPSY52-LTR_AG is an LTR of retrotransposon GYPSY52_AG - a DE consensus sequence. XX KW Gypsy; LTR Retrotransposon; Transposable Element; KW 4-bp TSD GYPSY52_AG; CsRn1 lineage; GYPSY52-I_AG; GYPSY52-LTR_AG; KW Gypsy clade. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-393 RA Tubio M.J., Costas C.J. and Naveira F.H.; RT "GYPSY52_AG, a member of the CsRn1 lineage of the Ty3/gypsy group RT of LTR retrotransposons in Anopheles gambiae."; RL Repbase Reports 4(3), 97-97 (2004). XX DR [1] (Consensus) XX CC GYPSY52-LTR is a long terminal repeat of GYPSY52_AG (its internal CC portion is deposited as GYPSY52-I_AG). 
XX SQ Sequence 393 BP; 116 A; 112 C; 78 G; 87 T; 0 other; tgtagcgacc agagctccat ctggtgcgca cacctagaac taccgacaga aaatgtttaa 60 cgggctagtt aggcagggac tgcacacaaa gctagagcag gacaaacagt ggtagcatgc 120 tgcaccgcag ctataaaaac ggtggcttgc tccatacacg ctctctctcg tcctaaacgt 180 ttcaacaccg acacgcgcat atccgtccgg cttaccaaac acttcttctc cggcaactct 240 cgaacgaaca acactaattt tcgcgtctct cccgaactca ccgttccgcg gcttaaaaga 300 aaaaaataaa tcgaattcta ccaaaacgta gttcgcaaag cgtgttgcgt atctctttaa 360 aaaattaggg accacttttg tggcgcccct aaa 393 // ID MARINERN2_AG repbase; DNA; ANG; 811 BP. XX AC . XX DT 13-DEC-2002 (Rel. 7.11, Created) DT 13-DEC-2002 (Rel. 7.11, Last updated, Version 1) XX DE MARINERN2_AG is a nonautonomous DNA transposon - a consensus DE sequence. XX KW Mariner/Tc1; DNA transposon; Transposable Element; Nonautonomous; KW MARINERN2_AG; nonautonomous DNA transposon; KW mariner/Tc1 superfamily. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-811 RA Kapitonov V.V. and Jurka J.; RT "MARINERN2_AG: a family of nonautonomous mariner/Tc1-like DNA RT transposons from African malaria mosquito."; RL Repbase Reports 2(11), 20-19 (2002). XX DR [1] (Consensus) XX CC There are several hundred copies of MARINERN2_AG in the genome. CC They are ~98% identical to the consensus sequence. CC Some copies are 99% identical to each other. CC MARINERN2_AG copies are flanked by the TA target site CC duplications. CC This element has imperfect 46-bp terminal inverted repeats. CC Putative classification: a nonautonomous Mariner/Tc1-like CC DNA transposon. 
XX SQ Sequence 811 BP; 276 A; 136 C; 142 G; 257 T; 0 other; cacagcaaat tttctcatcc tgcttcagta aagtttttta ctgaaacggt tcagtaatag 60 aatattttac tgattttcag catgaatgtc atgtttttac tgactttcag caaaataatt 120 tactgaattc atttcagtaa tacgcatttt actgaaaatc agtaaaataa gtgtcaaatt 180 agtcgtttca ctggatatca gtaaaacaaa ttactgaaaa tgcagcaatt cattctgtca 240 aatcgacctg tcagtcagaa catcaaaaca aaacaatgcg gtacctactt gctcgtgatg 300 tgcaaaaagt tgattttgta ggcgcttaca ggcaccctgg tgaaatttca ggtgatattt 360 cggcttacgg gattaattta aaacaattgg cagccgtgtt ggtgtgtaaa atgtttgcta 420 caatccactg tgcttgctga aattgcgtgg tgtctgtgag cgcttactga accatctaca 480 cttgcggcgg caaaatacag tataaaaact gtataaaaca acacaaaaca tgtatgaata 540 tgaactgagg gactggcaat atctgcgcat caataaaaac tcaatttaaa ataaatattt 600 cgtcattttt taatcgctac caactactga aaaatcagta atatgagaat ggctttactg 660 ggatttcagt aaacgagctg tcattttggt gagcaccgga cattactgaa tgattcagta 720 aacattcttt ttactgaaag gattactgaa acttcagtta aaaaaatttc agtaaaattt 780 tactgaagtt cagtaaaaaa agttttctgt g 811 // ID HARBINGERN2_AG repbase; DNA; ANG; 424 BP. XX AC . XX DT 12-MAR-2003 (Rel. 8.02, Created) DT 12-MAR-2003 (Rel. 8.02, Last updated, Version 1) XX DE HARBINGERN2_AG is a nonautonomous DNA transposon - a consensus DE sequence. XX KW Harbinger; DNA transposon; Transposable Element; Nonautonomous; KW HARBINGERN2_AG; Harbinger superfamily; KW nonautonomous DNA transposon. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-424 RA Kapitonov V.V. and Jurka J.; RT "HARBINGERN2_AG: a family of nonautonomous Harbinger-like DNA RT transposons from African malaria mosquito."; RL Repbase Reports 3(2), 18-18 (2003). XX DR [1] (Consensus) XX CC There are ~100 copies of HARBINGERN2_AG in the genome. CC They are ~97% identical to the consensus sequence. CC HARBINGERN2_AG copies are flanked by the ANT 3-bp target CC site duplications. 
This element has imperfect 15-bp terminal CC inverted repeats. CC Putative classification: a Harbinger-like nonautonomous CC DNA transposon. XX SQ Sequence 424 BP; 114 A; 109 C; 106 G; 94 T; 1 other; gcgtgcgaac agacggcgtc cacgggcctg cccacgcccg aatgcagagc cgatggcata 60 cgattctatc ttcaccatta acatgagagg gacagagctt ataccccgaa cgtacccgaa 120 aggtatgcat gcttcactca tcaggatcta aaggcaagga caaggcaaaa gagacacaga 180 aaaatttcct cacggctaca catgtccttt tgatgcgttg tctatgcgtc tcgctcggta 240 tcacagcggc atacggcagg taagcagtac aaatgctgct acctgatacg cacacatgtt 300 tatcgaacac aactattcgg ttcggctcgg taaacttcac gaggaygccg gaacaaaagg 360 gcgcagactt tttcatgatt ttgtatgact tcggctgttt tggaccatgc gtgtaggcgc 420 aggc 424 // ID Ag-Outcast-2 repbase; DNA; ANG; 5347 BP. XX AC . XX DT 29-OCT-2010 (Rel. 15.1, Created) DT 29-OCT-2010 (Rel. 15.1, Last updated, Version 2) XX DE An Outcast clade non-LTR retrotransposon family from Anopheles DE gambiae. XX KW Outcast; Non-LTR Retrotransposon; Transposable Element; KW Ag-Outcast-2. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-5347 RA Biedler J. and Tu Z.; RT "Non-LTR Retrotransposons in the African Malaria Mosquito, RT Anopheles gambiae: Unprecedented Diversity and Evidence of Recent RT Activity."; RL Molecular Biology and Evolution 20(11), 1811-1825 (2003). XX RN [2] RP 1-5347 RA Kojima K.K. and Jurka J.; RT "Outcast clade non-LTR retrotransposons from Anopheles gambiae."; RL Direct Submission to Repbase Update (24-SEP-2010). XX DR [2] (Consensus) XX CC [2] Consensus update. This consensus is generated from 3 CC sequences with >95% identity. 
XX FH Key Location/Qualifiers FT CDS 180..1445 FT /product="Ag-Outcast-2_1p" FT /translation="MGVNDPGGGISLSNYYSGLGVDFQIGSAEEPRRKIKK FT GTQLRKISEVNESEERFLVIEAEKEDKDLSRANPFLVKKTIDVIVGENNAN FT ITRIRGGKLLVRVNSNKNAEKLKRLKKIGVGENETKIVVKEHPTLNTCKGV FT IRCPDIEFLTENEIMEGLTEDKVTEVVILKRKVNDKVVNTRTAVITFNTTK FT VPRKIDFGWYPLKVELYIPKPMRCGTCLKIGHTKKVCRREKVCAKCSENVH FT IEECKDLKCVSCGGDHHTLDKQCPVFVDEMEIQKIKTVNKITYGEARKIRR FT SQCPSIPRIFTNGQTYAQKTKITDNTENRSQTESNKKTYSKKIMIQNTTYT FT ENNKNNENITENKTQENTQKENNTTEETVNTNNTEKPIHISETRISMNTPI FT DVAQLDLEQILIQNENGELIPLQHEKKQ" FT CDS 1570..4845 FT /product="Ag-Outcast-2_2p" FT /note="apurinic-like endonuclease, reverse FT transcriptase and ribonuclease H." FT /translation="MNNLNILQANVQSFRKNKDEIEHIIKMEKINIACIQE FT TWIKGKDNTKISGYNIITENREDGYGGSAIIIKKGMKFKQIKIGNRIKNIQ FT IVAIEVIESSINIISIYMSPKTTIQEITEITQQINSIFNRRTVIMGDFNCH FT HTFWGDKKIDNKGTKLIEEFDYTNFLILKNEKNTCISEDPNKRDTSIDLTI FT ITQDLFTRIEKNIIEQHIGSSKHRIIKACIKEKTKKQQRTQLNKGKVIKNL FT QNLKIDKISTINQITKKIETIIKQNITKVNYEPKSWWNKDIKTALKNRNTA FT RKIYNNTKNIKDAIEYKKNEAVLKRKIKLSKKEFYKNSIEKINKNTTTKEL FT WRLVKKLDGKEESIREENIIHNNEESARELLRKTYGKNEKIDIESINIPEF FT INENKLIEEEEWEKILNKKKKSAPGVDKITYEMLKNIHPGTSKIIRDEINE FT MWEKGKIKKELKRIKTIAIQKPNRDKTKVENYRFISLINTIMKVGNSAALK FT KINKHCTINKILPDRSFGFRKNKSTQMCANYLLNWIQKKKKQKRKIGIIFA FT DLENAYNNVKLKILIEELVKTNMNSQIITWISQFLKNRTIELECNKKIIKK FT LVSDGLPQGDVMSPTLFNIYTRKIHSRINKIPKIRLIQYADDFAIVAADKN FT ITLLNRTLQRGIIEFQRAISELKLEMNIKKTKMMIMGKTKQKVRLKINNIE FT IENVTAYKYLGITIDKQLNFKKHIEDIKNKARDRLYVFKVLCGKNRGLGTD FT KARNIYRATIRSILENASSITKNAKINTRKKISTIANTALRKISGCTKTTP FT INSLHAINAEIPLEIRCKYIANKEMTKAYTVSPIIREQLLNAVRNNTNKKT FT LTAQQQLFKENRSTIQNIAISKINDIEHNVSINTSIKNKIEEKKNRNNQVT FT KQIVLENINEQMKKRNNIFTDGSKQQEKNGIGIYIQKGQENWYHSIKANND FT ACIASIELTAIEIAMKMAEELKMERPTIYTDSLTSCDILKKAKDEEEIEET FT IYNIIEIGNRLKAKIWWIPAHVGIHGNECADRLAKEGTISNITTENKIRLV FT DAYERYKIEMIKKQKSGTRNNVKQRSKIPGISKKT" XX SQ Sequence 5347 BP; 2680 A; 771 C; 899 G; 997 T; 0 other; gcacaaacca attgcatctt gaaggaacca cacgtgccat tttgagcaga 
cagtttacaa 60 ggacagtgaa cccaaaagca caaaagatag aaacacgtgg agtgatattt tgtgaattgg 120 acgaaataaa tcaaagaaat aagtgaaaaa ggacactaag catacgaagt accgaaacga 180 tgggggtcaa cgaccccgga gggggaattt ccctctcaaa ctattatagt ggattaggag 240 tagatttcca aataggatcc gcagaggaac ctagaaggaa aataaaaaaa gggacccaat 300 tgaggaaaat aagtgaagtg aacgaatcgg aagaaagatt tttggtgata gaagcggaaa 360 aagaagacaa agacttaagt cgtgcaaacc cattcttagt gaaaaagaca attgatgtga 420 tagttggaga aaacaatgca aatattacga gaataagagg tggaaaacta ctagtgagag 480 tgaactcaaa caaaaatgcg gaaaaactta aacgtctaaa aaaaataggg gtgggagaaa 540 acgaaacaaa aattgtagtg aaagaacatc ccacactaaa tacgtgcaaa ggtgtaataa 600 ggtgcccgga catcgaattc ctcaccgaaa acgaaatcat ggaaggcctc accgaggaca 660 aagtgacgga agtagtaata ttaaaaagaa aagttaatga taaagtagta aacacacgca 720 cggcagtaat tactttcaat acgaccaagg tgcctagaaa aatagatttc ggttggtatc 780 cgttaaaagt agaactatac ataccgaagc caatgcgatg tggaacatgt ctaaaaatag 840 gacacaccaa aaaagtgtgt agacgagaaa aagtatgcgc caaatgtagt gagaatgtac 900 acattgagga gtgcaaggac ctgaaatgcg tatcttgtgg aggagatcac catacacttg 960 acaaacaatg tccagtgttt gtggatgaga tggagataca aaaaataaaa acagtgaata 1020 aaataacata cggagaggca agaaaaatac gcagatcaca atgcccgagt attccaagaa 1080 tattcactaa cggacaaaca tacgctcaaa aaacaaaaat aacagataac acagaaaata 1140 gaagccaaac agagagtaat aaaaaaacat attcaaaaaa aataatgatt caaaacacaa 1200 catatacaga aaataacaaa aataatgaaa atataacaga aaacaaaaca caagaaaaca 1260 cacaaaaaga aaataacaca acagaagaaa cagtaaacac taacaataca gaaaaaccca 1320 ttcacatcag tgaaactaga atatcaatga acacgcccat agacgtggca caactagact 1380 tagaacaaat tttaatacaa aatgaaaacg gtgaactaat cccactgcaa catgaaaaaa 1440 aacaatagaa aaaagattaa acaaatgtaa aaagagcgaa aaaaaaacaa aatatagtat 1500 aatatccctt ttctccataa aaaaaatata aaaaaaacaa tagtaaaaaa accaaaaata 1560 atcaattaaa tgaacaactt aaatatctta caagcaaacg tacaaagctt taggaaaaac 1620 aaagatgaaa tagaacacat aataaaaatg gagaaaatca acatagcatg catccaagaa 1680 acatggataa aaggaaaaga taacacaaaa atatcaggat ataatattat tacagaaaac 1740 agagaagatg 
gatatggcgg aagcgcaata atcataaaaa agggtatgaa atttaaacaa 1800 ataaaaatag gaaataggat aaaaaacata caaatagtag caatagaggt aatagaaagt 1860 agtataaata taatatcgat atacatgagc ccaaaaacaa ctatacaaga aataacagaa 1920 ataacacaac aaatcaactc aatattcaac agaagaacag ttataatggg agattttaac 1980 tgtcaccata cattttgggg agacaaaaaa atcgataata aaggaactaa attaatagaa 2040 gaatttgatt atacaaactt cttaatttta aaaaatgaaa aaaacacatg catatcagaa 2100 gatccaaaca agagagacac ttcaattgat ttgaccataa tcactcaaga tttatttact 2160 aggatagaaa aaaatataat agaacaacat ataggaagca gcaaacacag gataataaaa 2220 gcatgcataa aagaaaaaac gaagaaacaa caaagaacac aactaaataa gggaaaagta 2280 attaaaaact tgcaaaacct taaaatagat aagatatcaa caattaacca aataacaaaa 2340 aaaatagaaa caataattaa acaaaatata actaaggtta attatgagcc taagtcgtgg 2400 tggaataaag atattaaaac ggcactaaaa aacagaaata cagcaagaaa aatatacaac 2460 aatacaaaaa atataaaaga tgccatagag tacaaaaaaa atgaagcagt actaaaaaga 2520 aaaataaaat tgagcaaaaa agaattctac aaaaactcaa tagaaaaaat aaataaaaac 2580 accacgacta aggaactatg gaggctagtt aaaaagttag atggtaaaga agaaagtata 2640 agagaagaaa atataatcca caacaacgaa gaatcggcga gagaacttct cagaaaaaca 2700 tatggaaaaa atgaaaaaat agatattgaa tcaataaaca tacctgaatt cataaacgaa 2760 aacaagctga tagaggaaga agagtgggaa aaaattctaa acaaaaagaa aaaatcagct 2820 cccggagtag acaaaataac atacgaaatg ctaaaaaaca ttcacccagg aacatctaaa 2880 attataaggg acgaaataaa cgagatgtgg gaaaaaggga aaataaaaaa agagctaaaa 2940 agaataaaaa ccatagccat acaaaaacca aacagggaca aaacaaaggt agaaaattat 3000 agattcatat cattaatcaa tacaatcatg aaggtaggca attcggcagc cctaaaaaaa 3060 ataaataaac actgtacaat aaacaaaata ctaccagaca gatcttttgg gttcagaaaa 3120 aataaatcaa cacaaatgtg cgctaactat ctactaaact ggatacaaaa aaagaaaaaa 3180 caaaagagaa aaataggtat tatctttgca gatctcgaaa acgcatataa taacgtaaaa 3240 ctaaaaatac tgatagaaga actcgtcaag acaaacatga atagccaaat aataacctgg 3300 attagtcaat ttttgaaaaa taggaccata gaattagaat gtaataaaaa aataatcaaa 3360 aaactagtca gtgatggact gccacagggt gacgtgatgt ccccaacatt atttaatatt 3420 tatacaagaa aaatacatag 
taggataaac aaaataccaa agattagact aatacagtac 3480 gcggatgatt ttgctatagt ggctgcagat aaaaatataa cactattaaa tagaacactc 3540 caaagaggta taatagaatt ccaaagggca ataagcgaac taaaattaga aatgaacata 3600 aaaaaaacaa aaatgatgat aatgggcaaa acaaaacaga aagtcagatt aaaaataaat 3660 aacatagaaa tagaaaacgt aacggcatac aaatatctgg gaataacaat agacaaacaa 3720 cttaacttca aaaaacatat tgaagatata aaaaataagg caagagacag attatatgtc 3780 ttcaaagtgc tgtgtggaaa aaatcgagga ttaggaacag ataaagcaag gaatatatat 3840 agagctacaa ttcgcagtat actcgagaac gcatcgtcga ttacaaaaaa tgcaaaaatc 3900 aacaccagaa aaaaaataag tactatagcc aatacggcac taaggaaaat atcaggatgt 3960 accaaaacaa ccccgattaa ttcgctgcat gccataaatg cagaaatccc actagagata 4020 aggtgtaaat atatagcaaa caaagaaatg accaaagctt acacagttag ccccataata 4080 agagaacaac tgttgaatgc agtacgaaac aacacaaaca agaaaacatt aacagcacaa 4140 caacaattat ttaaagagaa cagaagtacc atacaaaaca tagccataag taaaataaac 4200 gacatagaac ataacgtaag cataaatacc agcataaaaa acaagataga agagaagaaa 4260 aataggaata accaagtaac aaaacaaata gtattagaga acataaatga acaaatgaaa 4320 aaaagaaaca acatatttac agatgggagt aaacaacaag aaaaaaacgg aataggaata 4380 tacatacaga aaggacaaga aaactggtat catagcatta aagcaaataa cgatgcttgc 4440 atagcaagca tagaactgac agcaattgaa atagctatga aaatggcaga agaactaaag 4500 atggagagac caacaatata cacggatagt ttaacaagct gtgatatcct gaaaaaagcc 4560 aaagatgagg aagaaattga agaaacgata tacaatataa tagaaatagg aaatagatta 4620 aaggcaaaaa tatggtggat acccgcacat gtaggaatac atggaaacga atgtgcggat 4680 agattagcca aagaaggaac aatctcaaat ataacaacag aaaacaaaat aagattggta 4740 gatgcgtacg aaagatacaa aatagaaatg ataaagaaac aaaaatctgg tacgaggaac 4800 aatgtcaaac aaaggagcaa aattccaggc atttcaaaaa aaacttgaaa tgatccatgg 4860 gaaaaagata ttgacataaa cccggaagaa gcgagaacaa taaacaaatt gttaacagga 4920 cacgacctgt caccattttg gctggccaaa atgaaattag cagaatcagg attgtgccaa 4980 atatgccaaa cacaaaatac agggcggcat atggttttta actgcaaaaa atacgaaaca 5040 aatagaggag acataacatt tgatagatta aaagaaagat ggaaacaaaa atcaggatat 5100 gaaattaaaa aaattaccaa attcatgaag 
gaaaatcaaa tagaaattta aaaaatacaa 5160 aaaaaaatag catagaaaga caaaaatgaa gtcaagaaaa aaaaaacaga ataacaaaaa 5220 aaaaaacaac caaatgacaa ctgtcaggtt aaaatataga gtgggtaaaa ccttgaatac 5280 atttgaaacg tcaaatgtag gggaatacat cctcactcac cgaagaagaa gaagaagaag 5340 aagaaga 5347 // ID BEL19-I_AG repbase; DNA; ANG; 7794 BP. XX AC . XX DT 16-JUN-2003 (Rel. 8.05, Created) DT 16-JUN-2003 (Rel. 8.05, Last updated, Version 1) XX DE BEL19-I_AG is an internal portion of the BEL19_AG LTR DE retrotransposon - a consensus sequence. XX KW BEL; LTR Retrotransposon; Transposable Element; 5-bp TSD; KW BEL19-I_AG; BEL19-LTR_AG; BEL19_AG; Bel clade; integrase; KW protease; reverse transcriptase. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-7794 RA Kapitonov V.V. and Jurka J.; RT "BEL19_AG, a family of Bel/Pao-like LTR retrotransposons from RT African malaria mosquito."; RL Repbase Reports 3(5), 85-85 (2003). XX DR [1] (Consensus) XX CC BEL19_AG is a young family of Bel/Pao-like LTR retrotransposons. CC BEL19-I_AG, an internal portion of BEL19_AG is flanked by CC BEL19-LTR_AG CC LTRs. The A. gambiae genome harbors several copies of BEL19; CC they are less than 5% divergent from each other. 
XX SQ Sequence 7794 BP; 2457 A; 1768 C; 1927 G; 1642 T; 0 other; tttggtggct ccagagagga taataattca ttagtatttt tatttacccc gtaagaggga 60 cccttcattg ggacaacctg gccaaacaat attagtatat gtgttgtatt gtccaaaccc 120 ccttccccgc cccctcgatt ccccaccccc tcaaccgagt cgggccagcc aggagccccg 180 ctgccaggga aggactcaac cgagtcgggc cagccaggag ccccgcagcc agggaaggac 240 tcaaccgagt cgggccagcc aggagccccg cgccagggaa ggactcaacc gagtcgggcc 300 agccagtagc cccgctgcca gggaaggact caaccgagtc gggccagcca ggagccccgc 360 tgccagggaa ggactcaacc gagtcggacc agccaggagc cccgcgccag ggaaggactc 420 aaccgagtcg ggccagccag gagccccgcg ccagggaagg actcaaccga gtcgggcctg 480 ccaggagccc cgctgccagg gaaggactct accgagtcgg gccagccagg agccccgcgc 540 cagggaagga ctcaaccgag tcgggcctgc caggagcccc gcgccaggga aggactcaac 600 cgagtcgggc cagccaggag ccccgcgcca gggaaggact caaccgagtc gggcctgcca 660 ggagccccgc tgccagggaa ggactcaacc gagtcgggcc agccaggagc cccgctgcca 720 gggaaggact caaccgagtc ggaccagcca ggagccccgc gccagggaag gactcaaccg 780 agtcgggcct gccaggagcc ccgctgccag ggaaggactc aaccgagtcg ggccagccag 840 gagccccgcg ccagggaagg actcaaccga gtcgggccag ccaggagccc cgcgccaggg 900 aaggactcaa ccgagtcggg ccagccagta gccccgctgc cagggaagga ctcaaccgag 960 tcgggccagc caggagcccc gctgccaggg aaggactcaa ccgagtcggg ccagccagga 1020 gccccgcgcc agggaaggac tcaaccgagt cgggcctgcc aggagccccg ctgccaggga 1080 aggactcaac cgagtcgggc ctgccaggag ccccgctgcc agggaaggac tcaaccgagt 1140 cggaccagcc aggagccccg cgccagggat agactcaacc gagtcgggcc tgccaggagc 1200 cccgctgcca gggaaggact caaccgagtc gggccagcca ggagccccgc tgccagggaa 1260 ggactcaacc gagtcgggcc agccaggagc cccgcgccag ggaaggactc aaccgagtcg 1320 ggccagccag gagccccgcg ccagggaagg actcaaccga gtcgggccag ccaggagccc 1380 cgctgccagg gaaggactca accgagtcgg accagccagg cgccccgcgc cagggaagag 1440 ctcgtgggag ctaggccgct gccagggagg tagccgccca ccgccccgcg ccagggaaga 1500 gctcgtggga gctaggccgc tgccaggaag gtagccgccc accgccccgc accagggaag 1560 agctcgtggg agctaggccg ctgccaggga ggtagccgcc caccgcccct cgccagggaa 1620 gagctcgtgg gagttaggcc gctgccaggg 
aggtagccgc ccaccgcccc gcgccaggga 1680 agagctcgtg ggagctaggc cgctgccagg aaggtagccg cccaccgccc cgcaccaggg 1740 aagagctcgt gggagctagg ccgctgccag ggaggtagcc gcccaccgcc tcgcgccagg 1800 gaagagctct tgggagctag gccgctgcca gggaggtagc cgcccaccgc cccgcgccag 1860 ggaagagctc gtgggagcta ggccgctgcc agggaggtag ccgcccaccg ccccgcgcca 1920 gggaagagct cgtgggagct aggccgctgc caggaaggta gccgcccacc gccccgcacg 1980 aggaaagagc tcgtgggagc taggtcgctg ccagggaggt agccgcccac cgccccgcac 2040 cagggaagag ctcgtgggag ctaggccgct gccaggaagg tagccgccca ccgccccgca 2100 ccaacagcaa agctgagcgc cgtagaaggt cggtaagcga tttactattt ttactataat 2160 taagagtgaa gtgaagtgaa acacaaaaaa aatggcgaca tcaaaggcga gggcaatcaa 2220 ggaagcggaa gtgaatctcg tagaggtgat agttggtcta aaacaggaat tagcggaaac 2280 caaagaaaag tgggaaaagg aatcaaagga attaaaaaaa gtaattgaaa gtagggcaaa 2340 agaagatgaa gcgtcaatcc tcttaaagga gcaacgccag gctttcgatt cgatgatgat 2400 cgaacaacga caaatttttg atgaatggaa gaaacagact gaaaatccgc aagtaacgac 2460 cactccaatt aactctacca gtaatttaga aagcatagta cattcattag cggatgcact 2520 gaaaacacgc actaacactg ctatcctcga tcttcccgaa tttgatgggg actataaaat 2580 gtggcctagg ttcaaagcta ttttcgataa aactaaccaa gaaggcaagt taagtactac 2640 ggagcaatta gcccgtctat caaagagcct taagggaaat gccgctcaga gtgtaagccg 2700 tctgatgatt gacccggcta atgtatcaaa aataatagac cggttggaag aagaatatgg 2760 gaatgcaaaa atagtttaca aagcactttt agcagacctc atgcaaaatg aaaatccatc 2820 ccttagtaaa ccaaagacat ttttaagctt catcaggtca ttagacgacc ttgttacaaa 2880 catgacagta ttaaaaaagg aagaatatct tacggatcca aggctattag acgatttggt 2940 cgataagctt ccagaagacc taaagcgtga gtggttaata accttgattc gagagaaaga 3000 ggaaggagga aatagtcgca ttaaaacact aaacgatttt ttgaagtggc ttaaacctac 3060 agagaagctt gccatcctat caaatgtaaa tgaaggccgg gaggaggcta tatgtgataa 3120 aaacctaagt tcaccaatag aaccgaaccc aaatagaaat accagaacct gttgccatgt 3180 ttgtcaggga gaccatcgca ttgtcgattg aagaagattt cggggtatgc gattaaatga 3240 aaggtgggac gccgttagaa gatttgggtt atgtacaaac tgttgtaaca acagaaatca 3300 taatgctaat aattgtcgat taccacccca gtgtcgtgaa 
gccaattgtc atatgaaaca 3360 tcatccatta ttacattaca ctaacaagag tcaaaataat ttaaacacat tgaaccatca 3420 cagtaccacc gaaggaatat tttaccaaat aataccggtc tctgtcattc acgaagacaa 3480 agaaattgat acatttgcct ttatagatac aggttcttcc gcaagtcttt tgttaacaga 3540 tattaaagaa agtttgggtg ttcaggggat aagaaagcca ttggctctga catggaccaa 3600 tggcgaaatg caggaggaag ctaccagcga attagtcaat ttgcaaataa ggggtatgag 3660 tggcctaata caaccattga aaggattgag gacaattagg gaaatgaatt taccttcaca 3720 aactttaaac gcgaactcgt tgaaaacccg atacaaacat tttgatggaa ttgatcttcc 3780 aaactacata gatggagcac caaccatttt gttaggctta ccacacgctc accttatttg 3840 tgggtcggaa aatcgcgttg gtggtccaga tgaacctatt gcgatcaaaa cctcattagg 3900 ctggtcagtt ctaggtaagg gatcgaaaag ggaaaacaaa ggtagcttgt ttgtgctaaa 3960 tgaaaaatgt gataaagaag aaaaaggaat ggaggaaatt atgaaggaat tttttacaac 4020 cgaaggtttt ggtgttcaac ctagcccaaa tataataacc ccaaaacatg aagaaagagc 4080 attgaacttg atgaaaaata ctctaagacc actagaaacg ggctatgaaa taggactttt 4140 atggaaagag gatgatataa acctcccaga aagctatacc caagcgctta gacggttgca 4200 gggttttaaa gaagacttga aaaagatgta aatcttaaaa actggtacta caacgaaatt 4260 tcgttatact gccaaaaggg atacgctaag gaagttaaca cggaaataca aatacacaac 4320 agaaatttaa actatatacc acatttcgca gtagtgaact taaacaaatt aaacccaaag 4380 ccaaggttgg tctttgacgc cgcggctcgg aacatgggga tctctctaaa ttctcagttg 4440 ctgacaggac ctgatgcagt ccctccactc attggtatcc tgttaagatt tagagaaggc 4500 agcattggag tctctggcga tattcaagaa atgtttcgcc gcattagaat tcgagcagat 4560 gacagatgtg cacagcgctt tctttatagg gaatccccaa attattcacc aaagatcatg 4620 gaaatgagag taatgatctt tggcgcttct tgttcccctg cgtgtgcaca atacgttaaa 4680 aactacaacg ctggattatt tcggcagcgg taccctgagg cagttaaagc tattgaaaat 4740 aatcattatg tggacgacta tctagatagt ttcacgtcaa tagaaacagc aacccaaaga 4800 gtaaatgaag tgatacttat tcataatcaa gcagattttt ttattcgaaa tttcatatcc 4860 aactctagtc aattaaccaa ccaaattcct tgtgaacgta tttcaaccca acctgttttg 4920 aagataagcg aaggcacctc aaattttgat aaaattctag ggcagtactg ggaaaaagaa 4980 aaggatgtat tcaagtttat tttaaatgat ctgaccattc ctggccattc 
agtcagcaaa 5040 cgagaaatgc ttgcaaaaac aatgaaaata tatgatccaa tgggactcct taccaattat 5100 actatagagc ccaaaatact aatacaagaa gtatggaagt tgaagatgga ttgggatcaa 5160 aatattcctc ctgatctaaa tagaaaatgg caagactggt tgcacaggtt gaaggaatta 5220 gaaacatggg aacttcagag atgttattca caagcaggag agccaattac aagagagctt 5280 cacacatttg tcgatgcgtc tgagcaagca tttgctgctg tcacatatat tcgcacagtc 5340 cataaggaag gagtagacat gcgattggtt gcaggaaaat caagggtagc gccacttaaa 5400 attttaacaa tacctcgact tgagttgcag gcagcggtaa tgggggcgag actagccgat 5460 actatccgat cggagttaag actcaacatt gataatatgc atttttggtc cgactcaaga 5520 accgtgctaa gctggatatg ttcggaacct aggaggtaca agcagtttgt agcctttaga 5580 gttagcgaaa tcctctgcaa aactcacgcg agtgaatggc attggattgc gtccaaagac 5640 aatcctgcag atgaagcaac aaaaagaata aggggggatt ccatatggat atctggtcct 5700 ttatttttac gacaaaggaa tttcacaacc accaattacg attctactac tctagtagat 5760 atcagaccga tgtttttaat acactctact gaccttatag attttacagt cataaattca 5820 cattggacta gaagttgggt taaattgaaa cgagcattag ccattgggtt attatacatg 5880 gagtggttga aatcgaaggc tcagaaatat aaaatatcgt tggaaataac taaaaaacaa 5940 ttagataggg cagagaaatt gctcataaaa aggctcaaat ggaaacgttt aatgaagaca 6000 taatgttatt aaagattgga aaggaaatat caatgaaaag ccccataaga aatttaaatc 6060 ctgtgctaga tacagatggt ctgttgcgaa tgaaaggaag gctaactggg ttatcatttc 6120 tcgaagaaaa cgcaaaaaat ctcataatat taccaaagaa acatgaaatc acccatgtaa 6180 ttgtacggca ctaccatgaa cgattatacc acaagaagtt cgaaacagca ctggccgcct 6240 tgaggcaacg tttctgggta atagatagca gagcagtact taaaagagta agagcgacct 6300 gtcagagttg caaaaataac ttagctaaac ctcaacttcc ccaaatggcc gttttacccg 6360 cttgcagggc agccgtattt tgcaaacctt ttacgcatac gggcgtcgac tatttcggtc 6420 ctctaaccgt gactataggg aggcgatctg aaaaaagatg gggtgcgctg ttcacgtgtc 6480 tctcaacacg agcagtacat ctagaattgg ccacagacct tagttctgct tcgtttataa 6540 catgcctgag aaaaatgcaa caccgaagag ggaaaatcac gcacatgtat agcgataatg 6600 gaacaaattt cataggtgct gaacgcgaaa tgaaaaggtt gcgagaaagg tgtgcaaacg 6660 atggcataga atggcatttt aaccccccag ctgcacctca cttcggtggg gcatgggaga 
6720 gaatggtagg cgaagtaaaa agtttactcc ctctcaagca agagtcttat cgtgaagaag 6780 cgttgcgcgc gattctggtg gaaattgaat tcattattaa cagcagacca ttaacacata 6840 taccattaga gcatgaagat gatgaaccgt tgacacctaa tcatttccta ttgggttcat 6900 cgggagaagc tgtccctaca cttcgagaag ctacatttgc tgaagctacc agaagcagct 6960 ggaaaagggt tcaattagta gcgcaacatt actggacacg gtggattaga gaatatttgc 7020 cacaattaaa caaacgcgag aaatggttga aaaaggtgtc tccaattgaa gttggcgata 7080 tagtggtttt cccaaatgaa cagattaacg gtaaatggca aaagggtcga gtatcaaagg 7140 tctataaagc aaatgatgga caagttaggt cggtcaccat cacatccggg ttatcaacag 7200 tcaatcgtcc ggtttcaaag gtcgcaaaat tagatgtact tccaaaatcc attatagctg 7260 aggcgcagca aaaagtgcaa tagtccattc caaaaaaaaa aacaaacaaa aaaaaaacaa 7320 gtaataatta taattataat actaagatta aaaaaaaaaa acaattagta gcaacaatag 7380 taatcagaac atagctttgt tcttgcagcc acattcgcgt taaatattag gtgcatcttc 7440 gtgtggaagt acatgcaata ataataatag gatataagtg aattaattat tggcgagttt 7500 atttatgttt ttaatcttgt gatgttcaat ttaacttatt attttatcat gcttatttat 7560 tcaactttta atgtgtatcg ttaagtaatt attctataat ttaagtattt ggtcaattta 7620 atcatagagc gtcaatttaa tgtctcaatc ttccagtaaa tcagagtaaa tgaactgagc 7680 ctgagtgttc agcagtcgtt agtctgagaa tattcgatag ccatttatca ataacggcaa 7740 ccgccgtcgg ctagtttatt gtaacacctc ctagaggaat tacgggaggg ggaa 7794 // ID GYPSY36-LTR_AG repbase; DNA; ANG; 294 BP. XX AC . XX DT 16-APR-2004 (Rel. 9.03, Created) DT 16-APR-2004 (Rel. 9.03, Last updated, Version 1) XX DE GYPSY36-LTR_AG is an LTR of retrotransposon GYPSY36_AG - a DE consensus sequence. XX KW Gypsy; LTR Retrotransposon; Transposable Element; KW 4-bp TSD GYPSY36_AG; GYPSY36-I_AG; GYPSY36-LTR_AG; Gypsy clade; KW MDG3 lineage. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-294 RA Tubio M.J., Costas C.J. 
and Naveira F.H.; RT "GYPSY36_AG, a member of the MDG3 lineage of the Ty3/gypsy group RT of LTR retrotransposons in Anopheles gambiae."; RL Repbase Reports 4(3), 65-65 (2004). XX DR [1] (Consensus) XX CC GYPSY36-LTR is a long terminal repeat of GYPSY36_AG (its internal CC portion is deposited as GYPSY36-I_AG). XX SQ Sequence 294 BP; 68 A; 70 C; 65 G; 91 T; 0 other; tgtagaattg tatgtaaagt tgaaagctag tattgcctga tgcaatcggg ataatattat 60 aaagaagaga acttgggctg agcagtcagt cgttgatcag aactcaataa agaaacaccg 120 tcaaaagaga aaaagtctgt gtcaacgttc tccctcattc ggtccacctc ctccaccccc 180 tcggttggtt ttgcggttgt cgtgatccgg ttgtgtcgag tattgttctc cagtgttgga 240 gtattcggcg ccttccctca ttcctttccg ctctcgtgtc tcccctttct taca 294 // ID HARBINGER1_AG repbase; DNA; ANG; 5377 BP. XX AC . XX DT 12-MAR-2003 (Rel. 8.02, Created) DT 21-JUL-2005 (Rel. 10.08, Last updated, Version 2) XX DE HARBINGER1_AG is an autonomous DNA transposon - a consensus DE sequence. XX KW Harbinger; DNA transposon; Transposable Element; HARBINGER1_AG; KW Harbinger superfamily; MADF/SANT; transposase. XX NM HARBINGER1_AG. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-5377 RA Kapitonov V.V. and Jurka J.; RT "HARBINGER1_AG, a family of autonomous Harbinger-like DNA RT transposons from African malaria mosquito."; RL Repbase Reports 3(2), 17-17 (2003). XX DR [1] (Consensus) XX CC There are ~20 copies of HARBINGER_AG in the genome. CC They are ~98% identical to the consensus sequence. CC HARBINGER1_AG copies are flanked by the TWA 3-bp target CC site duplications. This element has imperfect 40-bp terminal CC inverted repeats (5 mismatches) and subterminal TIRs (positions CC 54-106 and 5269-5321). CC The HARBINGER1_AG consensus sequence encodes two proteins: CC a 471-aa HARBINGER1_AG1p transposase and a 245-aa HARBINGER1_AG2p CC MADF/SANT-like DNA-binding protein. 
The MADF/SANT domain is CC present CC in various transcriptional regulators and is involved in the CC chromatin remodeling. Putatively, HARBINGER1_AG2p CC was recruited by the transposon from the host genome. CC HARBINGER1_AG1p is encoded by 4 exons. CC HARBINGER1_AG2p is encoded by 4 exons in -strand. XX FH Key Location/Qualifiers FT CDS join(452..811,875..952,1393..1596,2345..3124) FT /product="HARBINGER1_AG1p" FT /note="transposase" FT /translation="MSEPELENDNNEVRDRSRSRSRSPQARRLWVQDLFLN FT RNETGNRLLTDITTSGIYETMNRFLRMKKEDFFHLLSLVGPKIAKMDTDFR FT KAITEQERLLITLRYLATGETFTSLQYVFRVSRHSISRIVKETCACLIEAL FT RDYVKSQPESIPIRGCSGCARSELESNPTRGCNEFGSTRLCFLAPTDLSGR FT PELVATEPTVARPDPNSQHQRLPSTEEEWLAISRRFEQRWRFPHAIGAIDG FT KHVEIICPRNSGSEYHNYQKFFSIVLMVVVDADYNFLWADAGGKGGISDGG FT IFKNTRLYHKLENDQLNIPPATPLQVPYQTPVPYFILGDKAFAFTNYCLRP FT YSGVHPPDSMERTFNKMHSTCRMPVENSLGILANRWRVLKGIQLQPDVAKN FT IVLTTVYLHNFLRKHASRDTYTPPSAFDRVVRGRRVDGDWRSEGGLTDLQN FT IASRPSENLADIRNHIANHLKHNRST" FT CDS join(5133..5095,4587..4357,4285..4196,2229..1855) FT /product="HARBINGER1_AG2p" FT /note="DNA-binding protein" FT /translation="MEQAGSSRRNEKIDHEKSLRFIAEVQKHRVLWEKKNK FT NYKNVVLKGDAWAAIAAKEEVSPQDAKHLWSRLLGIYRTNKAKVKKTTQTG FT AGNDDVFRPRWFAYQAMSFVDEATQDAVHVDTLGYDDEGLTPRSLVEAAYA FT VPQSLDDIDWDAAVAFDPEPFSPTPSSYPSALSVTPGTIREGNGVSGALNN FT RLQQPVSGATGAPETADPDDFGRYVADELRLIPSPKRRRIMSIQRELLETL FT KKNL" XX SQ Sequence 5377 BP; 1580 A; 1209 C; 1120 G; 1466 T; 2 other; aggccgggct acattgatcg tactcgcaag cgtaattttt atattcacta gcgcatctgg 60 cggcgaccag cgcaagctag atgtgggtat aatttaagtg ccaaaattat tcatgcttgg 120 tgtctcatca attttcgacc aggctgtctg tatgccgttt gacatgttga tcaaaatcaa 180 taaattgcta caggtcagga cgccgagtac tgctgggcaa atcgtcattt gacagttggg 240 tgcctaagct tcatacaaag cgcacgacaa cattccttcc catctgccaa actccatcca 300 tcatggcttc cacagccaaa tcttccacat ataattgtag tctatgttat caccttgatc 360 gcagcagtgt gcattgttta tcgtttgaat cgtttgccgg ccggtgttta gtagaagtga 420 aaattttggt ggttctcgat tttttggaaa aatgagcgag ccagagcttg aaaatgacaa 480 
caacgaagtc cgggatcgtt cccgatctcg ttcccgtagt ccgcaggcac gtcgtttgtg 540 ggttcaggac ttattcctga accgaaacga aaccggcaac cggctactga ccgacatcac 600 gacatcgggt atatacgaga cgatgaaccg atttttaagg atgaagaagg aagatttctt 660 ccatttactg tcccttgttg gtccaaaaat tgcgaaaatg gacacagatt tccgcaaagc 720 aatcacggaa caggaaaggc tgctgataac attgcggtat cttgcgactg gagagacatt 780 taccagcctc caatacgtgt tccgggtaag taatattatt tttgcttttt gctttttgct 840 aagcacactt ttatcaatca acatataatt ttaggtgtcg aggcattcta tcagcagaat 900 agttaaagag acgtgcgcat gtcttatcga ggctttgcgg gattatgtca aggtaagttg 960 cgggtgtaac cagcctcgct tcagtgttta tgtgtttaaa acgttcagcc cggtggcagg 1020 caccagggcc tgccagtaaa ctttcccgtt tgccggggcc aaccattgct ctggggactc 1080 tttggctaga gaaacagaat acattgcgaa caactgtact ctaatcgcat gaatacaacg 1140 gtgtgaaggt aaaaaaatga ttttctaaat attattgcga ttgataaaga catcaaccca 1200 cacacaacag tattgatgaa gagagtagag ttgaaattat agcgaaaaac ctcatttttt 1260 ctacatgcat ctttggatca aaattcactg cttgggtagt agatgccggc ggatagttca 1320 tgaaaaacgg agccggaatg aacgtgtgaa atgtgctgca ctacgttttc ctcttsaacg 1380 tcctcgtata aaaagagtca gccggaatcg attccgatcc gtgggtgcag cgggtgcgct 1440 aggtcggagt tggaatcaaa tccgacccgc gggtgcaacg agttcgggtc gactcgctta 1500 tgctttctcg cacctactga tctttcgggt cgacccgaac tcgttgcaac cgaacctacc 1560 gtggcccgac ccgacccgaa ctcgcagcac caacgggtcg gagtcgttta tgctttctcg 1620 cacctaccgg agtatttgca cccactgaaa ctacttatac cttttgggtt gacccgaacg 1680 actccgggtc gctttcggat ggctccgacc cgataggttc gggtcgaccc gcccattact 1740 aaaaggttca tgtaatatga aaacggaaac tttttttttc ctcgaacgta gctcttcaat 1800 caagtaacta atgtgatata caaaactcaa aaagtattca aagttatgtt tttaaagatt 1860 tttttttaat gtttctaata actcccgctg aatactcata atgcgtctcc tttttggaga 1920 cggaataagc cgaagctcat cggccacgta ccggccgaaa tcgtccggat ctgcggtttc 1980 cggtgcgcct gttgcgcccg aaaccggttg ttgcagccgg ttgttgaggg ccccactcac 2040 accgtttccc tcgcggatcg tgccaggagt gacggataag gcagaagggt aggaagaagg 2100 agtgggagag aatggctcgg ggtcgaaggc aaccgcggca tcccaatcga tgtcgtccaa 2160 cgattgtgga 
accgcgtagg ctgcctccac gagggaacgg ggagtgaggc cctcatcgtc 2220 gtagccaagc tgaaaaaaaa agaaaatgaa cattaaaatc ataatttatg tttctcataa 2280 gctctaaaat atacagtaac taaaacatgc attgaaatca ttcctctatt ttattttctt 2340 ttagctaccc tctaccgaag aagaatggct tgcaatctca agacgatttg agcagcgctg 2400 gagatttcct cacgcaatag gtgcaatcga tgggaagcac gttgaaatta tttgccctcg 2460 taatagcgga tccgaatatc acaactatca aaaatttttt agtattgtat taatggttgt 2520 ggtcgatgct gattataact ttttatgggc agatgctggt ggtaagggag gaatatcgga 2580 cggtggaata tttaaaaaca cacggctgta tcacaagcta gaaaacgacc aactaaacat 2640 tccaccagca acgccattgc aggtcccgta ccaaacccca gtcccatact ttattctcgg 2700 tgacaaggca tttgccttta ccaattactg cttaagaccg tacagcgggg tgcatcctcc 2760 tgattcaatg gagcgtacat tcaacaaaat gcactctact tgtcgtatgc cagttgaaaa 2820 ttcgcttgga atattagcga atcgatggag agtgctcaaa ggcatacaac tgcagccgga 2880 tgttgccaaa aacattgttt tgacaacagt ttacttgcac aattttttgc gcaagcatgc 2940 ttcgcgggac acatacacac ccccgtctgc atttgatagg gttgttcgcg ggcgacgagt 3000 cgatggagat tggagaagtg aagggggctt gaccgatctc caaaacattg cttcccgacc 3060 ttcagaaaat cttgctgaca taaggaacca cattgcaaac catttaaaac ataatcgttc 3120 tacgtaaatc catccaacca ataccatgga tattaaaatt aataataaat tatgaaatcg 3180 caaacacaac tacaccgact attctgtaca ttactaccag catggttatt tataatatac 3240 atttaagtac gcgagtgcaa caactacgga tttttttaag cattactcgc ctgctctatt 3300 agcggctgga taaactggct ggcttactag ctcactcacc gaataaacgt ggccacatgc 3360 ctgttgtgtt tggttgctgc tggtggatcg agtggacgaa cgtggcgata atcatacgca 3420 ygtaactatg tacagtaaaa tctttctaaa atcaattctc ttcaagattg atcatggaga 3480 aacttaacta tagctacgta cacgatagcg actcgcccaa acacattatc ggacgaattg 3540 gccatcgacc cgctgaaata gcggagtatc acggacgatc gtaacaagca gggtacgctc 3600 gtctatcaac tacgcacctg cttccagcgc aacgatcgat tgccaattgt tattgccgat 3660 tgcaattgca caacgagcga agaaagaatc catcgtcatg tcagcactat gattccggac 3720 ggctatacat attggtctcg tgctgttgac gtggtgcgct ggtcagcatt tcctgcttgt 3780 cgtgatcgtc cgtatatgtg tacattttaa ggatcatagg aacacttcgt gtgtgaaatg 3840 attcatttgt attaaaactt 
tgacgaacga agcacatact actgtggtgt tatggggaaa 3900 accaccatga tatactttgt tcgtcaaaat gttaattata taagtgcaaa agagagcttt 3960 tttatgtata taacaaacga gcttcgcatt aagttagtta aatagaatgt gagtaaggtc 4020 aactagttcg tcggagaaca tctatcaatc atggataccc aaatgtgttg atgctcaata 4080 ctaggcattt atgaaaaaaa tgatttgaaa gctgacatta gcttccacgc caaagattaa 4140 acaaactagt taacccacac cactagttat ttaaaataag tataaacata ctcaccgtgt 4200 ctacatgcac cgcatcttgg gtggcctcgt ctacaaaaga catcgcttgg taggcgaacc 4260 accgtggccg gaacacgtcg tcattgccta aaataaaaca ataataaatt gggatgagaa 4320 ataataaatt aactgtatat atattttcct acatgtacct gcaccggttt gcgtcgtctt 4380 cttgaccttg gccttgttgg tccgataaat gccaaggagt cgggaccaca gatgtttcgc 4440 atcttgtggc gaaacctcct ccttggccgc tatcgcagcc cacgcgtcgc ccttcaaaac 4500 tacattttta tagtttttat tttttttctc ccacaaaaca cggtgctttt gcacctcggc 4560 aatgaagcgg aggctttttt catgatcctg gaatgaaaaa gaaaattcag ttaaatactt 4620 ccctttttta atttaaatta aaacaacaga aaggtcgaca atcgcgtgat aattcggtca 4680 tgctagccgt cacaccggaa ccacagtgat aataataaca cgtgctattt ttatattaaa 4740 tgtatacaaa taggccggta aattttgctt acaattttat aaaaaaaagt tctaccaagc 4800 aatatgacaa agccgttcct catgaataaa aaaacacgct ctatactaaa aaccgctaca 4860 ataccaaaac gaacaaccag tttcacacac tctgcaatgt agtttaaatg atgaaataga 4920 tggacaaacc ctacaacaaa ttatttaagc acttgtgaaa cctacaaaca ctacattttt 4980 caataaaatg ttcagcacaa ttgcggcccc gagccgagcc ggcgattcac agacaacagt 5040 tcacacacac acacacacac acacacaatt ttcgttaaaa acagttgtac ttactatttt 5100 ttcattgcgt cgactgcttc ctgcttgttc catggctaga aatacaatat gtatgttaaa 5160 ttaccacata ttttcaaaaa aacatacacg acaaaatatc gaacatactc accgagctgt 5220 ttgtttactg gtttgtgatc aaaattatag gctcctcgac ccaaaacgct ttttgacaga 5280 taaattatac cagcctctag cttcgacccg ccgccgctag atggcgtacg cgttcacaaa 5340 aattacactc aggagtacga tcaatgtagc ccggcct 5377 // ID DongAG repbase; DNA; ANG; 3848 BP. XX AC AB097127; XX DT 04-JUN-2009 (Rel. 14.06, Created) DT 04-JUN-2009 (Rel. 
14.06, Last updated, Version 1) XX DE Anopheles gambiae non-LTR retrotransposon DongAg - a partial DE sequence. XX KW R4; Non-LTR Retrotransposon; Transposable Element; DongAG. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-3848 RA Kojima K.K. and Fujiwara H.; RT "Cross-genome screening of novel sequence-specific non-LTR RT retrotransposons: various multicopy RNA genes and microsatellites RT are selected as targets."; RL Mol Biol Evol 21(2), 207-217 (2004). XX DR EMBL/GenBank/DDBJ; AB097127; Positions 1 3848. XX FH Key Location/Qualifiers FT CDS 18..3644 FT /product="DongAG_1p" FT /note="reverse transcriptase and restriction FT enzyme-like endonuclease domains." FT /translation="METRSMRKRTTRLPEEGAPTGAGPGTGDRASIQRLED FT EMVQERSFSQRALPVPRTQNRNGSPINHQGNAASANVAVADRQQSLILAGG FT RRQRIMWTREMNHYVIRCYYVYTRMETDMPGRVKMLGMFNDRFPRFAHQLD FT LSKLYIRQRAIILPEELEFIKLEVRREFGEEEAGWRESSRISARLNTIDQN FT TSRASEDRDLDEPTAPGLSVDIQHQMATAVTQFHGTDPLSRHRLPKLHYSY FT RLKTAVSIINQDVLPQYLDSVGSIEDLQLIVYSAAVAVVRTLWLRTYPQGD FT SEGRPCSKAEKPAWMRRLENRINATRTKIGRMQEYQRGNSSMKVVRQIAEM FT VKPKELRDLTDANITEVLDIHLQRLSALAKRLRRYAECSKRKEQNRMFNIN FT EREFYNWIRNDKPNFREGLPDIGDFTQFWANLCEKPVQHNSEGMRLAEDER FT FSDGIEDMPVLVVNAQDIREATQYTRNGAAPGPDFVYNFWYKKLITIHEQI FT AACFNTVLEDSRKLPKFITGGVTYFLPKDQNTKNPAKYRPLTCLSNLNKVL FT SSVITQKVKDHCDTNNVMTEEQTGRRKNTQGCKDQVIIDAVIVGQAAKKQR FT NLDMAYIDYKKAYDSVPHSYLLKVLQLYKVDGNVIKLMQHAMGMWSTSLHV FT TDGKVVLRSRSLNIRRGIFQGDTFSTLWFCLAMNPLSRTLNQQCNFGYLLK FT SEEISTRITHTFFMDDLKLFAETVQKMHHLLKNVQGFSNDIKMEFGIGKCR FT SIHLHRGQVLDADSFRANEQEEIRHMVQGETYKFLGFLQLRGIHYAVIKKE FT LQDKFLHRVSCILKSFLSVGNKVKAINTFAVALLTYSFGVMKWSNTDLEAL FT ERTIRVVSTKHQMRHPKASVERVILPRKIGGVGIIDIQALCISQIHQLRSY FT FVESQNRHELYRTVYKADHGLSALHLAQQDYQLNCNIKTVDGKGATWKQKE FT LHGTHTHQLNLEHIDKVSSSTWLVRCDLFCETEGFMVAIQDRVIATWNYRR FT CILREDVEDRCRKCNSGGESIEHVIAGCPVLAGSAYLDRHNDVAKIVHQQL FT 
ALRHKLVERFLPCYRYLPDPVQENDCIKLYWDREIITDILIRANRPDILVY FT EKRKKRATIDIDIAVTLDHNVQTTFSTKVMKYHDLAEELKQTWYLEDIRIV FT PVIISATGIVPMALLRSLDELELQRELPRIQKAVILRTCSTLRRFLNPYN" XX SQ Sequence 3848 BP; 1202 A; 816 C; 947 G; 883 T; 0 other; gaaggctaac cacaataatg gagacacgat ctatgagaaa aaggaccaca cgattgcccg 60 aggagggagc ccctactgga gctgggcctg ggacgggaga cagagcaagc atccagcggc 120 tggaagatga aatggtgcaa gagcgttctt tcagtcaacg ggctctgccc gtaccacgaa 180 cgcaaaacag aaacggcagc cctataaatc accaaggcaa tgctgcatct gctaatgttg 240 ctgtggctga tagacaacag tcactcattt tggcaggagg ccgacggcag aggattatgt 300 ggacgagaga gatgaaccat tacgtgatcc gttgctacta tgtttacacg aggatggaga 360 cggacatgcc cggcagagtg aagatgctgg gtatgttcaa tgaccgtttc cctcggtttg 420 cgcatcagct tgacctgagc aagttgtata tacggcagcg agctattatt ttgcctgagg 480 aactggagtt catcaagctg gaggtgcgga gggaatttgg agaggaagag gcaggctggc 540 gcgagtcgag taggatttct gctaggctta atacaataga ccaaaataca tcaagggcga 600 gtgaggatcg tgatttggat gaacccaccg ctccaggatt gtcagtggat atccaacacc 660 agatggccac agctgttacg cagttccatg ggacagaccc cttgtctcgt caccgactac 720 caaaactgca ttattcttac cgcctgaaaa cagcagtaag catcataaac caagatgttc 780 tacctcagta tttggatagc gtagggagca ttgaggatct gcagttaatt gtgtattcgg 840 ctgcggtggc tgttgtacga acgctatggt tgcggaccta tccgcaagga gacagcgaag 900 gtcgaccatg ctccaaggct gaaaagcccg cctggatgcg acgtctagaa aaccggatca 960 acgcaacacg gacaaagatt ggtcgaatgc aggaatatca acggggaaat tcatctatga 1020 aggtggtacg tcagattgct gaaatggtta aacctaaaga actacgagac ctcactgatg 1080 ccaacataac ggaggtactc gacatccatt tacaacggtt gagtgccctt gcaaaacgat 1140 tacgacgtta tgctgaatgc tcgaagcgga aagaacaaaa tcgaatgttc aacattaacg 1200 agagagaatt ttacaactgg atccgaaatg ataagcccaa ttttagagaa gggctcccgg 1260 atattggcga ctttacacag ttttgggcca atctatgtga gaaacctgtc caacacaaca 1320 gcgaaggaat gaggttagca gaagatgagc gcttcagtga tggtatcgaa gacatgcccg 1380 tgctagttgt gaatgctcaa gacatacgtg aggcaacgca gtacaccagg aatggagctg 1440 caccaggacc cgattttgta tacaattttt ggtataaaaa gctaatcaca atccatgagc 1500 agatagcggc 
atgcttcaat acggtgttgg aagattcgag aaaactacca aaatttatca 1560 ccgggggagt tacttacttt ctaccaaaag atcaaaacac aaaaaatcct gcgaagtata 1620 gaccacttac ctgtctttct aacttaaaca aagtgctgtc gtcagtgata acgcagaaag 1680 tgaaagatca ttgcgatacc aacaacgtaa tgaccgaaga acagacagga cgtcgaaaaa 1740 acacgcaagg ctgtaaagac caggtcatta ttgatgcagt cattgttggt caagcagcca 1800 agaaacaaag aaatctggat atggcataca tcgattacaa gaaggcgtat gattcagtac 1860 cccattcata ccttcttaag gtactccagt tgtacaaagt agacgggaat gtcatcaagc 1920 tgatgcagca cgcgatgggt atgtggagta catctctaca cgttaccgac ggaaaagttg 1980 tactacggtc aagatcactc aatatcagga ggggtatttt ccaaggtgac acctttagta 2040 cgctgtggtt ttgtctagct atgaacccgc ttagcagaac actcaaccag caatgcaact 2100 ttgggtattt actcaaaagt gaagaaataa gcacgagaat cacccacacc ttctttatgg 2160 atgacttgaa gctgttcgca gaaacagtac agaagatgca ccacctgttg aagaacgtgc 2220 agggattcag caacgacatt aaaatggaat ttggtatcgg taaatgtcga tcaattcatc 2280 tacaccgagg tcaagtattg gatgccgata gcttccgtgc caacgaacaa gaggaaatcc 2340 gccacatggt tcaaggtgaa acttacaagt tcctcggttt cctgcagctg aggggtattc 2400 actatgcagt gatcaagaaa gagctacagg acaagttctt acatcgtgtt agctgtatcc 2460 tgaagagctt tttgtcagtc ggcaacaagg tgaaagcaat aaacacattt gcggtggctc 2520 tgttgaccta cagctttgga gtaatgaaat ggtctaatac tgacttggaa gcgttggagc 2580 gaacaattcg tgtggtttcc actaagcacc aaatgcgtca cccaaaagcg tccgtcgaga 2640 gagtaatcct gccacgaaaa ataggagggg taggaatcat tgatattcag gcactttgta 2700 tttctcagat ccatcagctg cgaagttact tcgtggaaag ccaaaaccga catgaattat 2760 accgcactgt gtataaagca gatcacggat taagcgccct gcatctagcg cagcaagatt 2820 accagctgaa ttgcaacata aaaaccgtcg atggaaaagg cgcaacgtgg aaacagaagg 2880 agttacatgg gacgcacacc catcaactga atctggaaca tatcgacaaa gtgtcatcta 2940 gcacttggct tgtgaggtgt gaccttttct gtgagacaga aggtttcatg gtagccatcc 3000 aagaccgggt aattgcgacg tggaactatc ggcggtgtat attgcgtgaa gacgtggagg 3060 accgatgcag aaagtgcaac tcaggaggag aatcgattga gcatgtcatt gccggctgtc 3120 cagtgctagc tgggtcagcg tatctcgatc gccacaacga cgttgccaaa attgttcacc 3180 agcagcttgc actgaggcac 
aagttggtag agcgattttt accctgctac cgatacctcc 3240 cagatccggt ccaggaaaat gattgcataa agctgtattg ggatcgcgaa attataacgg 3300 acatcctcat ccgtgccaat aggccagaca tcttagtcta cgagaaaaga aagaaacgag 3360 cgaccatcga catcgacatt gctgtaacgt tagaccataa tgttcagaca acattttcca 3420 ccaaggtgat gaagtatcat gatctggcag aggagttgaa gcagacgtgg tatctggagg 3480 atatccgcat tgttccggta atcatctcgg cgaccggaat tgtacctatg gccctcttac 3540 gttccctgga cgagctcgaa ctgcagagag aactacccag gattcagaag gcggtgattc 3600 ttcgaacatg tagcacttta agaaggttcc tgaatcccta taactaacat ccggtgcaaa 3660 ctcattaaca ttaagaaaag agagaggaga aatgagaatg agattcattc acctttggca 3720 tttgaatagc ccggggtagg tgaaaagttc ccagcatatt gctgagaagt gacaaaattc 3780 ggataataat aataataata ataataataa taataataat aataataata ataatatgca 3840 taataata 3848 // ID BEL-21_AG-LTR repbase; DNA; ANG; 227 BP. XX AC . XX DT 01-SEP-2010 (Rel. 15.09, Created) DT 01-SEP-2010 (Rel. 15.09, Last updated, Version 1) XX DE BEL-21_AG-LTR. XX KW BEL; LTR Retrotransposon; Transposable Element; BEL-21_AG-LTR. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-227 RA Fernandez Medina R.D., Struchiner C.J. and Ribeiro J.M.C.; RT "Transposable elements from Anopheles gambiae."; RL Repbase Reports 10(9), 1434-1434 (2010). XX DR [1] (Consensus) XX CC LTR belonging to Bel21_AG. XX SQ Sequence 227 BP; 76 A; 39 C; 54 G; 58 T; 0 other; tgttggaaac tcgtgagtgc ccaacagtct gtcaggtgac aacgctagag aagagagtga 60 gagggaggat aaaaacggta aattaggaga taaggagaag ggttggtccg ctcctaacca 120 gttagttgaa attcgacaat cggtacaagt ggactaattt tttgatttaa cctcatacca 180 cagtaaataa agtgttttct ctaaaaaatc caccgcgttt ttcaaca 227 // ID GYPSY11-LTR_AG repbase; DNA; ANG; 811 BP. XX AC . XX DT 03-OCT-2003 (Rel. 8.09, Created) DT 03-OCT-2003 (Rel. 8.09, Last updated, Version 1) XX DE GYPSY11-LTR_AG is an LTR of retrotransposon GYPSY11_AG - a DE consensus sequence. 
XX KW Gypsy; LTR Retrotransposon; Transposable Element; KW 4-bp TSD GYPSY11_AG; GYPSY11-I_AG; GYPSY11-LTR_AG; Gypsy clade; KW mdg1 lineage. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-811 RA Tubio M.J., Costas C.J. and Naveira F.H.; RT "GYPSY11_AG, a member of the mdg1 lineage of the Ty3/gypsy group RT of LTR retrotransposons in Anopheles gambiae."; RL Repbase Reports 3(9), 165-165 (2003). XX DR [1] (Consensus) XX CC GYPSY11-LTR_AG is a long terminal repeat of GYPSY11_AG (its CC internal CC portion is deposited as GYPSY11-I_AG). XX SQ Sequence 811 BP; 319 A; 192 C; 123 G; 177 T; 0 other; tgtagcatgc acatgcacat gctatactgt tttcaatcag ttcacacaca tttcttatta 60 cacttaataa acacaaaatt gtcaacacaa cacaacacac aaaaagacca aattccaacc 120 actcaagata ccaacacaac ccaaaccaca taaataacct tacgtcagga attgtaagtc 180 agcagaaaaa cccttatcca aaaacactta aaacacacaa gcgacacaca accccggagg 240 tgcgaaaaaa tcttacccag cacaaaacgc ttcaagtggg taaagcgtaa aaacacagca 300 ttgcataata gcaaccaaca gttaataaat aatgaaatag gagacaccgt acctcgtgtg 360 tacaaacaga accgtaagcg taggaaacaa aacaccgcgt aagcgtcaga accaagaact 420 gaccttacat aataagcaca accatatgta aactcaaaac ttgacattca gcataattga 480 cacaaaatag aaagtaaagt aaaaccaaat ctaactaaaa ggaaatgaat cgtataaata 540 aagatgtaag atgagaccga gtgctgctca gttacacaca gtacgcacag tgacacggtt 600 cagtggagcc gttcaagtct tatcgtaatt cgcacaaaca acacaaatat gttaaattaa 660 ttcattgtgt ccactcgtta cctggagccc tgatggctcc cagtgatcat ccgaaactaa 720 tgctttgttt ttgacgaatt gacctcgcga gtgatgttta atccttctgc accgctgctc 780 taacaattgc aacctgatac agaatattac a 811 // ID Clu-11_AG repbase; DNA; ANG; 899 BP. XX AC . XX DT 03-SEP-2010 (Rel. 15.09, Created) DT 03-SEP-2010 (Rel. 15.09, Last updated, Version -1) XX DE Putative non-autonomous DNA transposon: consensus. XX KW DNA transposon; Transposable Element; nonautonomous; Clu-11_AG. 
XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-899 RA Fernandez Medina R.D., Struchiner C.J. and Ribeiro J.M.C.; RT "Transposable elements from Anopheles gambiae."; RL Repbase Reports 10(9), 1435-1435 (2010). XX DR [1] (Consensus) XX CC TA TSD. >98% identical to consensus. XX SQ Sequence 899 BP; 287 A; 161 C; 175 G; 276 T; 0 other; tagcggaatt tggggcaagt gtgccatacg gggcaagagt gccaccaagc atttttcctt 60 aaaaacttta gtttaatgat caattgtgta tacagttggt tcctttccaa tgactaggta 120 tactgagccg aatttcatat agaccaatcg atatttccca ttgttttgaa ctgatttata 180 ttgagtttca aaattgagca gaatattata attttttaat ccaaccaaat aagctctcaa 240 aaatcatgcg aacaatcaac cgagaaaata tttacaccac cgtatagctg agaaaacgtg 300 aaattttttg tggggttagt tgaaaaaaac tttctttttg tcactgaaaa attcaatttc 360 aaaaaagtta caacttttgg ggcaagtgtg ccatctctgt ttggggcaag tgtgccacca 420 tctttcatag gcgcgagctg tcaaaaatgt aaacattgtt gtagcgtgct gcggaatggt 480 gcgctattgt gagcggattc cacgccagaa aaacagaaca gtaacaaacc atattgtggt 540 acagttggtg gtcaaaaagg gttcattggt gactaaatac atgtaggaat acatacacta 600 gtggtacaaa cgtaataatt tccaattatc taagaaatac ggttatttta accaggtggc 660 cgtcttgccc catacgagtg gcactcttgc cccatagccc aaaaaacaac gtctttttgg 720 acccttttta aaacgcttca aaaactgttt tatttgcact tttttcaagc gaaactcatt 780 tataagtgag gaaatagatg taaaatggtt atgagattta aatcggcttt tgaaaaacac 840 gttttagtga gttataactt gaaatgctta aggtggcaca cttgccccaa gttccgcta 899 // ID BEL5-LTR_AG repbase; DNA; ANG; 210 BP. XX AC . XX DT 08-MAY-2003 (Rel. 8.04, Created) DT 08-MAY-2003 (Rel. 8.04, Last updated, Version 1) XX DE BEL5-LTR_AG is a long terminal repeat of the BEL5_AG LTR DE retrotransposon - a consensus sequence. XX KW BEL; LTR Retrotransposon; Transposable Element; 5-bp TSD; KW BEL5-I_AG; BEL5-LTR_AG; BEL5_AG; Bel clade; KW reverse transcriptase. 
XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-210 RA Kapitonov V.V., Pavlicek A. and Jurka J.; RT "BEL5_AG, a nonautonomous family of Bel/Pao-like LTR RT retrotransposons from African malaria mosquito."; RL Repbase Reports 3(4), 70-70 (2003). XX DR [1] (Consensus) XX CC BEL5-LTR_AG flanks an internal portion of BEL5_AG (deposited as CC BEL5-I_AG). XX SQ Sequence 210 BP; 73 A; 50 C; 43 G; 44 T; 0 other; tgttcgcgca acgcgaatgt tctcgcaacg cgaatgatga gaactcaacc cgtttgtaaa 60 cacaacaaca catgtgtcat ttgcgcacag acggagaaac gggaagcggg tgtcaaaaaa 120 ccacatggcg atacagtagc agcgaataaa gaactctaca tttttctaca gcaaaaaaaa 180 tccagtgtta tcacttgagt tatcccacca 210 // ID Clu-137B_AG repbase; DNA; ANG; 1168 BP. XX AC . XX DT 04-SEP-2010 (Rel. 15.09, Created) DT 04-SEP-2010 (Rel. 15.09, Last updated, Version 1) XX DE Putative non-autonomous DNA transposon: consensus. XX KW DNA transposon; Transposable Element; Clu-137B_AG. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-1168 RA Jurka J.; RT "Transposable elements from Anopheles gambiae."; RL Repbase Reports 10(9), 1450-1450 (2010). XX DR [1] (Consensus) XX CC 2bp TSD. >95% identical to consensus. 
XX SQ Sequence 1168 BP; 376 A; 190 C; 189 G; 405 T; 8 other; cccaagtagc cttaagaaac tttatttgat tattaacagt gttttaaagc ctttaaaaat 60 caaatagagt ttcttaagag aatgtgacat ttgtcagcgt cacagcataa cgacgtattc 120 cgttaccctg ttttttttcg attcgggcca atataacaca cagtgttttt caaattttat 180 ctacaccagt ttcattgttt caatattcgg aaaattattt ttttattctt atttatgtaa 240 taatgcgcat ttattttcac aacaanatat ttgtttgttn aacggtacgn aaaatatgtt 300 tttctttttt tccttnactt gaatctacgg tgctaataaa aaaaaaataa taaaaaaata 360 atgtttcaaa nttttnaata ttatatgcga tgagtttgat gaaggatatg ctaatgagat 420 aaatnttcat aaaaaaatat tttttgtaca aaataatata ctcttgatga tgataatttt 480 tgagacatac tgtacatggt cacgtacatg gcgcctccat ttcttatgtc aaaaagttcg 540 tcaagcttcg tgtgtattgt gtgtagatgg caacaaaacc aagcgtctgc gacaatttct 600 atcagctatc gtttaatttg tgtatcatca tgcttttaca acaggtcagt gtatcgatat 660 taccaatctc ttcacatgag ctacataagt tttatatttc gcagcatttt cgattcggat 720 agcgttcgga catcatgttt tccgtccacg gctgcagcta acctccatgc taaattgatg 780 gaagaaaccc atggaatgaa tgaaacttat gtgatacagt gacaatatgc agcggcaaaa 840 cgattaaatg tcgaataaaa gcattcgtgt tgcatttatg catttattcc atgaaataat 900 aacacccatt gttgtaactt aacgaagatc tagaaaattc gtatagaaag agaaacagca 960 atacagccat ccctgtcgat attttcgtta attttgatct ttttcgttga aaataccggc 1020 taaattttaa attttaacga aacattgacg aagccaggaa ngcgttacgc agtgttttaa 1080 gccccgaaat tactgctgta tggaaagcta aaagtaatac ttttaggcat acttttaacg 1140 gtttcttaac aggattgtgc tacttggg 1168 // ID GYPSY39-LTR_AG repbase; DNA; ANG; 353 BP. XX AC . XX DT 16-APR-2004 (Rel. 9.03, Created) DT 16-APR-2004 (Rel. 9.03, Last updated, Version 1) XX DE GYPSY39-LTR_AG is an LTR of retrotransposon GYPSY39_AG - a DE consensus sequence. XX KW Gypsy; LTR Retrotransposon; Transposable Element; KW 4-bp TSD GYPSY39_AG; GYPSY lineage; GYPSY39-I_AG; GYPSY39-LTR_AG; KW Gypsy clade. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. 
XX RN [1] RP 1-353 RA Tubio M.J., Costas C.J. and Naveira F.H.; RT "GYPSY39_AG, a member of the Gypsy lineage of the Ty3/gypsy group RT of LTR retrotransposons in Anopheles gambiae."; RL Repbase Reports 4(3), 71-71 (2004). XX DR [1] (Consensus) XX CC GYPSY39-LTR is a long terminal repeat of GYPSY39_AG (its internal CC portion is deposited as GYPSY39-I_AG). XX SQ Sequence 353 BP; 106 A; 85 C; 82 G; 80 T; 0 other; agttatgtac acatgcgcta cgcgctatga tacaggtgct gagtaagaaa acggtcgcaa 60 gcgagccatc gatgttactc cacgcgcgga gtcaagttcc aacgggaact ccactggaag 120 caggttccca cgaccaggtg tggtgtcgta tgttcagacg gaccgaggca acatgatcat 180 cataaacgtg gacgacaacg tgaccggcaa acctggcagc accgaaaaca ccagcctagc 240 taaatagtac cgaatccaga agttagcttt agtcttagtt tagcagttcg caaataaaga 300 tccccagtaa tgtttttttt taaaacttac tccgggctat cgtaaacata att 353 // ID HAT1_AG repbase; DNA; ANG; 3702 BP. XX AC . XX DT 03-APR-2003 (Rel. 8.03, Created) DT 03-APR-2003 (Rel. 8.03, Last updated, Version 1) XX DE HAT1_AG is a hAT-like autonomous DNA transposon - a consensus DE sequence. XX KW hAT; DNA transposon; Transposable Element; 8-bp TSD; KW Autonomous DNA transposon; HAT1_AG; HATN4_AG; hAT superfamily; KW transposase. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-3702 RA Kapitonov V.V., Pavlicek A. and Jurka J.; RT "HAT1_AG: a family of autonomous hAT-like DNA transposons from RT African malaria mosquito."; RL Repbase Reports 3(3), 58-58 (2003). XX DR [1] (Consensus) XX CC HAT1_AG is a young family of autonomous DNA transposons that CC belongs CC to the hAT superfamily. HAT1_AG elements are less than 1% CC divergent CC from the consensus sequence. The genome harbors 6 HAT1_AG CC elements. CC HAT1_AG has imperfect 11-bp terminal inverted repeats (3 CC mismatches). CC HAT1_AG encodes the 603-aa HAT1_AGp transposase (pos. 1473 - CC 3284). 
CC The N-terminus of HAT1_AGp (pos. 10-50) is composed of the CC BED zinc finger. XX FH Key Location/Qualifiers FT CDS 1473..3281 FT /product="HAT1_AGp" FT /translation="MMAPTNATTSPVWDHFSPVETGAKCLYCLKVFKYTKG FT TTSNLKRHLNLVHKTVPYLKQKQPIPQTINIDDEAGPSAVNFQPSNQYFNS FT NMSIQGYLKKPINSETKKVLDRMLLDLICKECLPFNLVESEIFKKFVYTLN FT PNYIMPTRKSLSNALLPSVYNQEFEKAKEKLSTAKAIAITSDGWTNLNQIS FT FFALTGHYIDENCKLSSILIECSEFENPHSGRNIANWIQGTLNKFDIEDKI FT VAMVTDNASNMKAASTELNFCHIPCFAHTLNLIVRDAIKKSVLPVVEEVKR FT VVMLFKKSPKASQMLADTQKKLNLDQLKMIQEVSTRWNSGYDMLNRFYKNK FT IALLSCADSLKMKISLESHDWEAIEQIVRVLKYFYSATNIVSAQKYITISH FT VGLLCNVLLTKTSQFRNDEDIAENIQNLVALLIEGLQNKLKIYRSNEQILK FT SMILDPRIKQLGFQDDVEKFKNICESIISELLPLQKPAVEVEKVVKKVSKD FT VDMLFGDLLKNKGAQNYKTPRQIAENELHQYLSVENIDLENDPLLWWKEHQ FT VLYPSLYTLAMSTLCIPGTSVPCERLFSKAGQIYSEKRSRLAPKKLQEILF FT IQQNA" XX SQ Sequence 3702 BP; 1286 A; 637 C; 686 G; 1093 T; 0 other; tagagttgtg cctcaagaac cagaactgta cgattcattc aattcatcct aaagaacgaa 60 cgacctacgt tcatctaatc aacgttcatc gattctttat aaattataat gtattgagtt 120 gttcatgaaa ataacccggg taggaacccg tcgaaatgaa gaacgtccta gtacgccagg 180 tcgttcattc cccccataca aatatgaacg tgaatcgtat gaccaggacg ttcacgaaaa 240 gaacgtccta cccgtcgaaa taaagaacgt cctagcaagc caggtcgttc attcccccca 300 ttcaaaatga acgtgaatcg tatgaccagg acgttcacga aaagaacgtc ctacccgtcg 360 aaataaagaa cgtcctagca agtcaggtcg ttcatttttt cccatacaaa tatgaacgtg 420 aatcgtatga ccaggtcgtt cattctcctg atacaaaaat gcgctgtccc acatgataga 480 catgtgtcga tgaaacaaaa tcagcatact ttctcactca gcccaacaaa acatccgatc 540 tggcattctg cactggcttt ctgcggagag cacagtacaa gataagagag cggatagaat 600 gtttcgttga tctacatcgt tcactctctt tctgatcctc ggagacgatg ccgcttctgc 660 aatatgcgct ctctttttct tcccgctgca atactttcag ctgatagtta ccgagagaac 720 gatcacagag atagagagtc aaaaaaatga cgaggacggg gggtggtatg tgggatgagt 780 ttttttacct acttgtgagc gaggtacagt caaagttcaa taattgaacg aaattttggt 840 ttgtatattt ttcgatggat atttgatagt aggttgatag atcaactaaa aagcatgcct 900 ttatattagt gaatctaatt tcatagaaaa tcgcaaaact gtcgcctata 
atgaaaaaaa 960 aacttcattt attttgtgca ggaaagaacg tgaaaattta tcaccaaacg tgacactaac 1020 caaacagtga gctgccaatc gaagttcaat tagtgagttt gaactgtact agtcgttttg 1080 tttgagagag aaatgacctt ctcgcgcttt gaggaagaga ggaaaattac actaatacgt 1140 ttatggtcac taaatccata tgataaatac atagatcacg tgtgttcggt tcgttgttgt 1200 ataggtgatt ttgaccgtta tttttgaata cgttcgattt cttcaatgtg ttgtaggctt 1260 atggtaataa acattgctga aaccgtgatt gtattgataa attgatcttt gatatgccat 1320 ttgttgtgat cttttgactt gtaaaccaca aattgatcta cgctccatcg aaataagtga 1380 aaatttaata attgcacggc gatcaaaggt aacattagtc ttggtagtta aaaacaaaaa 1440 cataaagtgt tgtatgaaac atttccacag atatgatggc tccaacaaac gcaacaacaa 1500 gccctgtctg ggatcatttt agtccggtag aaactggcgc aaagtgcctt tattgtttaa 1560 aggtgtttaa gtatactaaa ggaactactt cgaacttgaa gcggcatttg aatttagtgc 1620 ataaaactgt gccgtattta aagcaaaagc aaccaattcc tcaaactatt aacatagacg 1680 atgaagcggg accttctgct gtaaactttc agccatcaaa tcaatatttc aattcaaata 1740 tgagcataca gggttatctg aagaaaccca ttaatagcga gactaaaaag gttttagata 1800 gaatgttgct agatctaatt tgcaaagaat gtttgccatt taatttagta gaaagtgaaa 1860 ttttcaaaaa attcgtttat acattaaatc cgaactatat tatgcctaca cgaaaaagtt 1920 tatcaaacgc cctactacca agcgtatata atcaagaatt tgaaaaggct aaagagaaat 1980 tatcgaccgc caaagctata gctattacgt cggatggatg gacaaacctg aaccaaataa 2040 gtttttttgc cttaacaggt cattatatcg acgaaaattg caaacttagt tctattttga 2100 tagaatgctc ggaatttgaa aatcctcata gtggtaggaa tatagctaat tggattcaag 2160 gtaccttgaa caaatttgac atagaggata agattgttgc aatggttact gacaatgctt 2220 ccaatatgaa agcggcatca actgagttga atttttgtca cataccatgt tttgcacata 2280 cgttaaattt gattgttcga gatgctataa aaaaaagtgt gctaccagtt gtagaagagg 2340 taaaaagagt agtaatgtta tttaagaaaa gtccaaaagc ctcacaaatg ctagctgata 2400 cacaaaaaaa gctcaattta gatcaattga aaatgataca agaagtgtca acgcgatgga 2460 attcggggta tgatatgctt aatcgatttt ataagaacaa aattgcatta ctctcctgtg 2520 cagatagttt gaaaatgaaa atatctttag aatctcatga ttgggaagca attgaacaaa 2580 ttgtgagggt tctaaaatat ttctattctg ctacaaatat tgtatccgcc caaaaataca 
2640 taaccatttc acacgtggga ttactatgca atgtgctgtt aaccaaaaca tcacagttta 2700 gaaatgatga ggatatagca gaaaacattc aaaatttagt agctttgctc attgaaggtc 2760 tacaaaacaa gctaaaaatt tatcgttcca atgagcagat acttaaatct atgatattag 2820 atcctaggat taaacaactt ggctttcaag acgatgtgga aaaattcaaa aacatatgtg 2880 aatcaattat atctgagctg cttccgttgc aaaagcccgc agtagaagtc gaaaaagtag 2940 taaaaaaagt tagcaaggat gtggacatgc ttttcggcga tttattaaaa aacaagggag 3000 ctcaaaacta caaaacacct agacaaatag ctgagaatga actacatcaa tatttaagcg 3060 ttgaaaatat tgatttagaa aatgatccgc ttctttggtg gaaagaacat caagttcttt 3120 atccatcatt gtatactctt gccatgagca ctttatgcat tcccggaacg tccgttccat 3180 gtgaaaggct tttttctaaa gcgggacaga tatactctga aaaaagatct cgactagcgc 3240 caaaaaaatt gcaggaaata ttgtttattc aacaaaatgc ataatcaaaa cgtcgttgca 3300 aaatttgtaa catggagtat atgaaataat tgttttgtaa tgcacttttt gtaattaagt 3360 aagacttgaa ttatctagat gtaagtgtta tttttgtaat ataaaaaatg aattcaaaat 3420 ggtgttaaaa tgtgttggtt gccttaatca taatagatgc atcacttaat aacataatta 3480 ggaatacacg agttttttag ttcagtgact aagaactgtt atgcttatta tatgaaactg 3540 ttaacaaaca cccataatat gaaactatcg cgaaataaga tcgatttacc atactgtggt 3600 tagttgacgt gaacgattga atcgcgattc acgctcagtt catgttaatg aaagaaccgg 3660 taaattcacg ttcatggtaa tgaatcgaat tgcccaaccc ta 3702 // ID GYPSY24-LTR_AG repbase; DNA; ANG; 192 BP. XX AC . XX DT 05-FEB-2004 (Rel. 9.01, Created) DT 05-FEB-2004 (Rel. 9.01, Last updated, Version 1) XX DE GYPSY24-LTR_AG is an LTR of retrotransposon GYPSY24_AG - a DE consensus sequence. XX KW Gypsy; LTR Retrotransposon; Transposable Element; 5-bp TSD; KW GYPSY24-I_AG; GYPSY24-LTR_AG; GYPSY24_AG; Gypsy clade; KW mag lineage. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-192 RA Tubio M.J., Costas C.J. 
and Naveira F.H.; RT "GYPSY24_AG, a member of the mag lineage of the Ty3/gypsy group RT of LTR retrotransposons in Anopheles gambiae."; RL Repbase Reports 4(1), 14-14 (2004). XX DR [1] (Consensus) XX CC GYPSY24-LTR_AG is a long terminal repeat of GYPSY24_AG (its CC internal CC portion is deposited as GYPSY24-I_AG). XX SQ Sequence 192 BP; 53 A; 39 C; 27 G; 73 T; 0 other; tgttatatat gaaccctggt atttgacagt tgtgtcataa cgcctgacag gttgacccta 60 cctttcgttt gttgttaaat gtcattcctt ctttacacct tattgaattc gtttggttac 120 tttttgctac aactccttaa acaaacctca tgtgaacttt tctgcctaat aaagagaatt 180 aaatacataa ca 192 // ID RETRO99_AG_LTR repbase; DNA; ANG; 285 BP. XX AC . XX DT 06-FEB-2003 (Rel. 8.01, Created) DT 06-FEB-2003 (Rel. 8.01, Last updated, Version 1) XX DE Anopheles gambiae long terminal repeat from RETRO99_AG DE retrotransposon - a consensus. XX KW Gypsy; LTR Retrotransposon; Transposable Element; KW Long terminal repeat; MAG; NINJA; RETRO99_AG_I; RETRO99_AG_LTR; KW retrotransposon. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-285 RA Jurka J. and Drazkiewicz A.; RT "RETRO99_AG_LTR: LTR retrotransposon from Anopheles gambiae."; RL Repbase Reports 2(12), 25-25 (2002). XX DR [1] (Consensus) XX CC Related to NINJA from Drosophila simulans and MAG from Bombyx CC mori. CC 5 bp target site duplication. XX SQ Sequence 285 BP; 88 A; 48 C; 71 G; 78 T; 0 other; tgttaggaaa aggcagtgct atagatgagt ccaacccgga cgtataaacg aaatgaagtg 60 aaatgacagc taaggaaagt gacagttgaa atgacaggga aacggttgca cgcgcttcaa 120 gccagtagcg agtgaagcga gcagcaggtg acaccgtttt aaagtttaat tgaagctcgg 180 tattgctatt ttatttggat ttgtatttgt cttcaggaga ataaatagtt aaactaaaac 240 tttcgccttg tggctgttcg ctctcgcact caaactgtgt ttaca 285 // ID RTAg3 repbase; DNA; ANG; 6448 BP. XX AC AB090812; XX DT 14-SEP-2005 (Rel. 10.09, Created) DT 24-SEP-2010 (Rel. 
15.1, Last updated, Version 2) XX DE Anopheles gambiae retrotransposon RTAg3 DNA, complete sequence. XX KW R1; Non-LTR Retrotransposon; Transposable Element; RTAg3. XX NM RTAg3. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-6448 RA Kojima K.K. and Fujiwara H.; RT "Evolution of target specificity in R1 clade non-LTR RT retrotransposons."; RL Mol Biol Evol 20(3), 351-361 (2003). XX DR EMBL/GenBank/DDBJ; AB090812; Positions 1 6448. XX FH Key Location/Qualifiers FT CDS 1206..2738 FT /product="RTAg3_1p" FT /translation="MMASSGMSTRASARSASVDCRSSLASGSKLFAPEPRV FT ALPRVSVTGINKPTVATKAASTTPELELLKATIQQLEEQNLEMKEQNFRLA FT EQITRMCQLLQEEKEEAKRREEKLKAQMEKLAAAHQRDRNLLNSLLAAKVA FT GGQPSASSRQPPTPLPRRSSAQPQQQQQQQQRNQQEQEQPRASTSHAVMLP FT RSEASTAVRGDVVPELTFSEVVRRRYRGKATGKPRSQQQPQQQQQPQQKQQ FT QLQRRQQQQQQHQGQRYVPPQLRQQAHQQQQRQQQKVRPRPDKIEVVPSAG FT HSWYTLYKTVRDAVKQDPHKGLADHLKMGKRSHAANFRMELSNSANASLVR FT AEGQEIVGDAGLARVITDMADVLITNIDPLATEEDIKKAIEREHQEPIEIV FT RVSVWELQDGTQRARVHLPAKAAAAFEGSKLRLCGCISKIRGVEKAAPERQ FT RCYRCLERGHLAHACRSSTDRQQLCIRCGSEGHKARDCSSYVKCAACGGPH FT RIGHMSCEHPASRST" FT CDS 2669..6187 FT /product="RTAg3_2p" FT /translation="MCGLRWTSSHRAHELRTSGFAFNLKVMQINVDHCQAG FT QDLALQAAREHRADVLLLSDIYRPPANNGRWAYDAAKSVAVVATSSYPIQR FT VLRSAVPGIVAAQIGGIVFISCYARPRRPEEDYEGFLAAVQLEASTHSQVV FT IDGDFNAWHTEWGSARNSQRGEDLLQLIQSVQLQVINSGNEPTFIGRGAAT FT SSVIDVCFATPSIARPETWEVHEFARSDHQLITYSVGEAEQQSRGLSTGGP FT SAGRQRVICAGRRWITTQFHVDSFRSALEDVNFAEQATTHAGLVDAMVDAC FT DIVMQRAPNVLQHQHRDVYWWAPVIEELRNECIAARERMRLTTDLQERSLA FT AAEHRTAKTRLEKAIKVGKRAEFAKLIDIAEENELGVGYQVVLSHLRGSRV FT PPETDPVELGRIVTDLFPTHPPVYWPETDDVDSGASDFDRVTPEELQEIAA FT HMAIRKAPGLDGIPNAAVKVAIEKYPGVFCRVYQDCLNTGTFPQQWKRQRL FT VLLPKPGKAPGESSSYRPLCMLDALGKVLERLILNRHLEDPDSPQLSDAQY FT GFRRGRSTISAIQSLVDAGKASRSFGQTNNRDKRCLMVVALDVRNAFNTAS FT WQSIANALREKGVPSGLLRILQSYFTDRELIFNTSEGPVVRCVSAGVPQGS FT 
ILGPTLWNVMYDGVLRIPLPDEAKVIGYADDLVVLAPGRTPEESAAVAEAA FT VSAVDQWMQQHHLELAPAKTEMTIISSLKHPPSHISIDVRGTAVPYSRSIK FT HLGVLVHDHLSWIPHVTAVTQRAVQIAQAVGRLMPNHRGPKMSKSRLLAAV FT ADSVMRYAAPVWHEALNTRECRRLLERVQRKSAIAVARTFRTVRYETAVLL FT AGLLPICRAICEDTRVHSRRGTAAGAQLRKEERETTIAEWQATWDSDAAGH FT QASGYVRWAHRLIPDIGAWQSRKHGEVNFHLSQIISGHGFFRKYLADMKFT FT SSPDCPNCPGVRESAEHAMFACLRFAEVRERLMDGVNPDTLLTHMLQSQQN FT WSNVCEAAKQITTILQREWDDFRTSLFEQGVLADNAQLRNADVLRQERLRR FT YNETRNAARRAATQQRQAERLPPPPPSPRTERRREVNRLAVARLRERQRAA FT RAEMHGAYQPAPSNDSDDDDDDVENRQATNAASTSEAARTAAEESRVGLTE FT AEAAAAVEAELSSR" XX SQ Sequence 6448 BP; 1605 A; 1761 C; 1882 G; 1200 T; 0 other; gtgtacgccc cccccccacc tccggttgcg ctagaattcc gggtgtggaa atttggggaa 60 aaatttgtgt tttcggcccc ggaaacactc aattttagtg aaattggtgg tgctgaatcg 120 ttttatcgtg tttttgagac gatctgagca gtgcaaagtg tcaaaaagtg tttgtgtttt 180 tttccccata cattttgtat gggaaacttt gtatacacat atttcagtga aaactgcacg 240 gattttgtcc ggatgtgtct tgatagttgc gcgtagtgac gttccaagtg ttgtgcgaga 300 aaacggacgc gaaatcggcg gaaaaacgca aaaataaaaa agtgtcgggg gtgctaccgc 360 cggagcacgt gttttcgtaa aaacggaata aatagttaat cgtgaataac tccgtgagtt 420 ttggtccgaa tcgagtacgg ttttcaccgt tgtgctagtt ttaacgcgta caaaaagatc 480 caatcaaaaa actagtgaaa atcattaaaa aaaatttgac atttctgggt gacagttcag 540 cgatacccgt aaagcaaatt tggtttggga agcagttccc gaacagcaaa atttgaattt 600 accgttagtg ctcgaaaact gctcaaatcc tggaaagtgt gcgtgaaatt cagtgaagtg 660 gtgtagcgtc agcgggagct cgaatctgct cgattcgagg cactcaaaaa ctgctcgaat 720 ttttcgaatc tgctcgaaaa attcggaaaa ttcggagaag ctgtagggtg gtagagggcg 780 acataacaaa cacgggaaaa ctattttcca ccctttcccc ctccctccct cccggtgaaa 840 agtccctaaa gcaaaaaact aggtccaaaa aagggtcgaa aaagcgggca gcgcagtttg 900 gacgtaaaaa cgtgccggga aatttcggga ttggtgcgga tccattcctt acgtaagcga 960 cgcaccctct cgtaattttc tccacccggc acccctcccc cccagcccac caggggggtt 1020 gcaagtcgga ccgtagtgga aaaaccggcg tggaagaagg attctacagc agcgagtgag 1080 cggcagaaca tcagctagac ggcgctgcat acctggggtc atcccgaccc cccggggtca 1140 accgaccccc ggggtcacat cgacccccag 
ggttatctaa cccaccaact tgtacgtcgg 1200 caaagatgat ggcatcatcc gggatgtcaa cccgcgcgag tgcgaggtcg gcctctgtgg 1260 attgccgtag cagcttggca tccggttcga agctgtttgc gcccgagcct cgagtggcac 1320 tgccaagggt tagcgtcaca ggcatcaaca agcccaccgt agcaaccaag gccgcctcga 1380 ccacaccgga gctcgagttg cttaaggcaa caattcagca actggaggag cagaacttgg 1440 aaatgaagga gcaaaatttt cgcctcgcgg agcagataac tcgcatgtgc caactgctgc 1500 aagaggagaa ggaggaggca aaacgtcgag aggagaagtt gaaggcacag atggagaagc 1560 tcgctgccgc acatcagcgc gaccgaaact tgctcaactc gctactggca gcaaaggttg 1620 ccggcggaca accgtcagct agttcgcgtc aacctccaac tccgttgcca cgccgatcct 1680 ctgcgcagcc gcagcagcaa caacaacagc agcagcggaa ccagcaagag caggagcagc 1740 cccgcgcgtc gacgtcgcat gcagtcatgc tgccgcgtag cgaggcatcg acagccgtcc 1800 gcggagacgt cgtgccggag ctcacattca gtgaggtggt gcgtcgcagg taccgtggca 1860 aggccactgg caagccacgc tcccagcagc agccacaaca gcagcagcag ccgcaacaga 1920 agcagcagca gcttcagcgt agacagcagc agcagcagca acatcaggga cagcggtatg 1980 ttccgccgca actccggcag caagcacatc agcagcagca gcggcagcag caaaaggttc 2040 ggccaaggcc ggacaagata gaggtggtcc cgagtgccgg acactcctgg tacactttgt 2100 acaaaacagt tcgggatgcg gttaaacaag acccgcacaa gggccttgca gaccacctta 2160 aaatgggcaa gcgcagccac gccgcaaatt tccgaatgga gttaagtaat tcggccaacg 2220 ccagcctggt ccgcgcagaa ggtcaggaga tcgtcggtga cgccggactt gcccgggtga 2280 tcaccgacat ggccgatgtc ctgataacga acatcgatcc tctggcaaca gaagaggaca 2340 tcaaaaaggc cattgagaga gaacaccaag agccgatcga aatcgtccga gtgagcgtgt 2400 gggagcttca agatggcact cagcgagctc gagtccacct gcccgctaag gctgccgctg 2460 ctttcgaggg gtcgaagctc cgactgtgtg gctgcattag caaaatcaga ggcgttgaaa 2520 aagcagcacc tgagcgccag cgctgctatc gctgcctgga gcgcggccac cttgcccacg 2580 catgccgttc ctcgacagac cgtcagcaac tctgcatccg gtgtggcagt gaaggtcaca 2640 aagcccggga ctgctccagc tacgttaaat gtgcggcctg cggtggacct catcgcatcg 2700 ggcacatgag ctgcgaacat ccggcttcgc gttcaactta aaagtgatgc aaatcaacgt 2760 ggatcactgt caagcagggc aggacttggc gctccaagca gcgcgtgaac accgtgctga 2820 cgtcctgctc ctgtcggaca tctaccgacc accggcgaac 
aatgggcgtt gggcgtatga 2880 tgccgccaaa tcggtagcag ttgtggctac aagttcctac ccaatccagc gggtgttgcg 2940 cagtgctgtg cctggaattg tagccgcaca gattggcggt atcgtcttca taagctgcta 3000 cgcgcgaccg agacgcccag aggaggacta tgaaggcttt ctggccgcag ttcagctgga 3060 ggcatcaacc cactcccagg tcgtcatcga cggcgatttt aacgcctggc acacggagtg 3120 gggtagtgcc aggaacagcc agagaggtga agatctgctg cagcttatcc agagcgttca 3180 gctacaggta atcaactccg ggaatgaacc cacattcatt ggcagaggag cggccaccag 3240 cagcgtcatt gacgtctgct tcgccactcc gtccatcgct cggccagaaa cgtgggaggt 3300 gcacgagttt gcccggtccg atcatcaatt gattacatac agcgttgggg aagcggaaca 3360 acagtcccgc gggttgtcga ctggtggtcc gtcagccggc cggcagcgcg ttatctgcgc 3420 tggtaggcga tggattacca cgcagttcca cgtagacagc ttccgttctg ctctcgagga 3480 cgtgaacttc gcggaacaag cgacgacaca cgctggccta gtcgacgcta tggtcgacgc 3540 gtgcgacatt gtcatgcagc gggcccccaa cgtgttgcag catcaacatc gcgacgtcta 3600 ttggtgggca ccggtaattg aagagctgcg gaatgagtgc attgcggcgc gtgagcggat 3660 gcgcctaacc accgatctgc aagagaggag tctcgccgca gccgagcacc ggactgcgaa 3720 gactcggcta gaaaaagcca tcaaagtagg caaacgtgca gagttcgcca agcttataga 3780 catcgccgag gagaacgagc ttggagtggg gtatcaggtt gtcctgtctc atctgcgcgg 3840 cagtcgtgta ccgcctgaga cagacccggt cgagctggga cggatcgtta ccgatctgtt 3900 ccccacccac ccaccggtct attggccgga aaccgacgat gtcgattccg gagcgtccga 3960 ttttgatcgc gtgacccccg aagagctgca ggagatcgcg gctcatatgg ctatcaggaa 4020 agcgccagga ctggacggga tccccaatgc tgcggtgaag gtcgcgattg agaagtaccc 4080 gggggttttc tgccgcgtgt accaggactg cctcaacact ggtacgtttc cgcaacagtg 4140 gaaacggcag cgcttggtac tgctgcccaa gccaggcaaa gcccccggag aaagcagctc 4200 ctacaggcca ctgtgcatgc tggatgcact aggcaaagtg ttggagcggc ttatcctcaa 4260 cagacacctc gaggacccgg attcaccgca gctctcggac gcgcagtacg gctttcgtcg 4320 cggacgatcc accatcagtg ccatccaaag cctggtggac gcaggcaagg cgtcccgatc 4380 gttcggccag actaataatc gcgacaagcg atgcctgatg gtggttgcgc tggatgtccg 4440 caacgcattt aatactgcca gctggcagtc gatcgccaat gcgttgcgag aaaagggggt 4500 cccttcaggg ctgctgcgga tactgcagtc ctacttcacg gatcgggagc 
tcatctttaa 4560 caccagcgag ggacccgtcg tacgttgcgt cagcgcggga gttccacaag ggtccatact 4620 gggcccgaca ttgtggaacg taatgtacga cggggtgctg cggattcccc tacccgacga 4680 ggccaaggtc attggctacg ccgatgatct tgtcgtcctg gccccgggta ggacaccgga 4740 ggagtctgca gcagtggcgg aggcagcggt gtcagcagtc gaccagtgga tgcagcagca 4800 ccacttggag ctggcaccag ctaagacgga gatgacgatt atctcaagtc tgaagcatcc 4860 tccaagccac atctccatcg acgtgagagg aactgctgtc ccatattcga ggagcatcaa 4920 gcacttgggt gtactggtac acgaccatct atcgtggata cctcacgtga ctgcagtgac 4980 gcagcgggcg gtccagattg cgcaggcggt tggtcgactc atgccgaacc accgggggcc 5040 gaagatgtca aagtcccgac ttttggcagc ggtggctgac tcggtgatgc gttacgccgc 5100 acctgtatgg cacgaggcgc tgaacactcg cgagtgccgc aggctgctag agcgagtcca 5160 gcgcaaatca gctatcgccg tggcccggac gttccggacg gttcggtacg agaccgcagt 5220 gctgctcgcg ggacttctgc cgatctgcag agcaatctgt gaggacacca gggtgcacag 5280 ccgccgtggg actgcagccg gtgcacaact gaggaaggag gaacgtgaga cgaccatcgc 5340 cgagtggcaa gcaacatggg acagcgatgc agctggtcat caagccagtg gttacgtccg 5400 gtgggcgcac cgccttatcc cggacatcgg cgcatggcag tcgcggaagc atggagaggt 5460 gaactttcat ctgtcgcaga tcatctccgg tcatggattc ttccgtaagt atcttgcgga 5520 tatgaaattc acctcgtccc cggactgccc aaattgccct ggcgtaagag agagcgccga 5580 acacgcgatg ttcgcttgtc tgcgcttcgc cgaggttcgc gaaaggctga tggacggcgt 5640 caaccccgac acgctgctga cccacatgct ccagagccag cagaattgga gcaacgtctg 5700 cgaggcagcc aagcagataa caaccatcct gcagcgcgaa tgggacgact tccgcacgtc 5760 gttgtttgag cagggcgtac tagctgacaa cgcccagctc cgcaatgcag atgtccttcg 5820 tcaggaaagg ctgcggcgtt acaacgaaac ccggaatgca gcgagaagag ccgcaacgca 5880 gcagcgacag gcagaacgcc tgcccccacc accgccctca ccaagaactg agcggagacg 5940 ggaggtgaac cgtcttgcgg tggcgagact aagggaacgt cagcgagcag ctcgtgccga 6000 aatgcacggc gcataccaac cagctccatc taacgatagc gatgacgacg atgacgacgt 6060 tgaaaaccgt caagcaacca atgcagcttc aacgtcagaa gcagcgcgaa cggcagctga 6120 agagtcccgc gtagggctga cagaagccga ggccgcagca gcggttgagg ctgagttgtc 6180 ctcccgctag gatgggtgat agaacaagca ggaagccttg agagggagtc cttataaaac 
 6240 aaaatgggaa gaaatagaac gaggaaggga aaaaaataaa taaaaataaa tgagttaggt 6300 gcgctttgca cggatgtagg ccgctcgaaa gagcagaaaa acccccttta gacccttcgc 6360 ggggcaaaag tgtggcttag gtgagggttg ggtctagaca gtaaaatgaa atgaactgaa 6420 taaacaaccc gaatacttaa aaaaaaaa 6448 // ID GYPSY41-I_AG repbase; DNA; ANG; 6601 BP. XX AC . XX DT 16-APR-2004 (Rel. 9.03, Created) DT 16-APR-2004 (Rel. 9.03, Last updated, Version 1) XX DE GYPSY41-I_AG is an internal portion of retrotransposon GYPSY41_AG DE - a consensus sequence. XX KW Gypsy; LTR Retrotransposon; Transposable Element; 4-bp TSD gag; KW AP protease; GYPSY lineage; GYPSY41-I_AG; GYPSY41-LTR_AG; KW Gypsy clade; RNase-H; reverse transcriptase; KW integrase GYPSY41_AG. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-6601 RA Tubio M.J., Costas C.J. and Naveira F.H.; RT "GYPSY41_AG, a member of the Gypsy lineage of the Ty3/gypsy group RT of LTR retrotransposons in Anopheles gambiae."; RL Repbase Reports 4(3), 74-74 (2004). XX DR [1] (Consensus) XX CC GYPSY41_AG is a family of Gypsy-like LTR retrotransposons CC that, according to the amino acid sequence of its Reverse CC Transcriptase, CC RNase and Integrase is CC phylogenetically grouped with representatives of the GYPSY CC lineage of other organisms. CC GYPSY39_AG, GYPSY40_AG, GYPSY42_AG, GYPSY43_AG, GYPSY44_AG, CC GYPSY45_AG, CC GYPSY46_AG and GYPSY47_AG are other CC members of this same lineage in Anopheles gambiae. CC The GYPSY41-I_AG consensus was reconstructed after multiple CC alignment of 5 copies. CC The consensus encodes the 360-aa GYPSY41_AG1p gag-like CC polyprotein (pos. 501-1580), the 1180-aa GYPSY41_AG2p CC pol-like polyprotein (pos. 1538-5077) and the 519-aa GYPSY41_AG3p CC env-like polyprotein (pos. 5014-6570). CC The sequence of the LTRs flanking GYPSY41-I_AG is deposited as CC GYPSY41-LTR_AG. 
XX FH Key Location/Qualifiers FT CDS 5014..6570 FT /product="GYPSY41_AG3p" FT /translation="NSYNGKKYQEEQKQVKENQVPVIRSFRFLYLSQAWRS FT KMQYFPITFMLILLTQLSQSKELEIIDLNRQPIFFLKTRTCRLQTGSIKFI FT HPINMLTLENAINTITHFSYENINNELKEIVRLKVKLLYSNFQQLKPKHRT FT ARSLEILGTAWKWIGGSPDADDLRIINTTMNELTENNNKQYRINKQFDHRL FT RTLTDTINQLTKEREQVMLNELETIKTIMNIDIINHVLEEIQEAISWTKVS FT VVSNKILSSPEINSIKTILEDQGVKVELPDEALKLVQPIIAINSNSILYIL FT KIPQLADEEATMLEVFPLSIDNRIIVETPTHLIKTRNKVFKPAKPDEYIQN FT QYREYIDKCTSNLILGRKSDCSTARKNNTTIKLISDGLIIVDNAKGAALSS FT SCGPDDKLVSGNLLIRFNDCEVTIMNQTFSSKTISSTVEPYWGAVSITEVR FT WQHHKPMIRQDAFENIGTMQHSYLQQFNSAWNWSLLGGVLVSTIFTLSLAI FT FVFTFYKRSIRTIADVLPKIADA" FT CDS 501..1580 FT /product="GYPSY41_AG1p" FT /translation="MKATERFDSHRDLSTSDNLETEGDKMEEIATQLVEMM FT RAITSLQNQYTALSASTSSSNAGNNRAFDDYFRIPDPIKSLPTFEGNRKQL FT ASWLSTADNTLALFKDLVPAAVYQMYVTAVTNKICGKAKDILCLSGSPQNF FT DEIKEILISSLGDRQELSTYKCQMWQNKMTDGMSIHKYYHQTKEIVQNIKT FT LAKQNEQYRTNWVAINAFIDEDALAAFIAGLRGNYFGHAQAARPKDIEDAY FT AFLCKFKATEQNAGSLTKNVQTPSNKPPFKNKFNQNESTSYNKPTKAISEK FT KFSIKNSDKPEPMDVDASMRSKYAQNKKQFHNNEVETEQESNDSDSDDETD FT HFNEVNFRLAGSLKNNT" FT CDS 1538..5077 FT /product="GYPSY41_AG2p" FT /translation="RSKFSPSRKSEKQYLNSKKKYNYLPYLRTKQGLNLLI FT DSGANKNLIQPGVLKTKKEIKQIEITNIVGKQIIDTCGKTNLLYKEIPSQK FT YYELKFHNFFDGLIGSQFLAENEAILNYRKQTLEISKVIMPFEKYFPNEKN FT YNHVVTLPTNTDGEWIVYEPTKLCKKITVQPGVYSAKNKKTTILLQTNRPK FT PPNIQHKALEITVNNFETVTPLPMKPESKITSEMLSEIIRTSHLSTLEKDH FT LFRTIIKNQNVLLKAGEKLSATPDVKHKITTTNDAPVFTKSYRYPHAFKND FT VEEQINELLRNGIITHSTSPYSSPIWVVPKKVDASGKRKIRVVIDYRKLNE FT KTIDEKFPIPQIEEILDSLGKSVYFTTLDLKSGFHQIEMDSNDKGKTAFST FT AQGHFEFNRMPFGLKNAPAAFQRAMNSVLTGLIGNICFVYLDDIIIIGKNL FT ENHIENLNTVLERLSKFNLKIQLDKCEFLRKETEFLGHVITQEGIKPNPDK FT ITKILEWKLPSTQKEIKQFLGLSGYYRRFIKDYSKLTKPLSKCLKKDTKIN FT TQDEEYKTSFNSLKQIIASDQILAYPDFERPFILTTDASNYALGAVLSQIQ FT EGKERPIAFGSRTLNEAESRYSTTEKEALAIIWSVQKYKSYLYGHKFTLVT FT DHKPLTFIKTSTKNSKILRWRLELENFDFDIQYKEGKANVVADALSRKTEI FT LTNTNINQDSSISGTPKNNVSNTIDVNFEESSSISESLQHNNPSPNNTNSD FT 
SQTMHSADTSDDYFIHFSERPINYYRNQIIFRKSHITTDITETPFNNYKRA FT IICRNDFDELTILDSLKNFHNNKQTAIMAPDESTISLIQSVYRQYFNQHGH FT FVLTHLQVEDVSNEQRQDIIIAKEHERAHRGIHEVHNQLTRCYFFPHMMTK FT IKKLINLCKICNVHKYERKPYNIKITPRPIETTPFSRVHIDIFGIDKHNYL FT TFVCAFSKFLQTIEIPSRNLTDIRKALAHFITTFGAPRKIICDHETTFRSL FT QLQSFLANLGTELEFSSSSETNGQVERTHSTIIELFNTNKHKFRDLSSPEI FT IKVVTALYNETVHSSTGFTPNEIIFNRTSNRNPEQIIQTTRNIYEKVSQKL FT HNASRNMQKYNDEKETPPEIETGKQIFVKKGVRKKLDPRFNEKTCLNANDK FT TVTMARNIKRNKNKLRRIKSQ" XX SQ Sequence 6601 BP; 2545 A; 1420 C; 1067 G; 1569 T; 0 other; ggcgcagccg tccggagtga tttaagtgag tgtaattgtg tcaaaagtgg aaaagtaaga 60 gactggaaaa actctaacta aagtgatccc gcaagtgata aaagttatcc aacgcatccg 120 gagcaacaca ggaagacaac atcgatcgaa gctgtattcc gccgaggagt tgaggatccc 180 ccacgcagtt acccacctgg aaggaaaacc agcatcgtta aatcacgaac ccgcgacacg 240 acagcaagaa aaaaaaatcg taagtaattc ttttattttc ataaaaagtg caacagtgtt 300 tcagtggaaa tcagtgcttg aaaaaccagt tttactcgcg tccttaggaa atactaccag 360 ataggtttct taggtagaaa gtgacttact accagatagg agtagctttt tctcatcgtg 420 ggagtactac cagatagggc ccgtgataca aagaaaagaa agagatcaac ccagtgtcgt 480 acacaacgaa aaattcgtac atgaaagcaa ctgaaaggtt cgattctcac agagatcttt 540 caacctccga taatcttgaa acagaaggag acaaaatgga agagatcgca acacaattgg 600 tcgagatgat gcgagccatc acttctctcc agaaccaata cacagcgcta agcgcaagta 660 cctcatcaag taatgcaggt aataataggg catttgacga ttattttcgc attcctgacc 720 ctataaaatc attgccaact tttgagggca atcggaaaca actagcatca tggttatcaa 780 ccgccgataa cacactagct ttgtttaaag acctagtacc tgcagcagtg taccaaatgt 840 acgttactgc agtaacaaac aaaatttgtg ggaaagcaaa agacatccta tgtctttcag 900 gaagtccaca aaatttcgac gaaattaaag aaattttaat ttcttcgcta ggtgaccgac 960 aagaattgtc cacctataaa tgccaaatgt ggcaaaataa aatgaccgat gggatgagta 1020 ttcacaaata ttaccaccaa actaaagaaa ttgtgcaaaa tattaaaacc cttgcaaaac 1080 aaaacgaaca ataccgcaca aattgggttg caatcaatgc attcattgat gaagacgcac 1140 ttgctgcatt catcgctgga cttagaggaa attactttgg gcacgcccaa gccgctcgac 1200 ctaaagacat tgaggatgcg tatgcattcc tttgtaagtt caaagcgaca gaacaaaatg 
1260 caggcagcct taccaaaaat gttcaaaccc catccaataa accacctttt aagaataaat 1320 tcaaccaaaa tgaaagtacc agttataata aaccaactaa agccatttca gaaaagaaat 1380 tttcaatcaa aaactcagat aaacctgaac caatggatgt agatgcttca atgcgcagta 1440 aatacgcaca aaataaaaag caattccaca acaacgaggt tgaaacggaa caagagtcca 1500 atgacagtga cagtgatgat gaaactgatc attttaacga agtaaatttt cgcctagcag 1560 gaagtctgaa aaacaatact taaattccaa aaagaaatac aactatctcc catatttacg 1620 cactaaacaa ggtctcaatc tattgatcga ttccggcgca aataagaatc ttatccaacc 1680 aggtgtttta aaaacaaaaa aggaaatcaa acagatcgaa atcactaaca tagtgggaaa 1740 acaaattata gatacctgcg gaaaaacaaa ccttttatac aaggaaattc cctcacaaaa 1800 atattacgaa ttaaaattcc acaatttttt cgatggattg attggctcac aatttttagc 1860 agaaaatgaa gctattctca attaccgaaa acaaacactt gaaatttcaa aagtaattat 1920 gccatttgaa aaatatttcc ctaacgagaa aaactataac catgtagtaa ctctcccaac 1980 caatacggac ggagaatgga ttgtttacga accgactaaa ctttgtaaaa aaattacagt 2040 tcagccaggc gtatactcag ccaaaaataa aaaaactaca attttactac agacaaatag 2100 accaaaaccc cctaatatac aacataaagc gttagaaatt acagtaaata atttcgaaac 2160 tgttactcct ttacccatga aacccgaatc aaaaattaca agcgaaatgc tgagcgagat 2220 aattcgtaca tctcatcttt caactttgga aaaagaccat ctttttagaa caattattaa 2280 aaatcagaac gttctattaa aagcaggaga aaaactttca gctacaccag atgttaaaca 2340 caaaataact accactaatg atgcaccagt ttttacaaaa tcatatcgat atccacacgc 2400 atttaaaaat gacgtagagg aacagattaa tgaactacta cgaaatggta tcataaccca 2460 ttcgacaagc ccctattcct cgccaatatg ggtagtcccc aaaaaggttg atgcctcggg 2520 aaagagaaaa ataagagtag taatcgacta ccgcaagttg aatgaaaaga ccattgacga 2580 aaaattcccc atcccacaaa ttgaagaaat actagacagt ttgggtaaat cagtttattt 2640 tacaacactt gacctcaaat caggattcca ccaaatcgag atggattcta acgataaagg 2700 gaagacagcg ttttctacag cacaaggcca ctttgagttc aatcgaatgc cttttgggtt 2760 gaaaaacgcc ccagctgctt ttcaacgcgc tatgaacagt gtgctaacag gactgatagg 2820 aaatatttgt ttcgtgtacc tcgatgacat tataatcatc ggtaaaaact tggaaaacca 2880 catagaaaat ctaaacacag ttttagaaag gctttcaaaa tttaacctca aaattcaact 2940 
agataagtgc gaattcctta gaaaagaaac agaattttta ggacatgtta ttactcagga 3000 aggtataaaa ccgaatccag ataaaatcac caaaatccta gaatggaaac tgccatcaac 3060 acagaaagaa atcaaacaat tcttaggttt atcaggctac tatcgcagat tcatcaaaga 3120 ctattcaaaa ttaacaaaac ctctttcaaa atgcctaaaa aaggatacta agataaacac 3180 gcaagatgaa gagtataaaa catctttcaa tagtctaaaa caaattatcg cttcggatca 3240 gatattagca taccctgatt tcgaaagacc gttcattcta acaaccgatg caagcaacta 3300 cgctcttggc gcagttttat cgcaaatcca agaaggaaaa gagcgaccaa ttgcattcgg 3360 aagcagaaca ttaaacgaag ccgaatccag atactccact acagaaaaag aagccttagc 3420 gattatttgg tctgtccaaa agtataaatc ttatttgtat ggtcataaat tcacactcgt 3480 tactgaccat aaacctctta cattcatcaa aacatccact aaaaactcaa aaattcttcg 3540 ctggcgccta gaactcgaga atttcgattt cgacatccaa tacaaggaag gaaaggccaa 3600 cgtagtagca gacgcgctaa gcaggaaaac ggaaattctt acaaatacca atattaacca 3660 agattcttcc atttcaggca ctcctaagaa caatgtttcc aatacaattg acgtcaactt 3720 cgaagaatca tcatctattt cagaatccct tcagcataat aacccctcac cgaacaatac 3780 taactctgat tcacaaacca tgcattcagc tgatacatcc gacgattatt ttatccattt 3840 ctccgagaga ccaatcaatt actacagaaa tcagataatt ttccgaaaat cccatataac 3900 aactgacatt acggaaaccc cttttaacaa ctataaaaga gcaattattt gcagaaatga 3960 ttttgacgaa ttaacaatac tagattctct aaaaaatttc cacaataaca aacaaaccgc 4020 gattatggca ccggatgaat caactatttc actcattcaa tccgtttatc gacaatactt 4080 caaccaacat ggacattttg ttcttacaca cttacaagta gaagacgtta gtaacgaaca 4140 acgccaagac atcatcatag ctaaagaaca cgaacgcgcc catagaggta ttcacgaagt 4200 tcataatcaa ctaacaagat gctatttctt tccacacatg atgacaaaaa ttaaaaaact 4260 aattaatctt tgcaagatct gtaacgtaca caagtatgaa cgcaaaccgt ataacatcaa 4320 aatcacaccc cgacctattg aaacaacccc tttcagtcgc gttcacatcg acatctttgg 4380 aatcgacaaa cacaactact taacgttcgt atgtgctttt tcaaaatttt tacaaaccat 4440 agaaattccg tccaggaact tgacagacat aagaaaagct cttgctcatt ttattacaac 4500 gtttggcgca ccgagaaaaa ttatttgcga tcacgaaacc acatttagaa gtcttcaact 4560 tcaatcattt ttagccaatt taggaacaga attagaattt tcttcatcct ccgaaaccaa 4620 tggacaggta 
gagcgaacac acagtacaat tatagaattg ttcaacacca ataaacacaa 4680 gttcagagat ttaagctctc cggagatcat aaaagtagtg acagcactat acaacgaaac 4740 agttcactcc tcaacaggat tcacgcccaa cgaaatcatt tttaacagaa ctagcaatcg 4800 gaatccagaa caaataattc aaaccactag aaacatttac gaaaaagtat cccaaaagct 4860 ccacaatgca agtagaaaca tgcagaaata taacgatgaa aaagaaacac caccggaaat 4920 agagacaggt aaacagatat ttgtaaagaa aggcgttagg aaaaaattag acccgcgatt 4980 caacgaaaaa acttgtctta atgcgaatga taaaacagtt acaatggcaa gaaatatcaa 5040 gaggaacaaa aacaagttaa ggagaatcaa gtcccagtaa tacgttcttt ccgttttctt 5100 tatctttccc aggcttggag atcgaaaatg caatactttc cgataacatt catgctaatc 5160 ctgcttacac aactctcgca aagtaaagaa ttagaaatca tagatttgaa caggcaaccg 5220 atattttttc ttaaaaccag aacttgtaga ttacaaacag gaagcataaa attcatacat 5280 cctataaaca tgttaaccct tgaaaatgct attaacacca ttacacactt ttcgtacgaa 5340 aacattaata acgaacttaa agaaattgtc agactaaaag taaaattatt gtactcaaac 5400 ttccaacaat taaaacctaa acatagaaca gctagaagtc tagaaatctt aggaacagct 5460 tggaaatgga taggaggaag tcctgacgcc gacgacctca gaatcatcaa caccacaatg 5520 aacgagctaa ccgagaacaa caacaaacag taccgaatca acaaacagtt cgatcacagg 5580 ctacgaaccc tcactgacac cataaatcaa ttgacaaaag aacgagaaca agtaatgctg 5640 aatgaactag aaaccattaa aaccataatg aacatcgata tcatcaacca tgttttagaa 5700 gaaattcaag aagcaatcag ttggactaaa gtatcagtag ttagcaacaa gattctatca 5760 tcaccagaga taaactccat caaaaccata ctagaagacc aaggagtaaa agtcgaatta 5820 ccagatgaag cgctaaagct agtacaacca attatcgcga tcaactcgaa ttcgatactc 5880 tacatactga agatccctca gctagctgat gaagaagcaa caatgcttga agtatttcct 5940 ctcagtatcg ataacagaat catagtagag actccgacgc accttataaa aacacggaac 6000 aaagtgttca aaccagccaa acctgatgaa tacatccaaa accaatacag ggagtacatc 6060 gataaatgca catccaatct catcctaggg agaaaaagcg attgttctac tgcaagaaag 6120 aacaacacga cgataaagct aatatccgat ggacttatca ttgtcgacaa cgcaaaagga 6180 gcagccctga gttcaagctg tggacccgat gataaactcg tctccggaaa tcttctgata 6240 cgcttcaacg attgcgaggt aacgataatg aatcaaacat tttcttctaa gactatctcc 6300 agcacagtag agccatattg 
gggagcagta tccatcaccg aagtcagatg gcaacaccat 6360 aaaccgatga taaggcaaga cgcattcgag aacattggaa cgatgcagca ttcttatctc 6420 cagcagttca acagcgcatg gaattggagc ctacttggag gagtattggt ctcaacaatc 6480 ttcacgttat ccctggcgat cttcgttttc acgttctaca aacgatcgat acggaccatc 6540 gcagacgttc taccgaaaat agcggacgca tgaggacacg ctcttcttca cccccccgag 6600 g 6601 // ID piggyBacN1_AG repbase; DNA; ANG; 1661 BP. XX AC . XX DT 03-APR-2003 (Rel. 8.03, Created) DT 03-APR-2003 (Rel. 8.03, Last updated, Version 1) XX DE piggyBacN1_AG is a nonautonomous DNA transposon - a consensus DE sequence. XX KW piggyBac; DNA transposon; Transposable Element; Nonautonomous; KW nonautonomous DNA transposon; piggyBac superfamily; KW piggyBacN1_AG. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-1661 RA Kapitonov V.V. and Jurka J.; RT "piggyBacN1_AG: a family of nonautonomous piggyBac-like DNA RT transposons from African malaria mosquito."; RL Repbase Reports 3(3), 66-66 (2003). XX DR [1] (Consensus) XX CC There are ~20 copies of piggyBacN9_AG in the genome; CC they are ~94% identical to the consensus sequence. CC piggyBacN1_AG copies are flanked by the TTAA target site CC duplications. CC The consensus sequence has 17-bp terminal inverted repeats CC (2 mismatches). The internal portion forms an imperfect palindrome CC (pos. 200-1626). Its main part (pos. 625-1251) was formed from CC a MARINERN12_AG-like element. CC Classification: a nonautonomous piggyBac-like DNA transposon. CC The genome harbors several subfamilies of piggyBacN9_AG. CC This family was derived from the piggyBac1_AG-like autonomous CC transposon. 
XX SQ Sequence 1661 BP; 505 A; 319 C; 331 G; 506 T; 0 other; ccgtcatgtg tacaaccgcc tacccgggta agtttcgcat ctttattcca aaataacttt 60 ggattaaaaa atcgtatcga cttcaaattt taaaaatgcc tcagaaaaga tgttttctac 120 ccatgtgcaa aaaataaaaa atatcaaaaa tgtttaaata ttttttttta aattatttaa 180 gtttatatac ccttcacatg tacaaccgcc tacccgggta ggccatacat tttgtatgga 240 agaattatgt ttttcgtggt taaaagcgta tatcttttga attacttgtt agatttgaat 300 gaaaattgtc ttataggatc acaataatgt tttataacat atcgttgtaa aattagaatt 360 ttaaaactcc tatcaatatt tttttcaatc aaaaatgtgc tgtaactcca acaccccaac 420 cggccttgcc gcatgttttg agaggatgct tatgtctttt tcttttcttt caaatgttgt 480 attacagttg tgtacagtta aattttattg cttgttgtgt aacaatatat tgaaattaac 540 actgaaatca cctcttttaa aatgatacca accactacaa gcaaacgaaa tagaattggc 600 tgaaagttaa cacacacatt catacttttg tcatagtgtt actgccgaat agggatgggt 660 gttttggagc gcaccaatgg ctccgaagtc ggctctgcac ccacggctcc agagccggat 720 ccacaccaac agctccggag ccggatccgc accaacagct ccggagccgg ctccgcacca 780 acagccccgg agccggcgac gaatcaatgg ctccggagcc ggcggtgaat caacggctcc 840 ggaccaacgg ctctggacta acgtttcaat agagacggca ttgtatgata gcctaaaaaa 900 cgaatctatc ttctcgcaag tgtagaagct aacacccgat gtgattgtgt gattgttgaa 960 caatgttcaa aacttatcga ttttttttat agattccttt acaacattat ttactcagct 1020 atttatggat cttcccgatt gaaactttat tgaaaaggat ttttttgtgc aatctgttaa 1080 aaccggctcc ggagccgtga gtgcggagtc gttggtgcgg agccggttcc ggagccgctg 1140 gagcggagtc agctttggag ccgtgggtgc ggagccgtgg gtgcggagcc ggctccgttg 1200 ggttggagtg gaggaggttc cgaagttgtc tggagccggt cggagccggt ttttaataaa 1260 acatgatgag catattttat agatcgtcaa catatatttt tatacaacaa gcaaaacaat 1320 tgaaaaatgt atataacttt tgaaagaaac gtaagtaaaa cttagaaaaa gactaatcat 1380 ccacttaaaa catgcggcct ggccagttat ggtggtggag ttagaacata tttttgattg 1440 aaaaaaatct tgataggatt tttaaaattc taattttact acgatatgtt ataaaacatt 1500 attatgaacc tttaagacag ttttcattca aatcttacaa gtaattcaaa agatatacgc 1560 ttttaaccac gaaaaacatc attgttccat acaaaatgta tggcctaccc gggtaggcgg 1620 ttgtacgatt acgtaaagtt ttttggtcgt 
acacaagacg g 1661 // ID GYPSY61-LTR_AG repbase; DNA; ANG; 242 BP. XX AC . XX DT 09-JUL-2004 (Rel. 9.06, Created) DT 09-JUL-2004 (Rel. 9.06, Last updated, Version 1) XX DE GYPSY61-LTR_AG is an LTR of retrotransposon GYPSY61_AG - a DE consensus sequence. XX KW Gypsy; LTR Retrotransposon; Transposable Element; KW 5-bp TSD GYPSY61_AG; GYPSY61-I_AG; GYPSY61-LTR_AG; Gypsy clade; KW mag lineage. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-242 RA Tubio M.J., Costas C.J. and Naveira F.H.; RT "GYPSY61_AG, a member of the mag lineage of the Ty3/gypsy group RT of LTR retrotransposons in Anopheles gambiae."; RL Repbase Reports 4(6), 162-162 (2004). XX DR [1] (Consensus) XX CC GYPSY61-LTR is a long terminal repeat of GYPSY61_AG (its CC internal portion is deposited as GYPSY61-I_AG). XX SQ Sequence 242 BP; 65 A; 47 C; 54 G; 76 T; 0 other; tgttgggagc ctgtgtttac cacgaaatat gtgcctctca taggtgacag ttgctggtca 60 ccttgatgac gtgtgcaccg ggatgctgga gagcgaacgg gagaaaggct ctctcctgta 120 ccgaccgtcg acgagacagt atcgcttcct taccgtttta atatatcgtg ttgtttattt 180 atattatact aatattttaa tgtgttactt ttagaaatat acgttaacca aagaacacaa 240 aa 242 // ID MARINERN13_AG repbase; DNA; ANG; 467 BP. XX AC . XX DT 05-MAR-2004 (Rel. 9.02, Created) DT 05-MAR-2004 (Rel. 9.02, Last updated, Version 1) XX DE MARINERN13_AG is a nonautonomous DNA transposon - a consensus DE sequence. XX KW Mariner/Tc1; DNA transposon; Transposable Element; Nonautonomous; KW MARINERN13_AG; nonautonomous DNA transposon; KW mariner/Tc1 superfamily. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-467 RA Kapitonov V.V. 
and Jurka J.; RT "MARINERN13_AG: a family of nonautonomous mariner/Tc1-like DNA RT transposons from African malaria mosquito."; RL Repbase Reports 4(2), 43-43 (2004). XX DR [1] (Consensus) XX CC There are ~100 copies of MARINERN11_AG in the genome. CC They are ~95% identical to the consensus sequence. CC MARINERN13_AG copies are flanked by 2-bp target site CC duplications. CC This element has imperfect 16-bp terminal inverted repeats. CC Putative classification: a nonautonomous Mariner/Tc1-like CC DNA transposon. XX SQ Sequence 467 BP; 154 A; 97 C; 75 G; 140 T; 1 other; cccaattgac attttcgagc acgaataatc gggaattcga gcaaattccc ttaaactgga 60 gccttaaact tcccttaaac tgcaccagtt taagggaatt tagaaaaagt cctttaaaat 120 caactgtgat gcggcttttt cgttactttc ttctagattc gctcggaaat gatgtttcct 180 atcgttatag ttggtcatgt acccatttta tacttaaata atatccattt tttataaaat 240 aacgtcacac aaaattagct tttgtttttc atcaaaaaaa raccatagct atgaatgaaa 300 aatgagctcg acggcatcgt ccatcgatag aggaacgtct aatttacgaa cacaccaaat 360 cttggcgctt tcaacacact tgatcacaca caaatccaca atttcaggcg tatcgagtac 420 taatcgagca gatatgggaa aacaatttcg agcaaatgtc acttggg 467 // ID BEL4-I_AG repbase; DNA; ANG; 5410 BP. XX AC . XX DT 08-MAY-2003 (Rel. 8.04, Created) DT 08-MAY-2003 (Rel. 8.04, Last updated, Version 1) XX DE BEL4-I_AG is an internal portion of the BEL4_AG LTR DE retrotransposon - a consensus sequence. XX KW BEL; LTR Retrotransposon; Transposable Element; 5-bp TSD; KW BEL4-I_AG; BEL4-LTR_AG; BEL4_AG; Bel clade; integrase; peptidase; KW reverse transcriptase. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-5410 RA Kapitonov V.V., Pavlicek A., Drazkiewicz A. and Jurka J.; RT "BEL4_AG, a family of Bel/Pao-like LTR retrotransposons from RT African malaria mosquito."; RL Repbase Reports 3(4), 68-68 (2003). 
XX DR [1] (Consensus) XX CC BEL4_AG is a family of Bel/Pao-like LTR retrotransposons. CC BEL4-I_AG, an internal portion of BEL4_AG, is flanked by CC BEL4-LTR_AG CC LTRs. The BEL4-I_AG consensus sequence was reconstructed based on CC multiple alignment of 10 copies; they are less than 1% divergent CC from CC the consensus sequence. CC The consensus sequence encodes one 1761-aa BEL4_AGp Bel-like CC protein CC (pos. 95-5377). CC BEL4_AGp is composed of the peptidase (pos. 158-300), CC reverse transcriptase (pos. 800-930), and integrase (pos. CC 1480-1625) CC domains. XX FH Key Location/Qualifiers FT CDS 95..5377 FT /product="BEL4_AGp" FT /translation="MPVDHSPVKEVVPAANNMEESQSSTVREFGSVPDVTK FT LDRQRTHLEWQLDRLEQMFVKHSTDVHSLKTIADRVKVLALDYKQWYDSIL FT DVVSDDQAQEAIERYGVFDDKVFELSKNIEYQLASQIPLPTTPNPFFGSEQ FT KPTTVPVRARLPEIVLPHFDGNIRDWPAFRDAFQSLIHSSEQLTECDKLHY FT LAASLTKDARAVIDALEITSKNYDVAWKLLSERYENKYLIVKTTVEALFNI FT SPLKRECADSLSRLVDDFERNLRMLEKEGEKPDAWSTLLVFRLSSLLDPTT FT LRHWELHRKSTTIPTYKDLVQFVRNHCHVLKSFSKPASGARTGDTMRSNPR FT VQTIHAATSAVHSVAYNEKCKLCGVSKHSVFRCELMNNMSVADRKQLVQSK FT GLCFNCLSPAHRLRQCTSSGCKICQQRHHTLLHEAAPATEASDAPSTSSTC FT PPQSLTHCSIQIENSVVLLQTVLVQVEDNHGRCHLARALLDSGAQLNIVTE FT RLTQRLGVAKRRENHRIGGIGEVSVTSQHSAVLKIHSLDSEYTASGKFHVL FT SKLTRELPSSRINTSSWQIPRQVQLADPSFHSPGPIDLIIGAELYYDVVKE FT GLIKLSHERVTLQNTAFGWVIAGRVNVHAPPPPSSIVGHVCSTSIEEQLSK FT FWELESCRATSTLSVEESNCEKQFATTTTRDTDGRFIVQLPKREEKLALLG FT DSKGIATRRFLALERRLSSNASLKTAYTQFIEEYAELQHMTEVAESDATTS FT SPSYYLPHHCIVRPDSTTTKLRVVFDASCASDTGTSLNDALMIGPTIQDDL FT MSILLRFRMSKFALVADIEKMYRQINIAAIDRPLQRILWRNSPTEPIRTFQ FT LNTVTYGTSCAPYLATKTLQVLSQVGASTHPEAATILGRDFYMDDMLTGVN FT SIPEGQRVCQQLIDLLASGGFCLRKWATNNRQIFEHLPQHLQDERTILNLD FT AKSPIIKTLGLKWNVSTDAFVFNIPRWNADNIITKRNALSDVAKLFDPIGL FT VGPVIIQAKLFLQELWRCQIAWDEPLTPALQNRWLLFREKLAMLQTIHIPR FT WLLTDQRATNLQMHCFCDASEKAYGAAIYLRSTNTDGRVTTNLITAKSKVA FT PLADSRKQKRVCLPRLELSAALLLAHSYEKVSDALKLQVETIFWSDSTIVL FT HWLSATPSRWKTFIANRVSEIQHITHGKEWRHVPGTDNPADIISRGMDADQ FT
LETSTLWWHGPDWLAQPSEEWPNTHQPRQEEFTTDELEERPICMAVQSVAP FT NELFSLRSTFTGLQRLVAWLRRFRHNTNPANHQQRRLDHHLSLEELAESTL FT CLVRLAQAESFPEDIKHLSKGDSVGNNSPLKLLAPFLQDGLLRVGGRLRHA FT PIPFDRKHPYILPANHPLTNQIATLYHRTYQHANPQLLIASMRERFWPLRA FT KNLARRIVHSCYKCYRCRPTPAQQLMGDLPAERVTPTSTFLHTGVDLCGPI FT HYRHTSRKAQLIKGYVAIFVCMAVKAVHIELVADLSTNAFLAALRRFIGRR FT GKPAIIECDNARNFLGASREIASLSKQFNHQWQTSVIKSCIDDGIQFKFIP FT PRSPNFGGLWEAAVKSFKTHFKPTVGNAILTSDELNTLLIQIEGCLNSRPL FT TPLSNDPSDLEVLTPGHFLIHRPIVSLAEPSLEKLPFNRLDRWQKVQEFVR FT RLWKRWSTDYLSGLQQRTKWTKQKDNVKLDTMVLLKEDGLPPSKWCLGRVT FT QIIKGADDNIRVVIVKTKDGDFKRSISKICVLPTDEPSSSS" XX SQ Sequence 5410 BP; 1466 A; 1514 C; 1272 G; 1158 T; 0 other; tttggtcctt cgaaccggat tagtgtgcga gtgatccaca gttcgcgttt agggaagttt 60 gtgaaaagta aagtttgtgt tagaaagtga aactatgccc gttgaccatt ctcccgtcaa 120 agaagtggtt ccagcggcca acaacatgga ggaaagccaa agttccaccg tcagagaatt 180 tggctccgtt ccagacgtca ccaaattgga ccgtcaacgc actcacctgg agtggcagct 240 tgaccgtcta gaacagatgt tcgtcaaaca cagcaccgac gttcactcgc tgaaaaccat 300 agcagaccga gtgaaggtgc ttgctttgga ctacaaacaa tggtacgact cgatcttgga 360 cgttgtgagc gacgaccagg ctcaggaggc aatcgagcga tatggcgtgt tcgacgacaa 420 agtattcgag ttaagcaaaa acatcgaata ccagttggca tcgcagatcc cattgcctac 480 aacaccaaac ccatttttcg gcagtgagca gaaacccacg acagtcccgg tgcgtgcaag 540 acttcccgaa atcgtgctac cacatttcga cggaaatatt cgcgactggc ctgccttccg 600 tgacgcgttt caatcactga tacattcgtc agaacagctg accgaatgtg acaagctgca 660 ttaccttgct gcgtcgctga cgaaggacgc acgcgcggtg atcgacgcat tggaaattac 720 atccaaaaac tacgacgtag cctggaagct gttgtccgag cgctacgaga acaagtacct 780 catcgtgaaa accaccgttg aggcgttatt caacatatcg ccactgaaga gagaatgtgc 840 agattctcta agccgactcg tcgacgactt cgagcgtaac ctacgaatgc tcgaaaaaga 900 aggcgagaaa ccagatgcat ggagcacgct gctggtgttc aggctaagct cgctgctgga 960 tccaacaacc ctgcgtcatt gggagcttca ccggaagtct acgaccatcc caacgtacaa 1020 agacttagtg caattcgtcc gcaaccattg ccatgtactg aagtccttct ccaagccagc 1080 cagcggagcc agaaccggag acaccatgcg cagcaaccct cgagtgcaaa ccatccatgc 1140 
tgcgaccagt gctgtgcata gtgttgcgta caacgaaaag tgcaaattgt gcggtgtttc 1200 gaaacattca gtgttccggt gtgaattaat gaacaatatg agtgttgcag atagaaagca 1260 actagtgcag tcgaaaggat tgtgcttcaa ctgcctttct ccagctcatc gtttacgaca 1320 gtgcacatca agcggatgca agatctgcca acaacggcat catacgttgc tgcacgaagc 1380 agcgccagca actgaggcgt cggatgctcc atccacgagt tctacctgtc ctcctcagtc 1440 ccttacccac tgttccatcc agatcgaaaa cagtgttgtg ttgctgcaaa ccgtgctggt 1500 ccaggtagag gacaatcacg gacgatgtca tctagcacgt gccttgctgg attccggagc 1560 acaactcaac atcgttaccg agcgccttac tcaacggctg ggtgtggcga aacggcgaga 1620 gaatcatcgc atcggaggaa tcggagaagt ttccgtgacg tcacaacatt cggctgtgtt 1680 aaagatccat tctttagaca gcgaatacac agcgtccgga aagttccacg tactcagcaa 1740 gctcactcgt gagctcccat caagccgtat caacacaagc agctggcaga tacctcgtca 1800 agttcagctg gccgatcctt cgtttcacag tcctggcccg atcgacctca tcatcggagc 1860 agagctgtac tacgacgtcg ttaaggaagg actcatcaag ctatcccacg aaagagtgac 1920 actccaaaac actgcttttg gttgggtgat agcaggcaga gttaatgttc atgcaccacc 1980 accaccttca tccatcgttg gacacgtctg cagtacgagc atcgaagagc agctgagcaa 2040 attctgggag ctggagtcat gccgagctac cagcacatta tccgtcgagg agagcaactg 2100 tgagaaacag ttcgcaacca ccaccaccag agacaccgat ggtcgattca tcgtgcagct 2160 accgaaacga gaggagaagc tggctcttct tggggattcg aaagggatag ccacacgccg 2220 gttccttgcc ttggaacgtc ggctatcctc gaatgcctca ttgaagacag cttacacaca 2280 attcatcgag gagtacgcag aacttcagca catgacagaa gtagccgaga gtgatgctac 2340 aacttcatcc ccctcatact atcttccaca tcactgcatc gtgcgaccag atagtaccac 2400 cactaaactc cgagtggtgt tcgacgcttc gtgtgcatca gacaccggaa catctttaaa 2460 cgatgcactg atgatcggac ccactatcca agacgacctg atgtccattc tattgcggtt 2520 tagaatgtcg aaatttgccc tggtagcaga catcgagaag atgtaccggc aaatcaacat 2580 agctgccatc gatcgtccgc tgcaacgaat tctgtggcga aattctccaa ccgagccgat 2640 acgaacgttc cagctcaaca ccgtcaccta cggcacatca tgtgcaccgt atttggccac 2700 caaaaccctg caagtgctat ctcaggtcgg agctagcacc catcccgaag cggcaaccat 2760 cctcggacga gacttctaca tggacgacat gctgactggc gtaaacagca ttcccgaggg 2820 tcaacgagta 
tgtcagcaac tcatcgatct cctggcttct ggtggattct gcttgcggaa 2880 atgggccacc aacaacaggc agatcttcga acatctgccc cagcatctgc aagatgaaag 2940 gacgattctc aacttggatg cgaagtcacc gatcatcaag acactcggac tgaagtggaa 3000 cgtttccacc gacgcttttg tattcaacat cccgcgctgg aacgcagaca acatcatcac 3060 caaacgaaac gcgctttcgg atgtcgcgaa actgttcgac ccaatcggac tggttggacc 3120 agttatcatc caagccaagc tgttccttca agagctgtgg agatgtcaga tcgcctggga 3180 cgagccgtta acaccagcac tacaaaaccg atggcttttg tttcgcgaga agttggccat 3240 gctgcaaacg atccacattc cacgctggct cttaaccgat caacgagcaa cgaatctaca 3300 aatgcattgc ttttgtgatg catccgagaa ggcgtacgga gcggcgattt acttgcgttc 3360 gaccaacacc gatggacgtg tgacaaccaa cctcatcact gcaaaatcca aggtggcacc 3420 actagcagac tcccgtaaac aaaagcgtgt ttgcttaccc cggctggagc tttctgcagc 3480 actgctactg gcacactcgt acgagaaggt gtcagacgcc ttgaagcttc aggtcgagac 3540 catcttttgg tctgactcaa ccatcgtcct gcattggttg tctgcaactc cgtcacgctg 3600 gaaaaccttc atcgctaacc gggtgtccga gatccaacac atcactcacg gcaaggagtg 3660 gagacacgta ccaggaacgg acaatcctgc ggacatcatc tcccgaggaa tggatgcaga 3720 tcagctggaa acttcaaccc tttggtggca cggaccagac tggctggcgc aaccatcaga 3780 ggaatggccg aacactcatc aacctcgaca ggaagaattc accacggacg aattggagga 3840 gcgaccaatc tgcatggctg tacaatccgt ggctccgaac gaacttttta gcctccgttc 3900 aacgttcacc ggactgcaac gtctggttgc gtggctaaga agattccgac acaatacgaa 3960 tcctgctaat catcaacaac gcagattgga tcatcatctc agcttggagg aactagccga 4020 atctacactg tgtctagttc gcctagctca agctgaatca ttcccagaag acatcaaaca 4080 tctatcgaaa ggcgattcgg tcggcaacaa ctcaccttta aaactactag caccgtttct 4140 acaagatggc ctgctacgag tgggcggaag attgcgacat gcaccaatcc cgttcgaccg 4200 aaagcatccg tacatcttac ccgccaacca tccgctgacg aatcagattg caactctgta 4260 tcatcggacc tatcaacatg caaatccaca actgctaata gcgagcatgc gagaacgatt 4320 ctggccactg cgagcaaaaa acctggccag aagaatcgtt cactcctgct acaaatgcta 4380 ccgttgccgc cccacacctg cacaacaact catgggcgac cttccagcag agagagttac 4440 accaacgtca actttcttac acaccggagt cgacttgtgt ggaccgatac actatcgaca 4500 cacatctcgg aaggcgcaac 
tcatcaaggg ctacgtagcg attttcgtct gtatggcagt 4560 gaaggctgta cacatcgaat tggtcgcaga cctgtctacc aacgccttcc ttgcggcact 4620 tcgacgattc atcggacgac gcgggaaacc ggctatcatc gaatgcgaca acgctaggaa 4680 tttcttgggc gcctcccgag aaattgcctc cttgtccaag caattcaacc accagtggca 4740 aacatcagtg attaagtcct gcatcgacga tggcatccag ttcaagttca tcccacctcg 4800 ctcacccaac tttggaggtc tttgggaggc ggcagtgaag tctttcaaaa cacacttcaa 4860 gccaaccgtt gggaacgcca tcctcacgag cgacgaactc aacacgctac tgatccagat 4920 tgaaggatgc ctcaactcta gaccacttac accactctcc aatgatccat ctgatctgga 4980 agtgctgact ccaggtcact tcctcattca tcgccccatc gtatccctgg ccgaaccatc 5040 gctggaaaag ctgccgttca accgtctcga tcgctggcaa aaggtgcagg agtttgtccg 5100 tcggctatgg aaacgttggt caacagacta cttgtccgga ctacagcaga gaaccaagtg 5160 gaccaagcag aaggacaacg tgaagctgga taccatggtg ctgctgaagg aggacggtct 5220 acctccatcg aaatggtgtc ttggccgcgt cacgcagatc atcaagggag ctgacgacaa 5280 catccgagtg gtcatcgtca agacgaaaga tggagacttc aagcgttcca tctctaaaat 5340 ctgcgttctt cccaccgacg agccatccag ttcatcctag ttgaattaga taattcaacg 5400 cgggggagta 5410 // ID BEL4-LTR_AG repbase; DNA; ANG; 316 BP. XX AC . XX DT 02-MAY-2003 (Rel. 8.04, Created) DT 02-MAY-2003 (Rel. 8.04, Last updated, Version 1) XX DE BEL4-LTR_AG is a long terminal repeat of the BEL4_AG LTR DE retrotransposon - a consensus sequence. XX KW BEL; LTR Retrotransposon; Transposable Element; 5-bp TSD; KW BEL superfamily; BEL4; BEL4-I_AG; BEL4-LTR_AG; KW Long terminal repeat; RETRO25_AG_LTR; retrotransposon. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-316 RA Jurka J. and Drazkiewicz A.; RT "RETRO25_AG_LTR: LTR retrotransposon from Anopheles gambiae."; RL Repbase Reports 2(12), 8-8 (2002). XX RN [2] RA Kapitonov V.V., Pavlicek A. 
and Jurka J.; RT "BEL4_AG, a family of Bel/Pao-like LTR retrotransposons from RT African malaria mosquito."; RL Direct Submission to Repbase Update (13-APR-2003). XX DR [1] (Consensus) XX CC BEL4-LTR_AG flanks an internal portion of BEL4_AG (deposited as CC BEL4-I_AG). XX SQ Sequence 316 BP; 89 A; 73 C; 60 G; 94 T; 0 other; tgttacggca agcgtaatgg taacgcgtca cccttagtac gcctagcgta aggtagaggg 60 cgctttcaag aaatttacct acgtgtaaat tttacttaat cataattcta ttacgcagtg 120 caaattgctg tagatcggta tcgtgcagct cgaattcccg caagtgcagc tcgaattacc 180 gcaagtgcag ctcgaattcc cgaattctgc cgtatataag cgagtgtttt tcctgaataa 240 atttagtcaa gttccagagt tcaaagcaac aacatcgtgt cttcatcatc ttcaaacttc 300 ctcttcatca ttaaca 316 // ID GYPSY13-I_AG repbase; DNA; ANG; 5967 BP. XX AC . XX DT 03-OCT-2003 (Rel. 8.09, Created) DT 03-OCT-2003 (Rel. 8.09, Last updated, Version 1) XX DE GYPSY13-I_AG is an internal portion of retrotransposon GYPSY13_AG DE - a consensus sequence. XX KW Gypsy; LTR Retrotransposon; Transposable Element; 4-bp TSD gag; KW AP protease; GYPSY13-I_AG; GYPSY13-LTR_AG; Gypsy clade; RNase-H; KW integrase GYPSY13_AG; mdg1 lineage; reverse transcriptase. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-5967 RA Tubio M.J., Costas C.J. and Naveira F.H.; RT "GYPSY13_AG, a member of the mdg1 lineage of the Ty3/gypsy group RT of LTR retrotransposons."; RL Repbase Reports 3(9), 168-168 (2003). XX DR [1] (Consensus) XX CC GYPSY13_AG is a family of Gypsy-like LTR retrotransposons CC that, according to the amino acid sequence of its ORF2, is CC phylogenetically grouped with Drosophila representatives CC of the mdg1 lineage. CC GYPSY8_AG, GYPSY9_AG, GYPSY10_AG, GYPSY11_AG, GYPSY12_AG, CC GYPSY14_AG, CC GYPSY15_AG, GYPSY16_AG, and GYPSY17_AG are other CC members of this same lineage in Anopheles gambiae. 
CC The GYPSY13-I_AG consensus was reconstructed after multiple CC alignment of 11 copies. CC The consensus encodes the 405-aa GYPSY13_AG1p gag-like protein CC (pos. 1072-2286) and the 1246-aa GYPSY13_AG2p (pos. 2193-5930). CC The sequence of the LTRs flanking GYPSY13-I_AG is deposited as CC GYPSY13-LTR_AG. XX FH Key Location/Qualifiers FT CDS 1072..2286 FT /product="GYPSY13_AG1p" FT /translation="IKQKKYNNNISITSHSMLLKINQNIIRLKNIEKKLSC FT DRKFRRCLLDHYSGSAKFCFDKIKYYLDSIQETEPVSELKRILKEASDIYA FT NIQSRIQFHQTNKIQVKFKTLAKLAIVFNRWNQNKMETFDIKTATALVQIY FT DGNTDGLENFVDSSNLLKELCPKNPQMLVKFLKTRLIGKARLGLPPNIDDF FT DSLINDIRSRCQEKLNPDKIISQLKSIRQTDTKALCDEVELLSTKLKNVYL FT QLHIPEKVANDMSLKHGINALIEKVHNQETKIVLKAGQFASVSDASEKVLE FT NERNSNGSQILAFNRNYYDNRKFVRNKYPPNTYNRFQPNTNNRWFSNRSPL FT GNNQATSNRYQNQRYPNNSFRKQFPNTNTRRVYNTQAAEEEHFLGVPQALE FT EIIQPTNDGQH" FT CDS 2193..5930 FT /product="GYPSY13_AG2p" FT /translation="SLQYTSCRRGTFFRGTTGVRRNHSTYKRWSTLAINLN FT ANNFIKIKVEMAMGEISILIIDTGADVSLFKVDKIKPTQQVYAQNRINLTG FT ITTESVSTLATTSTNITFGNASINHTFHIIPAEIDIKADGILGRDFFTKHR FT CVIDYEHWLLNFNCNGVTISHPIEDSINDGFILPLRSEVIRKINLPDLVED FT SIVLASEILPGVFCGNSVVSKQNQYVKFVNTTESNVYIEKDSFKPQIDPLK FT NYIQKNLKFKSSKTNESKIKTLLSKIDSQNVPKYIIPHLEKILSEYHDIFC FT TDNDQIPTNNFYEQSIQLKDNVPCYIPNYKQIYSQSDEIKNQVDKMLKNDI FT KEHSVSPYNSPILLVPKKSGDNEKKWRLVVDYRQLNKKIMPDKFPLPRIDD FT ILDQLGRAKYFTTLDLMSGFHQIPLHKNSRKYTAFSTSSGHYQFTRLPFGL FT NISPNSFQRMITIAMAGLTSESAFVYIDDIIITGCTMQHHLENLIKVFNRL FT RNYNLKLNPEKCIFFQNGVTYLGHNITYKGIYPDESKFETIKKYPVPINVD FT EVRRFVAFCNYYRKFVKNFAELVRPLNNLLKKGATFTWSNECQHAFDTLRR FT SLVSPTILQYPNFEKEFIITTDASDVACGAILSQITDGNDLPIAYASKSFT FT KGEKSKPTIEKELTAIHWAVNYFKPYIFGKKFNVRTDHRPLVYLFNMKNPT FT SKLTRMRLDLEEFDFEVEFVQGKTNTGADALSRIVIDSDKLKHMQKNNANE FT TNDKSLLAVKTRAMTRLNNFKLPPELNEGEKDSITKPTLWQTENPTEVNKL FT LKILSKTEYGKIKITVFNNNYNKELGNTTIHCNEDGSQALESALLKVQQIA FT KEYNRDKITISLEDSLFRQYSFHTIKEIADNAISGLEIIAFIPPKWISKKE FT EIENILENYHMSPSGGHVGQYRLYLKIREKYKWKNMKEEIKNYVRGCKVCK FT VNKIVRHTKEKSAVTTTPLKPFEVVCIDTVGPLPKTNKNHRYIVTMQCELS FT 
KYIVLIPTENKEANTIARAMVENCILKYGRFLEMKSDQGLEYNNEVLKKVA FT ELLGIKQTFATAYHPETIGSLERNHRTLNEYLRAFTNEHGDNWDDWLKFYE FT FNYNTTPHTDTNYSPYELIFGKKAILPQDLCHGNIEPVYNYDAYFNEIKFK FT LQKSNEIARKNLIEQKERRQIRANSNVNPLIIKIGDAVYLKNENRKKLDPI FT YLGPYIVVKLDNVNCTIKNTNTNKLSTVHKNRLIKG" XX SQ Sequence 5967 BP; 2347 A; 1006 C; 1075 G; 1539 T; 0 other; tggcgaccgt gacaagtgtg aaactgtgat gcaaaaaaaa aaaaaaaaaa aaactcgcgt 60 aaatacgaag ttaattcgcg agtgcgtaat attccttccg gtcagaagaa tattctagta 120 tctgcgttga tgtttagcgt gaaagagtga gagagtgaag acgagagtgc agcgcaacgc 180 tacaattgta cgataactcg aactatcgtt attcaaacga ggtgcagtga taaaatatac 240 gggtttcctg aaaaatcacg aaaattgcac aataactcct gtcgtgtatc gtcaacgaac 300 caaggtgcag tgaaagagag agtgtatcct ggttataaaa aaaaaaaagt ggccaggaaa 360 agttatatga atgctagcaa acattcggtg ataaaaaaca aaaaaaaaaa gcagcgatag 420 ttaacggtac cgtgcgtaaa gtgatctaag tgcagtgacc tcgatcggac aagattatcg 480 cacgtacgag taagccacat ttcgtaccat cggcgataag catcacagtt tgaaccggca 540 tgaccttttg cgacaagtga ccaagtgtcg tttgcacgag agagacataa cgttgttgcg 600 agaagaagga agaacggagc agccgatatc aaacggcaat acagcgtgcg gtgaaggagc 660 gacggcgaag cagtgagctg tgagagagcg agcagcgatt tgcagtggat aggagcgacg 720 gtgcagcagt gtgctgtgag agaacgagca gcgatttgca gtggataaag cgagataacg 780 ctatacttat cgatcgggca gcggtacacg gagtaaggat caagcgaaaa aaagggagaa 840 agtaaggtag gctacacact caatatagtt gcaatttact catttaaata ttttttgtaa 900 ttgctcaatt aaattctttt ttatgaaacg tacaatatta ccattttcca ttacaatatg 960 cacatactat tggggtgtta gtctaaaaaa aagaaaagaa aaaaaaatat tgcaattatc 1020 tgggtgtcaa aaaaaataat aataataaaa aaaatataaa ataaaaaata aataaaacaa 1080 aaaaaatata acaataacat ctcgatcaca tctcattcaa tgctcttaaa aataaatcaa 1140 aatattataa ggcttaaaaa tattgaaaaa aaattatcgt gtgatagaaa gtttcgcaga 1200 tgtttgcttg atcattacag tggttccgct aaattttgtt ttgataagat caagtattac 1260 ctggattcaa ttcaggaaac tgaacccgtt tcagagctga aacgaatttt gaaggaagcg 1320 agtgacatat acgccaatat ccaatcgcgc atacaatttc atcaaacaaa caaaattcaa 1380 gttaagttta aaactttagc aaaactagct atcgtgttca accgttggaa 
tcaaaacaaa 1440 atggaaactt tcgatatcaa aacagccacc gctttggtgc aaatatacga tggcaataca 1500 gatggtttgg aaaattttgt ggactcttca aatttactaa aggaattatg tccaaaaaat 1560 ccacaaatgt tggtaaaatt tttaaaaaca agattaatag gaaaggcgcg tttaggactg 1620 ccgcctaata ttgacgattt tgactcgtta ataaatgata tacggtccag atgtcaagaa 1680 aaacttaatc cagataaaat aataagtcaa ttgaaatcta ttcgccagac ggatactaaa 1740 gctctttgtg atgaggttga attactcagc acaaaattaa aaaacgttta tttgcagctg 1800 cacattccgg aaaaagttgc aaatgacatg tcattaaagc atgggattaa tgctctgatt 1860 gaaaaagttc ataatcagga aacaaaaatt gttctaaaag caggtcaatt cgcatctgtt 1920 tcagatgcat cggagaaagt attagaaaac gaaaggaata gcaacggttc acaaatatta 1980 gccttcaaca gaaactacta tgacaataga aaattcgttc gtaataaata cccgccgaac 2040 acatacaata gattccagcc aaacacaaat aatagatggt tttccaatag atctcccctt 2100 ggtaacaatc aggcaactag caacagatac caaaaccaaa gatatcccaa taactctttc 2160 agaaaacaat ttccaaatac aaatactcgt agagtctaca atacacaagc tgcagaagag 2220 gaacattttt taggggtacc acaggcgtta gaagaaatca ttcaacctac aaacgatggt 2280 caacattagc gataaactta aatgctaata actttatcaa aattaaagta gaaatggcaa 2340 tgggtgaaat aagcatatta attatcgaca cgggagcaga cgtgtcattg ttcaaagttg 2400 ataaaataaa accaacacag caagtctacg cgcaaaatag gataaactta acgggcataa 2460 caaccgaatc ggtatcaaca ttagcaacta ctagcactaa tataacattc ggaaatgctt 2520 ctattaacca cacttttcat atcattcctg cagaaataga tatcaaagcg gatggtattt 2580 tgggaagaga ttttttcact aaacatagat gtgttataga ttatgagcac tggcttttaa 2640 atttcaattg taacggtgtt acaatcagcc accctattga agatagtatt aatgatggtt 2700 tcatattgcc tctacgcagt gaggttattc gaaagataaa tttgcccgac ctcgttgaag 2760 actctattgt ccttgcgagc gaaatattac caggagtatt ttgtggcaac tctgtggtat 2820 caaaacaaaa ccaatatgta aaatttgtaa acaccactga aagtaatgtg tatattgaaa 2880 aggattcatt taaaccacag attgaccctc tcaaaaatta tattcagaaa aatttaaaat 2940 ttaaaagctc taaaacaaat gaatctaaaa taaaaacatt attgagcaaa atcgatagtc 3000 agaatgtccc caaatatata atccctcatt tagaaaaaat attatccgag taccatgata 3060 ttttctgcac cgataatgat caaatcccaa ctaataattt ttacgagcaa tcgatacaat 
3120 tgaaagataa cgtaccatgt tacattccaa attacaagca aatatattct caaagtgacg 3180 agataaaaaa tcaggttgat aaaatgctta aaaatgacat aaaagagcat tcggtttctc 3240 catataattc accaatttta ttggtaccaa aaaaatctgg agataacgaa aaaaaatgga 3300 gattggtagt agactacagg cagttgaaca aaaaaataat gcctgataag ttcccgctac 3360 ctagaatcga tgatatatta gaccaattag gaagagcaaa atattttact accttggacc 3420 ttatgtccgg gtttcaccag atacccttac acaagaactc tagaaaatat actgctttct 3480 caacttcttc aggccattat caatttacaa gactcccttt tgggctgaat ataagcccta 3540 atagctttca acggatgata acgatagcaa tggcgggact tacttcggag agtgcttttg 3600 tttacataga tgatattatt ataacgggtt gcacaatgca acatcatcta gaaaacttaa 3660 taaaagtgtt taacaggcta agaaattaca acttgaagct aaatccagag aaatgtattt 3720 ttttccaaaa cggtgtgacc tatttaggtc acaacataac atataaaggc atatatcccg 3780 atgagtccaa gttcgaaacg ataaaaaaat acccggtacc aataaatgtg gatgaagtta 3840 gaaggtttgt agcgttctgc aattactatc gcaagttcgt taaaaatttt gctgaattgg 3900 taagaccgct aaataattta ctaaagaaag gagctacatt cacgtggtct aatgaatgtc 3960 agcatgcttt cgacactctt agaagaagcc ttgtatcacc tactatatta cagtatccaa 4020 actttgagaa agagtttatc attactactg atgcgtccga tgtagcatgt ggcgctatct 4080 tatcgcagat aacagatggt aatgacctac ctattgcata cgcgagtaaa agttttacta 4140 aaggagagaa atctaaaccg actatagaaa aagagctaac agcaatacat tgggcagtaa 4200 attatttcaa accatatatt tttgggaaaa agtttaatgt acgaaccgac cacagaccat 4260 tggtatatct attcaacatg aaaaatccta cttcaaaact aacacgcatg aggttagacc 4320 tggaggaatt tgattttgaa gtggaatttg tgcaaggaaa aacgaacaca ggagctgacg 4380 cactatcgag aattgtcatt gattcagaca aattgaagca tatgcaaaag aataatgcaa 4440 acgaaacaaa tgataaatca ctactagcag taaaaacaag agcaatgaca agactgaaca 4500 atttcaagct tcctccagaa ttaaacgaag gtgaaaaaga ttccataact aaacctacac 4560 tttggcaaac agaaaaccca acagaagtaa acaaactttt gaaaatccta tcaaaaacag 4620 aatatggcaa aatcaaaata acagttttca ataataatta caacaaggag ttaggaaata 4680 caacaattca ttgtaatgaa gatggaagtc aggcgctaga gtctgctctt ctaaaagtac 4740 aacaaatagc aaaagaatac aatcgtgaca aaattacaat ttcattagaa gattccctat 4800 
tcaggcaata ctcttttcac accataaagg aaatcgcaga taatgccatt tccggtttag 4860 agatcattgc ttttattcca ccaaaatgga tttcaaaaaa ggaggaaata gaaaatattt 4920 tagaaaacta tcacatgtca ccttcgggag gacatgtcgg acagtatcga ttatacctca 4980 aaattagaga aaaatataaa tggaagaata tgaaggaaga aattaaaaat tacgttagag 5040 gctgcaaagt atgcaaagta aacaaaatcg taagacatac aaaagaaaaa agtgctgtta 5100 cgactacacc tttaaagcct ttcgaagtag tctgtataga cacagtgggt cccttgccga 5160 aaaccaataa aaatcataga tacatcgtta caatgcaatg tgaactctcg aaatacattg 5220 ttcttattcc aacagaaaac aaagaagcga acacgatagc tagggccatg gttgagaact 5280 gcatattaaa atacgggagg tttcttgaga tgaaatcgga tcaaggcctt gaatacaata 5340 acgaggtatt aaaaaaggta gcggagttgc tggggatcaa gcagactttt gccactgcct 5400 atcacccaga aaccataggt tccttggaaa gaaaccacag aacactcaat gaatacttaa 5460 gagcatttac aaatgagcat ggtgacaatt gggatgattg gctaaaattt tacgaattta 5520 attataacac aacacctcat acagatacga actatagccc atacgaatta atatttggaa 5580 aaaaagcaat tcttccccaa gatttatgcc atggtaatat tgaaccagta tacaattatg 5640 atgcatattt taatgaaatt aaattcaaat tgcaaaaatc aaacgaaata gcgcgtaaaa 5700 atttaattga gcaaaaagaa agaagacaga ttagagcaaa tagtaatgta aaccccctta 5760 tcataaaaat aggagatgcg gtttatctaa aaaacgaaaa tagaaaaaaa ttagatccga 5820 tatacttagg accgtacata gtagtaaagt tagataatgt taactgtaca ataaagaata 5880 caaatacaaa caaactaagt acagtgcata aaaatagatt aattaaggga tgaaaagcat 5940 acgcttttca tttctaaagg ggggagg 5967 // ID Clu-44_AG repbase; DNA; ANG; 555 BP. XX AC . XX DT 03-SEP-2010 (Rel. 15.09, Created) DT 03-SEP-2010 (Rel. 15.09, Last updated, Version -1) XX DE Putative non-autonomous DNA transposon: consensus. XX KW DNA transposon; Transposable Element; nonautonomous; Clu-44_AG. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-555 RA Fernandez Medina R.D., Struchiner C.J. 
and Ribeiro J.M.C.; RT "Transposable elements from Anopheles gambiae."; RL Repbase Reports 10(9), 1443-1443 (2010). XX DR [1] (Consensus) XX CC TTA TSD. XX SQ Sequence 555 BP; 176 A; 99 C; 121 G; 159 T; 0 other; gagagcggta cttgttccga cgcaatgcgt tgcgattttg ccaaaatgta tgggatttga 60 cagacgcatg cgttaacgac gcacgacgca gcgtcaaagt agaaaaattt ctatttcgac 120 gcacgtcgtg atgcgtctgt caaaatggtt taaccatatc gggtgaaaat cgatattttt 180 tcacgttaaa ttgtttcgag cgcagtttcg cgtgtaattg gactactaaa taattttatt 240 aacaaaacaa aataaatgta gagaaagtaa tttcgtaatt gctgcaaaac aggtgtgtgc 300 gggtcagtac gcaatatgca ataaaaaagc gaaaacatgc tttgcactgt tttgagcaga 360 actggaaatg attttacagt tgtaatttaa cattataatc acatcagaac ctataaataa 420 gaggcattat tgctgtttgt tgatgttttt ctttgtaaaa tgtgttgaca tatccatgca 480 gaacgcagcg aacagatacc agcgtgcgtc acgcaatgcg ttgcgatacg ctgcgtcgaa 540 acaagtacca agccc 555 // ID AgaP8MITE2450 repbase; DNA; ANG; 2456 BP. XX AC DQ301485; XX DT 22-AUG-2006 (Rel. 13.07, Created) DT 31-JUL-2008 (Rel. 13.07, Last updated, Version 1) XX DE Anopheles gambiae str. PEST clone AgaP8MITE2450 P MITE, complete DE sequence. XX KW P; DNA transposon; Transposable Element; Nonautonomous; KW AgaP8MITE2450. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-2456 RA Quesneville H., Nouaud D. and Anxolabehere D.; RT "P elements and MITE relatives in the whole genome sequence of RT Anopheles gambiae."; RL BMC Genomics 7(1), 214-214 (2006). XX RN [2] RP 1-2456 RA Quesneville H., Nouaud D. and Anxolabehere D.; RT "Direct Submission."; RL Direct Submission to Genbank (30-NOV-2004)Dynamique du Genome et RL Evolution, Institut Jacques Monod - CNRS - Universites Paris 6 RL Paris 7, 2 place Jussieu, Paris 75252, France. XX DR EMBL/GenBank/DDBJ; DQ301485; Positions 1 2456. 
XX SQ Sequence 2456 BP; 602 A; 628 C; 544 G; 682 T; 0 other; caaggttatt agactctata caggttacga cagagcgatt gagggggtta ccattttgta 60 aaccttgttt acaatttgtc tgtcaaaatc gacgctggtc agctgatggt tgtaataaac 120 aacataagta agattgcaga agctgaaact aacgcttcaa gttttcttca tcctaccacg 180 ttccatcgtt agtagaatac gacttctttg ttggcctgtt caatgctgag taagttctgc 240 tgatttgcca aggatatcat cccattgttt acccagtacc gtttgctctc caggtttgaa 300 attgcagtca cagcttctgg ggacgagggt ttatgctgct agacggcatt gttcaaaagg 360 gtgagaaatt atgaagtcct gaacgcgacc ctactgcgtt gggtgggaag cggcgaggct 420 ccattccgat tcgccacaat cgtcgggttc aggttgatcg tctatgctgc tcctgttcta 480 gttccggttc ataggaaacg aatgttcgtg ctgtcccggc tgcagtcttg ttccagtggg 540 ggtcgaggat tcatgccgcc aacggcgggg gtccatttca tttatgctcg agtcaattac 600 aacatcagag cgaaaaggat gtttgaagca tttgtgatgg gtaccacatc atcataggag 660 acaaatcccc acatttgtgt gagacgagtg cattgcacgg cagacaaaag agtgtgttta 720 aaggttggtc aagtaggtgc tctcggtagg ggtgcacggt gcggctggtg tgcggcccgt 780 acagtatgag tatcaacatg tttattttct aatcattatt gttacaacgg gctcgttaaa 840 cattaactca cttttacctc cttctcctac tccttggtca gcgatccaac aactcacgca 900 cgttcttcca ttcacacact agtgaaaggc gggacaactg gaactcaccg ggtcggagca 960 atccgaagca tcccggagtc gtgcaggtca acctgaaacg tacaaataga tgtagtaggt 1020 gaaacgattt cgttagatac gagaaagcat aagcgattcc gacccattgg tgcagcgagt 1080 acggatcgga tattgatctg acccaaactc gctgcaccaa cgggtcggag ccgcttatgc 1140 tttcttacaa ctactaatca ttcgagtcaa ctcgaagtcc ttacttcctc gggtcggatt 1200 tgtttccgac tccgcgttac cctattgcct ttttttcaac atacgttctt gttatattaa 1260 tactatcatc gttatttttt acaaacattt aagccctcgt cccagtccta ccctgtgtcc 1320 cgcgagggat cttggaaagg aattgttgtt tgctcataat gagcctctct ttcccctttc 1380 acctatcacc gttagctgcc tatagctgcg acccactctg ccgcatttag cccattgcgg 1440 ggctcctcaa cagacggtcg gtctgagctt ccctcaatta ctcggccgtc atcgtcttcc 1500 tccgatgcaa tattcgggcc gttccttttc gtcgaacaat tgccccagca tgttcgttga 1560 ccggtgtcgc tgtcggtaca gactcattct tacctgattc gcctcacgtc gggaagcggt 1620 agatggtgac ggtggagggg aagaggaccg 
tccgccccgg agttcctctc tttcccgcgt 1680 acgctgctct ctacgtctcg cataacgtcg ataattacga gcggcaacca cctccgctct 1740 atgtgccgta atttcctcca gctcggcctc gcgctaagcc atgtcctcag ctgcgcgatt 1800 ggccctttcc gcatcccagg aacgctgcaa ttccacagtt atttgtcgag cggcttcaca 1860 gatacgactc caggtggctt ggtctcgcaa catatgctgc tgaaggttat ctggccgaac 1920 cggatctgag ccaccctcgc caagcagctc ctgtcggaca ctagcaaaac ggggacactc 1980 gaacatcaca tgttctgcat cctctggtac accgacacac cactggcagt ccgaggacga 2040 cgtgaaaccc ttgatacata agtactccca gaaaaagcca tgtcccgaga gcacctgtgc 2100 caaatggaag gacatgtctc catgtttccg tgcctgccag gttccaatgt tggccaacac 2160 gcgatgagct catgtcttgg attctcactt actcgattct cccttgtcac aactgttaat 2220 ttatcacaaa attagtaatt ataaaattac aaaataatta aataaagtaa aacttacaat 2280 atttaacacc gaacactata ccattgcacg attgcaatcc agacaaattg taaacaaacc 2340 agtcgaattg taaacaaacc attccaattt gtctgtcaaa attttcatta atttttttat 2400 ttgaaatcgg ccaaaacggg cttttgtcgt aacctgtata gagtctaata accttg 2456 // ID RETRO933_AG_LTR repbase; DNA; ANG; 347 BP. XX AC . XX DT 06-FEB-2003 (Rel. 8.01, Created) DT 06-FEB-2003 (Rel. 8.01, Last updated, Version 1) XX DE Anopheles gambiae long terminal repeat from RETRO933_AG DE retrotransposon - a consensus. XX KW BEL; LTR Retrotransposon; Transposable Element; KW Long terminal repeat; NINJA; RETRO933_AG_I; RETRO933_AG_LTR; KW retrotransposon. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-347 RA Jurka J. and Drazkiewicz A.; RT "RETRO933_AG_LTR: LTR retrotransposon from Anopheles gambiae."; RL Repbase Reports 2(12), 20-20 (2002). XX DR [1] (Consensus) XX CC Related to NINJA from Drosophila simulans. 5 bp target site CC duplication. 
XX SQ Sequence 347 BP; 116 A; 65 C; 82 G; 84 T; 0 other; tgttgtagaa aacaacgagg tcggcgaaaa gtgacaaatt gttgtttaca atttgtgttt 60 attgataagc cgttgacagc cgaaccgaaa cgtcgccgat ttgccgttac cagagggagc 120 agaaaaaaac atcgccgcgc gaatcgaaaa acggccatcg cgcgagcgcc atgagatcaa 180 gtcagaaagc gtaagaaagc gtacgtgaaa ttttttaacc accgatccgc cgcgaaattt 240 actttgctac gttgtaagat tttgagcaat aaagtagtgc agtaaactac gagttaaaag 300 atcacgtttt aagtttgagt tgaagaaaga actttgcttc tagaaca 347 // ID GYPSY8-I_AG repbase; DNA; ANG; 5904 BP. XX AC AAAB01008859.1; XX DT 16-JUN-2003 (Rel. 8.05, Created) DT 16-JUN-2003 (Rel. 8.05, Last updated, Version 1) XX DE GYPSY8-I_AG is an internal portion of the GYPSY8_AG LTR DE retrotransposon. XX KW Gypsy; LTR Retrotransposon; Transposable Element; 4-bp TSD; KW CLP protease; GYPSY8-I_AG; GYPSY8-LTR_AG; GYPSY8_AG; Gypsy clade; KW RETRO23_AG_LTR; aspartyl protease; integrase; KW reverse transcriptase. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-5904 RA Kapitonov V.V. and Jurka J.; RT "GYPSY8_AG, a family of LTR retrotransposons from African malaria RT mosquito."; RL Repbase Reports 3(5), 89-89 (2003). XX DR Genbank; AAAB01008859.1; Positions 1735905 1730002. XX CC GYPSY8_AG is a young family of Gypsy-like LTR retrotransposons. CC GYPSY8-I_AG, an internal portion of GYPSY8_AG, is flanked by CC identical GYPSY8-LTR_AG LTRs. CC The internal sequence encodes the GYPSY8_AG1p 369-aa protein CC (pos. 1088-2194), and the 1218-aa GYPSY8_AG2p pol-like protein CC (pos. 2203-5856), composed of the aspartyl protease (aa 9-159), CC reverse transcriptase (aa 313-486) and integrase domains (aa CC 942-1100). GYPSY8_AG1p is not homologous to gag-like proteins CC encoded by known Gypsy-like elements. This protein is remotely CC similar to the CLP proteases, although the significance of this CC similarity is low.
XX FH Key Location/Qualifiers FT CDS 2203..5856 FT /product="GYPSY8_AG2p" FT /translation="MTGIKTNNFIKVKLNIANGKNTTLVIDSGAEVSLFKA FT SSLKKQIQLKKNKELTLIGITTDTMKTKGYTQATIHFGDKNVEHTFFIIKD FT LPTQADGVLGMDFISKFQCDILFSTWMLQFRNGNDIIEHPVDDSINGIIEI FT PPRSEVIRKLTLKPITEDSVIFSKEIKPGVFIGNTIISKTDPNIKLINTTE FT STAFINTGTIRPQIEPLKNYEIFLANNYNTPERTRIIQDKVHIEQVPEIAK FT RNLKNLIAEFSDIFCLENEPITTNNFYKQPIELMDNNPSYIPNYKQIHSQA FT DEINQQVDKMLKNDIIEHSVSAYNSPILLVPKKSIDGSKKWRLVVDFRQLN FT KKILPDKFPLPRIDTILDQLGRAKYFSTLDLMSGFHQIELEPASRRFTAFS FT TPTGHYQFTRMPFGLNISPNSFQRMMAIAMAGLSPELAFVYIDDIIVTGCS FT AQHHISNLSKVFNKLRKCNLKLNPEKCCFFKTEVTYLGHKITDKGIYPDDS FT KYEIIEKFPVPKNANDARRFVAFCNYYRKFVQNFAKIAKPINNLIKKDVKF FT DWTKECQEAFEKLKQSLLSPTILQYPDFTKQFIITTDASDTACGAVLSQIT FT DGNDLPVAFASKSFTPGEKNKPIIEKELTAIHWAINYFKPYIYGRKFIVKT FT DHRPLAYLFGMKNPTSKLTRMRLDLEEFDFDVQFLAGKANVAADALSRVVI FT TSDELKAQIPTNIMLATTNNIQNLKSTILMVHTRAMVKQKEAKKVTPHKPI FT DTRSDQPTMWTTDTPSKTRKLLKIRTSISDNNITFEVCNSTYKKVLGKVNA FT KAEKNGSQALELALLNICKIANNYKNKKLAWSLHDQIFTQYSHQTLKEIAN FT RVIDKYEFILFTPPRWVETEHDRLRIIHDYHMTPSGGHVGQFRLYRKIRDT FT YTWKNMRNDIKNFINKCEACLVNKVNRHTKEQTVITTTPNKPFNVISIDTV FT GPLTKTNNNYRYAITIQCDLTKYVVIIPIHNKEANTIAKALVENFILTFGT FT FIELKSDQGLEYNNEILNKITEILQIKQTFSTAYHPQTIGALERNHRCLNE FT YLRSFVNEHHDDWDDWIKFYEFVYNTTEHTDTGYTPYELIFGRKANLPQEI FT YQNKIEPVYNVEQYYNEMKFKLQKSREIAQKNLVSSKENRQNILNKNTNSL FT KLEIGDIVYLTNENRKKLDPVYIGPFTVTNIQEPNCTIIHKTTNKSSTVHK FT NRLIKSKE" FT CDS 1088..2194 FT /product="GYPSY8_AG1p" FT /translation="MITISDRIEILKHIFKKIQDPNQRTCTKTQLRIQAEE FT TFKEIQKEIEKNKFKYTFNKLLEFSKISNALIHNIIAMSTSKTNDDSHNKT FT SDCSTNSLEDKNTNLTLLTTNKLSFKLLAQTISAFLQLHKKSKMAFDLNSV FT GFSVLTGMQPFEGKASELTKFIDTVNTVKSIISSSNSLIAINLLLTKLGKE FT PRELFKELPKSFDEIIVTIKGRYTNTISVAKIYKQLCDRRKGKFENFNTFA FT KSLNELADQLQTAYIQENMTPDIAKTLTQQNVIIKLKEAGISRDSQLILDI FT KEFSSITDMLDTVKSYEGKNPPHEQNSRTTNFTPNANAFKAAHSRVMHTQI FT QTNTPDKKEEEEGDRFLDDDQNQYPQ" XX SQ Sequence 5904 BP; 2416 A; 1131 C; 933 G; 1424 T; 0 other; tggcgaccgt gaccttaatc tgcaatcgac aggcagaagt ataagcaaat tatttacaag 60 
taaaatgtga taagtacaac ggtaacgtaa aacgtaagtg tcgtgtgaca aaccaaaagt 120 attgtgacca taaccggcat tcgacgggac gtgagaaaaa aaaaaagtgt caaatgtgca 180 ttttttttta acgaaacatt caacaacgct gctggcacaa gtgacccata tgttataaat 240 aataacggtg aaagaacagc cagaagcaag tgcagtgcag aaaaaaaaac aatatttagt 300 gcaatgtagt gcaaaaggaa tgtcaaattg atgaacatca cagtgttgtg agaacctacc 360 ccaatgtgta aaaacgtaac gcttatgcga tcgatttacc aagtagagta accttaaaat 420 gggtaacgac gcatcaaacc ctaaagctaa tgttgatggt gataatgatc ttaccattat 480 tcaaacacaa aatgttcatt cagagcagca tgaaacccat gaattaaaac tgaacattgt 540 gctgatactt cttgctatca tactcatgca aaaagtaatt aaaaccattt tcaaaatatt 600 gaaaactcta gcaaaacgtg atgcaatcaa taacatacaa ctgcctatat aatgtaactc 660 attttggtaa tctatacgat tcgtggagca attgatatgg gacaataaag tgcatagcga 720 aactacggtt cacaagcagc gattgaatga aacaaccttt cccaacgtgg aagacgaaaa 780 tacaagacga gcccactgcc caatctggat tggcaggcaa gaatagacaa gtgtcccgac 840 aagacatcga gcgacgactg atcgacgacg tcatcgtaga atatcgagta tgcaccaaat 900 accgattaaa cacccagcac cggtatagat tcagcgtagc atcatcatca gtagcaacaa 960 aatgcacgca gcatatccca ttaaggtatt gaaactttag aacataatca aatcaaatca 1020 aatcaaagtt ttcaagctag gaaatgagtt ctataaaaag attacatgaa aacaattcaa 1080 aacaggaatg attactataa gcgatagaat tgaaattttg aaacatatct ttaaaaaaat 1140 acaggacccc aatcaaagga cttgtacgaa aacacaactc agaatacaag ccgaggaaac 1200 atttaaagag attcaaaaag agatagaaaa aaataaattc aaatacacct ttaacaaact 1260 attagaattt agtaaaattt ctaacgcact aattcataat atcattgcta tgagcacatc 1320 aaaaaccaac gacgactcac ataacaagac atctgattgc tcaacaaata gcttagaaga 1380 caaaaatacc aacttaacat tattgacaac caataagtta tcgttcaaac tcttagcaca 1440 gaccatatca gcctttctac aactccataa aaaatctaaa atggctttcg atcttaatag 1500 cgtagggttc tccgtactca cgggcatgca accgtttgag ggtaaagcat ccgaattgac 1560 taaattcatc gatacagtca atacagtaaa aagtataatc agttcatcaa attcccttat 1620 agcaatcaac ctcctactca caaaactagg aaaagaacct agggaattgt ttaaagaact 1680 gcctaaatcc ttcgatgaga taattgttac catcaaagga aggtatacga ataccatctc 1740 agtcgccaaa atttacaaac 
aattatgcga cagaagaaaa ggtaaatttg aaaattttaa 1800 cacttttgca aagagcctca acgaattggc tgatcaactg caaacagcat atattcagga 1860 aaacatgaca cccgacatag caaagacact aacacagcaa aacgtcataa taaaactaaa 1920 agaagcaggt atcagtagag acagccaact aattttagac ataaaagagt tttcaagtat 1980 caccgacatg ctggatacag tgaaaagtta tgaaggaaaa aacccgccac atgaacagaa 2040 tagtcgcacc acaaacttta caccgaacgc aaatgcattc aaagcggcac acagcagggt 2100 tatgcataca caaattcaaa ccaacacacc cgacaagaaa gaagaagagg aaggggatcg 2160 ttttttagac gacgaccaaa accaataccc acagtagttt tgatgacagg gataaaaaca 2220 aacaacttca taaaggttaa attgaacata gctaacggga aaaacacaac attggtcatt 2280 gactccggag ctgaggtatc actttttaaa gcctcaagtt taaagaaaca aatccagtta 2340 aagaaaaaca aggaactcac actcataggt atcacaacag atacaatgaa aacaaaggga 2400 tacactcaag caaccattca ttttggtgac aaaaatgtcg agcatacatt ttttattata 2460 aaagacctac caactcaagc agacggagta ttaggaatgg actttatcag taaattccaa 2520 tgcgacatcc ttttttcgac ttggatgtta cagttccgta acggaaatga tatcattgaa 2580 caccccgttg atgacagcat caatggaatt attgaaatac caccaagaag tgaggtaatt 2640 cgcaaactca ctttaaaacc aataacagaa gattcagtaa ttttttctaa agaaataaaa 2700 ccaggagtgt ttatcggtaa cacaataatt agcaaaactg atccaaacat caagcttatc 2760 aacacaactg agtcaacagc ctttattaat acaggcacaa tcagaccaca aatagaacct 2820 ttaaaaaatt atgaaatttt cttggcaaat aattacaata ccccagaaag aactagaata 2880 attcaagata aagttcatat agaacaagta ccggaaatag caaaaagaaa tttaaaaaac 2940 cttattgctg aattttcaga tatattttgt ctcgaaaatg agcctatcac aacaaataat 3000 ttctacaaac aaccaataga actaatggac aacaacccat catacatacc taattataaa 3060 caaatccatt cccaagcaga tgaaattaat caacaggtcg acaaaatgct gaaaaatgat 3120 ataattgaac attcagtttc agcttataat tcgcccattc tcttagtccc gaaaaaatcg 3180 attgacggaa gtaaaaagtg gcgattagta gttgacttca gacaactaaa taaaaaaatc 3240 ctgccggata aatttcctct ccctagaata gatacaatat tggaccagtt aggtagagca 3300 aaatatttta gcacacttga cttaatgtca gggttccacc aaattgagct agaaccagca 3360 tcaagaagat tcactgcatt ttccacacca acaggtcatt atcaatttac gagaatgcct 3420 tttggtttaa atataagccc aaatagtttt 
caacgtatga tggcaatagc tatggcagga 3480 ttatcacctg agctagcatt tgtttacatc gacgatataa tcgttacagg gtgcagcgca 3540 caacaccata taagtaactt aagtaaggtt tttaacaaat taagaaaatg caatcttaag 3600 ttaaatccag aaaaatgctg cttctttaaa acagaagtaa cctatctagg tcacaaaata 3660 acagacaaag gtatataccc ggatgactca aaatacgaaa taatagaaaa atttcctgta 3720 cccaaaaatg caaatgatgc cagacgcttc gtagcatttt gtaattatta cagaaaattt 3780 gtacaaaact ttgcaaaaat agctaaacct attaataacc ttattaaaaa ggacgtaaaa 3840 tttgattgga ccaaagaatg tcaggaagct tttgaaaaac ttaaacaaag tttactctca 3900 ccaacaatac tacaatatcc cgattttaca aaacagttca ttataacaac tgatgcatcg 3960 gatacagcat gtggcgctgt attgtctcaa ataactgacg gaaacgattt accagtggca 4020 tttgccagca aaagtttcac gccaggggag aaaaacaaac caataataga aaaagaactg 4080 acagcgatcc attgggcaat caattacttt aaaccatata tatatggtcg aaaattcatt 4140 gtaaaaacag accacagacc attagcatac ctatttggga tgaaaaatcc aacatctaag 4200 ttaactcgta tgagattaga tctagaagaa tttgactttg acgtacagtt cttagcaggg 4260 aaagcaaacg tagcagcgga cgcgttgtct agagtagtca ttacctcaga tgaacttaag 4320 gctcaaatac caacaaacat aatgttagca acaactaaca atatacaaaa tcttaaatca 4380 acaatcctca tggttcatac cagagcaatg gttaaacaaa aagaggctaa aaaagtaaca 4440 cctcataaac ccatcgatac caggtctgat caaccaacga tgtggacaac ggatacacca 4500 tccaaaacaa ggaagcttct aaaaattaga acaagcattt ccgacaacaa tataacattt 4560 gaagtgtgca acagcactta caagaaagtc ctagggaaag ttaacgcgaa agcagaaaaa 4620 aatggaagtc aagctttaga gcttgctctt ctaaacattt gcaaaatcgc caataactac 4680 aaaaacaaaa agctagcttg gtctctacac gaccaaatat ttacacagta ctctcatcaa 4740 acccttaaag aaatcgccaa cagagtgatt gacaagtatg aattcatact gtttacacca 4800 cctaggtggg ttgaaacaga acatgatcgg ctaagaataa tccatgatta tcacatgacc 4860 ccttcgggag gacacgtagg ccaattcaga ctttacagga aaataagaga tacttacacg 4920 tggaaaaaca tgcgcaatga tatcaaaaat tttatcaata agtgtgaagc atgcttagtt 4980 aataaggtaa acagacacac taaagaacag acagtaatta caacgacgcc taacaaacca 5040 tttaacgtta tatctattga cacagttgga ccattgacaa agaccaataa caactacagg 5100 tacgcaataa ctatacaatg tgacttgaca aaatatgttg 
taataatacc tattcacaat 5160 aaagaagcaa acactattgc aaaagcatta gtagaaaatt tcatattaac atttggaact 5220 tttatagagc ttaaatcaga tcaaggctta gaatacaata acgaaatcct taacaaaata 5280 acagaaatct tacaaattaa acaaacgttc agtacagcat accatcctca aaccatagga 5340 gcactagaaa gaaaccacag gtgtttaaat gaatatctta gaagctttgt aaatgagcac 5400 catgatgact gggatgactg gattaaattt tacgaatttg tttataacac cacagaacac 5460 acagacacag gttatacgcc ttacgaatta atttttggca gaaaagcaaa tctacctcaa 5520 gaaatttacc aaaacaaaat agagccagta tataatgttg aacaatatta taatgaaatg 5580 aaatttaaat tacagaaatc acgcgaaata gcacagaaaa accttgtgag ctctaaagaa 5640 aataggcaaa atatactaaa taaaaatact aactccttaa aacttgagat aggcgacata 5700 gtttacttaa caaatgaaaa taggaagaaa ttggatcccg tttatattgg accatttaca 5760 gtaacaaata tacaagaacc aaactgcacc ataatacaca aaaccaccaa caaatcgtca 5820 acagtacaca aaaacaggtt aattaaatca aaagaataac ttcaatcatt ttattattta 5880 cgttattctc ctaaagggag gagg 5904 // ID Clu-39B_AG repbase; DNA; ANG; 1110 BP. XX AC . XX DT 03-SEP-2010 (Rel. 15.09, Created) DT 03-SEP-2010 (Rel. 15.09, Last updated, Version 1) XX DE Putative non-autonomous DNA transposon: consensus. XX KW DNA transposon; Transposable Element; Clu-39B_AG. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-1110 RA Jurka J.; RT "Transposable elements from Anopheles gambiae."; RL Repbase Reports 10(9), 1441-1441 (2010). XX DR [1] (Consensus) XX CC TA TSD. >93% identical to consensus. 
XX SQ Sequence 1110 BP; 390 A; 203 C; 183 G; 328 T; 6 other; tagggtaact gtaccagtat tcggcagtgt acctgttttc ggcagggtaa ctaaaaacgt 60 gtaattacgc aaaaacgcgt tgaaaattgt ctcaacacat attttttaaa tagagatatg 120 ctcatttgtt attcatatac gaagtttcac atcatttcct tgcatatttt tcaaaatatc 180 aaacaaaatg tgcagagttc aacatgtttc ggcagatttg ctgtcagcca atagttcggc 240 aggtttcatt attaagagaa ccagagcact aaaacaatga aagtgtggtt tttatgaaag 300 attctatggt tgcaganttt atnctagccc gcttggaatt ttaaaatcac taaaaacaag 360 taattattca ctttaaatgc tgccgaaaat nggtacacgg taaatgttga tttttgacag 420 atcaatgctg tgggctcctg ccgaaacttg aaacaaaaac acacttttga tttcgttgaa 480 tattcattta aaacttgatc agaanactgc aatgcgtatg tcgacacata aaaccacatt 540 tcttgtatca aatatcacca atttcactat aaatnatcca ataacttgag agaaaaataa 600 taatttgtga gccagtcgga gccgtactct atagccgtca ggcagtctca tttctccaat 660 gcctactttg tttgtaactc acatcaaagt gactgtaaaa attgccagca aaatgcccaa 720 aatgggcaac ataaaattaa taatttaagt acttttcaac accaatctta ggaaagtgag 780 agtgtaaaac tgttttaaac atttttgtct acctattggc attaattaaa acatttaccg 840 acccgtattc gcgttgccga aaataggagc acagcctgcc gaaaatagga ccaaactgcc 900 gaaaatagga acaaaaacag tgtttgcatt ttcacgaata tcgctagaaa atgcatttga 960 aacggcaaat aaaaaatagc acaacataat acgatagttt atcgttcaaa ataacgtcac 1020 tttcatgaaa agtaataaaa attgtaagtg tacgagcagt tttngtgaaa ctgctgcact 1080 aacctgccca atactggtac atttacccta 1110 // ID COPIA5-LTR_AG repbase; DNA; ANG; 108 BP. XX AC . XX DT 03-APR-2003 (Rel. 8.03, Created) DT 03-APR-2003 (Rel. 8.03, Last updated, Version 1) XX DE COPIA5-LTR_AG is a long terminal repeat of the COPIA5_AG LTR DE retrotransposon - a consensus sequence. XX KW Copia; LTR Retrotransposon; Transposable Element; 5-bp TSD; KW COPIA5-I_AG; COPIA5-LTR_AG; COPIA5_AG; Copia clade; KW reverse transcriptase. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. 
XX RN [1] RP 1-108 RA Pavlicek A., Kapitonov V.V. and Jurka J.; RT "COPIA5_AG, a family of copia-like LTR retrotransposons from RT African malaria mosquito."; RL Repbase Reports 3(3), 56-56 (2003). XX DR [1] (Consensus) XX CC COPIA5-LTR_AG is a long terminal repeat of the COPIA5_AG LTR CC retrotransposon. There are ~5 copies of COPIA5-LTR_AG in CC the genome. XX SQ Sequence 108 BP; 37 A; 20 C; 18 G; 33 T; 0 other; tgttgaagag caagccttga gtaaaatagt aggctatgat tattaatgta agattctaga 60 aatatagtct tgttccaacc agcaaccttt acacagctac tctctaca 108 // ID GYPSY20-LTR_AG repbase; DNA; ANG; 270 BP. XX AC . XX DT 05-FEB-2004 (Rel. 9.01, Created) DT 05-FEB-2004 (Rel. 9.01, Last updated, Version 1) XX DE GYPSY20-LTR_AG is an LTR of retrotransposon GYPSY20_AG - a DE consensus sequence. XX KW Gypsy; LTR Retrotransposon; Transposable Element; 5-bp TSD; KW GYPSY20-I_AG; GYPSY20-LTR_AG; GYPSY20_AG; Gypsy clade; KW mag lineage. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-270 RA Tubio M.J., Costas C.J. and Naveira F.H.; RT "GYPSY20_AG, a member of the mag lineage of the Ty3/gypsy group RT of LTR retrotransposons in Anopheles gambiae."; RL Repbase Reports 4(1), 6-6 (2004). XX DR [1] (Consensus) XX CC GYPSY20-LTR_AG is a long terminal repeat of GYPSY20_AG (its CC internal CC portion is deposited as GYPSY20-I_AG). XX SQ Sequence 270 BP; 84 A; 69 C; 46 G; 71 T; 0 other; tgttatatct gagctacact tctggcagca ctggagctgt catacgctat tccccataca 60 caccgtggta tgaacaacta ccttccatac acaccgtggt atgaatagct agattctata 120 tacactgcga tcgcttccgc taccgttcta cgacacttga agaagagaag ccgaactaca 180 agcactacag gactacaaat atatttagat agagataaaa cagttcggtt gtacaattta 240 cttaattacg gttccgcccc atctacaaca 270 // ID BEL9-I_AG repbase; DNA; ANG; 5691 BP. XX AC . XX DT 03-APR-2003 (Rel. 8.03, Created) DT 03-APR-2003 (Rel. 
 8.03, Last updated, Version 1) XX DE BEL9-I_AG is an internal portion of the BEL9_AG LTR DE retrotransposon - a consensus sequence. XX KW BEL; LTR Retrotransposon; Transposable Element; 5-bp TSD; KW BEL9-I_AG; BEL9-LTR_AG; BEL9_AG; Bel clade; integrase; peptidase; KW reverse transcriptase. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-5691 RA Kapitonov V.V., Pavlicek A. and Jurka J.; RT "BEL9_AG, a family of Bel/Pao-like LTR retrotransposons from RT African malaria mosquito."; RL Repbase Reports 3(3), 47-47 (2003). XX DR [1] (Consensus) XX CC BEL9_AG is a young family of Bel/Pao-like LTR retrotransposons. CC BEL9-I_AG, an internal portion of BEL9_AG, is flanked by CC BEL9-LTR_AG CC LTRs. The BEL9-I_AG consensus sequence was reconstructed based on CC multiple alignment of 18 copies; they are ~1% divergent from CC the consensus sequence. CC The consensus sequence encodes one 1746-aa BEL9_AGp Bel-like CC protein CC (pos. 234-5471). CC BEL9_AGp is composed of the peptidase (pos. 131-263), reverse CC transcriptase (pos. 760-900) and integrase (pos. 1427-1596) CC domains. 
XX FH Key Location/Qualifiers FT CDS 234..5471 FT /product="BEL9_AGp" FT /translation="MDKKIKAVQLKKRIALENIKSLERFQAKYSSDDAKQI FT PEVLEDLAKHKEEFFTAVSKLEELEDKDEAVEASIMERIDIEERCRKLKSF FT LRERQPKEEGSLNDTTGLASSTLAFGRPHAPNLRLPKIELPTFDGDHTKWL FT SFRDRFIAMIDASAELPSIAKLQYLLSSLKGDAAVPFEHTPLTADNYSVTW FT AALLKRYDNSRLLIREYYRKLHYLPGVQLVCVDKLTHLVDEFTRFVNGLKK FT LNEPVDSWDTPLSNMLLMKLDRETLLAWEKHSVHFTTDKYKDVIDFVQDRI FT QILKSTNNFVKDQAASGIKVAGLIRQPGQRRFIANAATSRSAPAASTAHTQ FT QPKCPLECSEDHTLRNCPVFIAKEVQQRRDVVASKRLCWNCLSSNHQVRAC FT KSDYSCRTCRERHHTLLHHSPPYAPPATVTLSAQSNEDNVFLATANIQIKD FT DYGNTHEARALLDSGSMSNFIAEEFARKLLTSRKRVNVAVSGIGNAVQQIK FT GSIVATVQSKTQPFATEMTFLVLDTPSANIPTSPTDVSSWKMPDVALADST FT FNSPGQIDIVIGGDTFWELHTGRKRSIGRGKPWLVETHFGWVVTGNTHHSS FT VGPRLCHLSAYDTPLEETMQRFWESETIAEDPVLSVEENACEKHFAATTVR FT NSSGRYVVSLPFNSNPNIVLGESKEIADRRLRCIERRLNTNAKMKEEYVKF FT MKEYEHLGHMKRLTSPANDSVEHYYLPHHAVIKESSTTTKVRVVFDASCKT FT SSGYSLNDKLLVGPVVQEDLLSIILRFRSRAIALTADVEKMYRQILHSPHD FT RNYLRIRYREHPADPISTFELQTVTYGTASAPFLATRTLKQIALDHKEEYP FT LAMNAVMNDFYVDDLLTGTDDLSEAIVIQRQISDMLNSAGFTLKKWASNRS FT EALKNVPSEDVAVQLSHEWKSSKQVSTLGIVWEPATDTLRFRIEIPPTTPS FT MTKRLILSYIAKIFDPLGLLGPTIIIAKMFMQQLWALKIHGKAYDWDSELP FT SHLQHEWSKFHSTLSSLRNLTVPRYISQCTATSLQIHIFADASQLAYGACC FT YIRAESMEGVTVQLLTAKSKVVALSNSHSIARLELCAARLATLLYEKVQQS FT LKISATTICWTDSMTVLHWLNSAPNRWKPFVANRVAKIQHTAGIQCWKHVP FT GSDNPQADDISRGLTPEKLLVCERWWHGPHWLARNSEEWPQNTPSPSEDES FT AEEEKLSSRVASTALICEFRNSLFSRFSIYHKLQRVVAHCLRFIQNAKRRV FT GNKVHAKDIPPLTVDELKAAELKLCYLSQQDTFSEEIQHLQKGKEIPKNSK FT LKWISPFIDTQGILRIGGRLSNAHLSESEKHPVILSSKHPLSALLAVSIHL FT SKLHAAPQLLLTTLRQSFWIIGGRNLCKSVYHSCHACFKAKPTLIKQSIAD FT LPTSRVTPTRPFSVCGVDYCGPIYIKQTIRNRSPIKAYIAIFVCFSTRAVH FT IELVGDLTSTAFINALRRLIARRGQISELHSDNATTFKGAAHELNRVYKML FT KSDEHDRAAIFDWCAMNHMKWKFIPPRAPHFGGLWEAAVKAAKKHIVRTIG FT TTSITQESMLTLLAQVEQCLNSRPITPLSDEPSDLEPLTPGHFLVGGNLQA FT VPIIDYTETPSNYLREYQLVQKHLQTIWARWYPEYLQQLQARAKYCNGKSA FT VLKENTLVIIKEDNVHPTSWPMGRIVAVHPGKDDVVRVVTLRTASGKQIVR FT AANRLAVLPNPDVISNLEQKETTGTE" XX SQ Sequence 5691 BP; 1615 
A; 1392 C; 1335 G; 1349 T; 0 other; tttttggtcc ttcgaaccgg atcgtgatat acggagaaga aataagtttc attctgcatc 60 aagagtggaa ttcggtattc gtcgtcaagt gtgcaagtca ttccgtgaca attcccgcca 120 tcgcatatcg ccccgcatca agggcaacgg tcggttacac gccatcattt tgaatttcgc 180 gccatcgctt tgttcgttat tttcgtgtga tatcgcagta ttttctttgt gcaatggata 240 agaaaattaa agcagtgcaa ctgaaaaaga ggatcgccct ggagaacata aaatcgctgg 300 aacggttcca ggcgaaatat tcgagtgatg atgccaagca gattccggag gtgttagaag 360 atctggcgaa acataaggaa gagtttttca ccgcagtttc gaaactggaa gagcttgaag 420 ataaagacga agcggtcgaa gccagcataa tggaacggat cgacattgaa gaacgctgtc 480 gcaagctaaa atcatttcta cgggaaagac agccaaagga agaaggttcg ctcaacgata 540 caacgggctt ggcttcctca acgcttgcat tcggtcgacc ccacgcgcca aatttacgtt 600 tgcccaaaat cgaacttcca acatttgacg gagatcacac aaaatggctt tctttccgag 660 atcgcttcat cgcaatgatc gacgcttcag ccgagcttcc atctatcgcg aagctacaat 720 acttactgtc atcgttgaag ggggacgcgg cggtaccctt cgagcataca cctttaacgg 780 cggacaacta ttcggttacc tgggcggcgc ttcttaaacg gtacgacaat tctcgtcttt 840 tgattcgcga atactatcgc aaattgcact accttccggg agtgcaattg gtgtgcgttg 900 acaagctcac gcacctggtg gatgaattca cccgcttcgt caacgggttg aaaaagctga 960 acgaaccggt tgactcgtgg gacacacccc tctcaaacat gctgctgatg aagttggatc 1020 gagagacatt gttggcttgg gagaaacatt ccgtgcactt cacgacggac aaatataagg 1080 atgtgatcga cttcgtgcaa gatcgtatcc aaatcttgaa atcgaccaac aacttcgtga 1140 aggatcaagc agctagtggt atcaaggtgg ccggtctcat tcgtcaacca gggcaacgga 1200 gattcatcgc gaatgcagct acatctcgct cggctcctgc tgcatcgact gcgcacaccc 1260 aacagccaaa gtgtccattg gagtgttccg aagaccacac actgcgcaac tgtccagtgt 1320 tcatcgccaa ggaggtccaa cagcgacggg acgtcgtcgc atcgaagcgg ctgtgctgga 1380 actgtttgag cagcaatcat caggttagag cgtgcaagtc ggattattcg tgtcgcacgt 1440 gtcgtgagcg tcatcacaca cttctacatc attcaccacc ctatgctcca cccgcaacgg 1500 taacattgtc agctcagtcg aatgaagaca atgtgtttct ggcgacggca aacatccaga 1560 tcaaggatga ctacgggaac acccatgaag caagggcgtt gttggattcg ggatccatgt 1620 cgaatttcat cgctgaggag ttcgcacgga aactgctgac gagtcgcaaa 
agggtcaacg 1680 tcgctgtatc gggcatcggc aatgcagtac agcagatcaa gggttccatc gtcgctaccg 1740 ttcagtccaa gacacaaccc ttcgcaacgg agatgacttt cttggttctg gacacgccat 1800 ccgcaaacat ccctacatca ccaacggacg tctcttcatg gaaaatgccg gacgtggcat 1860 tggcggacag cacctttaac agtccggggc aaatcgacat cgtcatcgga ggcgatacgt 1920 tctgggagct ccacaccggt cgcaagcgct ctatcggtag aggcaaaccg tggctggtcg 1980 aaacccactt tggttgggtt gtcaccggca acactcatca ttcgtcagtc ggtccgcggc 2040 tgtgccatct atctgcatac gacaccccac tggaggagac catgcagcgg ttctgggaga 2100 gtgaaaccat agccgaggat cctgtgctat cggttgagga gaatgcttgc gagaagcatt 2160 tcgcagcaac aactgttcgc aactcaagtg gaaggtatgt cgttagtttg ccatttaact 2220 ccaaccctaa tatcgtttta ggagagtcga aggaaatagc cgatcgcaga ctgcgttgta 2280 tcgaacggcg gttgaacacc aatgctaaaa tgaaagaaga gtatgtgaaa tttatgaaag 2340 aatatgagca tttggggcat atgaagcggc ttaccagtcc tgcaaacgat tcggtagagc 2400 attactacct cccacatcac gctgtcatta aggaatcaag cacaaccacg aaggtgcgtg 2460 tcgtgttcga tgcatcctgt aagacttcga gtggttactc attgaacgac aaactcttag 2520 tgggaccagt cgttcaagaa gatcttttat cgattatcct tcggtttcgt tctcgtgcca 2580 ttgctctcac tgcagacgta gagaagatgt atcggcaaat tttacatagc cctcatgacc 2640 gtaactatct gcgcatccgg tacagagaac atcctgcaga tcctatatcg acatttgagc 2700 tacagacggt tacgtacggc acagcctctg ctccattttt ggcaaccagg accctaaaac 2760 agattgctct tgaccacaag gaagagtatc ctttggcaat gaacgcggtc atgaacgatt 2820 tttacgtaga tgatttgcta acgggtaccg atgatttgtc cgaagcaatc gttatacaaa 2880 ggcaaatctc agacatgcta aattcagctg gtttcacgct gaagaaatgg gcatcgaacc 2940 gctccgaagc attgaagaac gttccttcag aagatgtggc ggtacaactc tcgcacgagt 3000 ggaagagctc gaaacaagta tccacactag gcatcgtttg ggaaccggca actgatacac 3060 tacggtttcg tattgagata ccacctacaa cacccagcat gacgaaaagg ttaattttgt 3120 catatatcgc caagatattt gatcccctcg ggctactggg cccaacgatc atcatcgcaa 3180 agatgttcat gcagcaacta tgggctctca agattcatgg aaaggcatat gactgggaca 3240 gcgagctacc atcgcactta cagcatgaat ggtcgaaatt tcactctaca ttatcttcac 3300 tacgcaattt gacagtccca cggtacatat cgcaatgcac ggcaacaagt ctgcaaattc 
3360 atatctttgc tgacgcatca caactagcat atggtgcttg ttgctacatt cgggctgaaa 3420 gcatggaagg agtcaccgtg cagctgctaa cagccaagtc aaaggtcgtt gcgttatcca 3480 attcacattc catagctcga ttggaattat gtgcagcacg actagccaca cttctttacg 3540 agaaagtcca gcaatcactg aaaatttctg ctaccaccat ctgttggacc gattccatga 3600 ctgtccttca ctggctgaat tcagcaccaa atcgatggaa gcccttcgtt gcaaacaggg 3660 ttgcaaaaat tcagcacacg gctggaatac aatgctggaa gcatgttcca ggctcggaca 3720 atccccaagc agacgacatt tcgcgaggtt taacgccgga aaagttgcta gtgtgtgagc 3780 gctggtggca cgggccacat tggttagcac gcaactcgga agaatggcca cagaacacac 3840 catcaccaag cgaagatgag agcgcagaag aagaaaaact atcgtcacgg gttgcaagca 3900 cagcattaat ctgcgaattt cgaaacagtt tgttctcacg attttcgatc taccacaaac 3960 tgcaaagagt tgttgcacat tgtttgcgct ttatacaaaa cgcaaagcgc cgcgtaggaa 4020 acaaggtcca tgctaaggat atcccaccgc tcactgtaga cgaactcaag gcggcagaac 4080 tcaagttgtg ttatctttcg caacaagaca ccttttccga ggagatacaa cacctgcaga 4140 agggcaaaga gattccgaag aactccaaac tgaaatggat ttcccctttc atagatacgc 4200 aaggtattct gcgcattggt ggccggctca gtaacgcaca tctgtcggaa tcagaaaaac 4260 acccggtaat attatcatcg aaacatccac tgtccgcact actagctgtt tcgatacact 4320 tgagtaagct gcatgctgca ccacaactgc ttttaacaac actacgccaa agcttttgga 4380 taattggcgg tcgcaattta tgcaagtctg tgtaccacag ttgccacgca tgttttaagg 4440 ccaaacccac acttattaag caaagtatcg ccgatttgcc aacatcacga gtcacaccaa 4500 caagaccatt ctcagtatgc ggagtagact attgcggacc aatctatata aaacaaacca 4560 tacgcaacag aagtccgatt aaagcataca tcgccatatt tgtatgtttt tcaacaagag 4620 cggtacatat cgaactggtt ggcgatttaa catcaacagc atttatcaat gcacttcgtc 4680 gtttgattgc acgtcgtggt caaatcagtg aactgcattc cgacaatgca accaccttta 4740 agggagcggc acatgagctg aatcgcgtct acaagatgct aaagagcgac gaacacgatc 4800 gagctgctat atttgattgg tgcgcgatga atcatatgaa gtggaagttt atcccaccaa 4860 gagcaccaca ttttggaggt ttatgggagg cggcggtgaa ggcagctaaa aagcatatag 4920 tcagaacaat aggaacaaca agcatcacac aggagagcat gcttacccta cttgcccagg 4980 tagagcaatg tttgaattcg cgaccaatta cacctctatc cgatgagccg tcggacttgg 5040 
aaccattgac accgggacac ttcctcgtcg gtggcaatct gcaagcggta ccaatcatcg 5100 attacaccga gacaccgagc aactatttga gggaatacca gttggtacaa aaacatctgc 5160 aaaccatttg ggctcgatgg tatccggagt acctgcagca gttacaagct cgagccaaat 5220 attgcaacgg gaaatcagcg gttctgaaag aaaatacact ggtgattatt aaggaagaca 5280 atgtacatcc tacctcgtgg ccgatggggc gcatcgttgc agtacaccct ggaaaggacg 5340 atgttgttcg cgtcgttaca ctgcgcactg cttcagggaa gcaaatcgtc cgcgcagcta 5400 atcgtctggc ggttttgcct aatccggacg taattagcaa cttagagcag aaggaaacca 5460 ctggcactga gtaacgcgca gctgcaagcc acgcgacatt gtttacattt cgcactcatg 5520 gcagaacaag atacacgcaa cacacattca cacatctaca cacacgagca gtatgatagg 5580 agataagcgt caggttcaag agtcgaactg tttttgaatt cttttttgaa ctcatttttg 5640 gatttatatt ttgcatgaaa tttaaagaag ttcttctttg gtggccggga a 5691 // ID BEL16-LTR_AG repbase; DNA; ANG; 287 BP. XX AC . XX DT 03-APR-2003 (Rel. 8.03, Created) DT 03-APR-2003 (Rel. 8.03, Last updated, Version 1) XX DE BEL16-LTR_AG is a long terminal repeat of the BEL16_AG LTR DE retrotransposon - a consensus sequence. XX KW BEL; LTR Retrotransposon; Transposable Element; 5-bp TSD; KW BEL16-I_AG; BEL16-LTR_AG; BEL16_AG; Bel clade; KW reverse transcriptase. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-287 RA Kapitonov V.V., Pavlicek A. and Jurka J.; RT "BEL16_AG, a nonautonomous family of Bel/Pao-like LTR RT retrotransposons from African malaria mosquito."; RL Repbase Reports 3(3), 40-40 (2003). XX DR [1] (Consensus) XX CC BEL16-LTR_AG flank an internal portion of BEL16_AG (deposited as CC BEL16-I_AG). 
XX SQ Sequence 287 BP; 85 A; 51 C; 75 G; 76 T; 0 other; tgttggaatg taagggttat gaaacggtca ttttgaattg tttgcggttg ttttgtcagt 60 tgggaattaa aagttaaatg tattttctgg cagcactgcc gatcgacaat ttgtgattaa 120 gtatgtgtgc gaataaagcg gcactagcgc atgaaactcg atacgagccg gacgtgttct 180 ttactttgtc tcctttggcg atcgaagacg acacaacaaa acacaacgta gggcgtagag 240 gcgtcaaggg ggaaaggaac caacaaacca tgttccagaa cgcaaca 287 // ID GYPSY37-LTR_AG repbase; DNA; ANG; 171 BP. XX AC . XX DT 16-APR-2004 (Rel. 9.03, Created) DT 16-APR-2004 (Rel. 9.03, Last updated, Version 1) XX DE GYPSY37-LTR_AG is an LTR of retrotransposon GYPSY37_AG - a DE consensus sequence. XX KW Gypsy; LTR Retrotransposon; Transposable Element; KW 4-bp TSD; GYPSY37_AG; GYPSY37-I_AG; GYPSY37-LTR_AG; Gypsy clade; KW MDG3 lineage. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-171 RA Tubio M.J., Costas C.J. and Naveira F.H.; RT "GYPSY37_AG, a member of the MDG3 lineage of the Ty3/gypsy group RT of LTR retrotransposons in Anopheles gambiae."; RL Repbase Reports 4(3), 67-67 (2004). XX DR [1] (Consensus) XX CC GYPSY37-LTR_AG is a long terminal repeat of GYPSY37_AG (its internal CC portion is deposited as GYPSY37-I_AG). XX SQ Sequence 171 BP; 53 A; 44 C; 38 G; 36 T; 0 other; tgtaggatga gagggctacc ccccgaaaac gtcaaaatga cagctagata gcgagggcat 60 gacagcaggc gatgagcgac acaatcttac cgcgagcgtc accgtgaaca gaagcaagaa 120 tatacgaatt tcctttaatc acctacaccg ctcgtttcct tatttcttac a 171 // ID TransibN1_AG repbase; DNA; ANG; 978 BP. XX AC . XX DT 08-MAY-2003 (Rel. 8.04, Created) DT 13-JUN-2005 (Rel. 10.07, Last updated, Version 3) XX DE TransibN1_AG is a Transib-like DNA transposon - a consensus DE sequence. XX KW Transib; DNA transposon; Transposable Element; Nonautonomous; KW TRANSIBN1_AG. XX NM TRANSIBN1_AG. 
XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-978 RA Kapitonov V.V. and Jurka J.; RT "TRANSIBN1_AG: a family of target site-specific nonautonomous RT TRANSIB DNA transposons from African malaria mosquito."; RL Repbase Reports 3(4), 83-83 (2003). XX DR [1] (Consensus) XX CC TransibN1_AG is a family of nonautonomous DNA transposons that CC belongs CC to the Transib superfamily originally identified in Drosophila CC (see CC description of TransibN1-Transib4 in drorep.ref). CC TransibN1_AG is characterized by a remarkable target site CC specificity. CC Its copies are inserted into the CCagtGG target site, and CagtG CC is CC a 5-bp target site duplication. There are ~100 copies of CC TransibN1_AG CC in the genome. CC The TransibN1_AG consensus sequence was reconstructed based on CC multiple alignment of 30 copies. Given the ~2% divergence CC of these copies from the consensus sequence, transposition of CC TransibN1_AG occurred recently (in the last 1 Myr). CC TransibN1_AG has 17-bp terminal inverted repeats (2 mismatches). CC Original name was TRANSIBN1_AG (replaced by TransibN1 in May CC 2005). 
XX SQ Sequence 978 BP; 332 A; 167 C; 163 G; 314 T; 2 other; cacagtgggc aaccgccata caaacgccgg gatgaaaatc aattcctcgt gctattgcar 60 tttwtcttca ttcaatacaa ttgctcttac tatacagggt agtcctatac taaaatcgtc 120 aagacagcga ataaaactta ataattatcg ctcacaacat tgcattattg cgtatcagtt 180 aacagcatca ataataattg ttaggaatta aaacgaaggc ggaataagtt tctgactgaa 240 aacgaatttt taaagtatta cgcactaaaa aagttgtgtt tttcatggtt tgtttggaaa 300 agagccgatt cctatcttac gacctttttt aaactgtttt tcctctcttc agctttgttt 360 tctgcctgtt tgtttcaaat gcccgtttat gacaggtagt tggatactgg tggtgtatgg 420 ctcatacaac aacacgtcaa aatcgctcgc gctattttca agaaaaagtt taataatttt 480 cggggtcgca gatggtcgca gctaatttta acgacaatgc gtaggaaaat tgttgatctt 540 tccaatgata tacgactcac gagtgaaatc gagtcgaatc ataaaaaaaa tcactctcca 600 aaataaaaat atccaaaaat tcagcagtga tgtttggttt tcaatcattt atgaacttta 660 aaaacaagtt tttgcaaaat attaagacat aacatcaaag tatgacaaaa aacctttcca 720 acgacacatt gattatcaaa atctaaccat catatactaa aatatgatgg tttatgttcg 780 gtcgaaaaat agctcaaagt tgggacaaaa aacccaaagt ttacactttg atggcctata 840 tctcagtaag tttaagataa aaacgtgaaa tattttggtt tcaactaagt ttaagtatct 900 attttaagaa aatgattatg gtgtaaacct gcgatgaagt tggtttttcg atttttatac 960 aggcgtttgc ccaatgtg 978 // ID RETRO76_AG_LTR repbase; DNA; ANG; 461 BP. XX AC . XX DT 06-FEB-2003 (Rel. 8.01, Created) DT 06-FEB-2003 (Rel. 8.01, Last updated, Version 1) XX DE Anopheles gambiae long terminal repeat from RETRO76_AG DE retrotransposon - a consensus. XX KW BEL; LTR Retrotransposon; Transposable Element; KW Long terminal repeat; NINJA; RETRO76_AG_I; RETRO76_AG_LTR; ROO; KW retrotransposon. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-461 RA Jurka J. and Drazkiewicz A.; RT "RETRO76_AG_LTR: LTR retrotransposon from Anopheles gambiae."; RL Repbase Reports 2(12), 16-16 (2002). 
XX DR [1] (Consensus) XX CC Related to NINJA from Drosophila simulans and ROO from Drosophila CC melanogaster. 5 bp target site duplication. XX SQ Sequence 461 BP; 121 A; 96 C; 116 G; 128 T; 0 other; tgttgcgcag cgaccacatc gaaggtggct gtttgaattt ataattaaat aatttatatt 60 ttgttcttag taggctatga tagcttgtaa tttagagtat aaaaggagga aataattata 120 ataaatcagt cttacaccag catttgatga gtgcgtttct ttgtttggtg ttaagattct 180 cccgggtgaa gctttgggcg aatacgattg ccttctcgac aacgctttac gaaatcagta 240 gcggcgctac cacggcatag acctcccaat ttacattgga aaagaggtaa ctagcctggt 300 aggccaatac tgaaagtagt ttggtgaagg agagttggac cgttgacgct cccgaaaccc 360 ttttgcccgg gacgttctgt tccgtggcag cagacggttt aaccaagcgc tccagaacga 420 ccgaaacctt cgcggtcggc ttggccgtcg ggcccggaac a 461 // ID GYPSY16-LTR_AG repbase; DNA; ANG; 750 BP. XX AC . XX DT 03-OCT-2003 (Rel. 8.09, Created) DT 03-OCT-2003 (Rel. 8.09, Last updated, Version 1) XX DE GYPSY16-LTR_AG is an LTR of retrotransposon GYPSY16_AG - a DE consensus sequence. XX KW Gypsy; LTR Retrotransposon; Transposable Element; KW 4-bp TSD GYPSY16_AG; GYPSY16-I_AG; GYPSY16-LTR_AG; Gypsy clade; KW mdg1 lineage. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-750 RA Tubio M.J., Costas C.J. and Naveira F.H.; RT "GYPSY16_AG, a member of the mdg1 lineage of the Ty3/gypsy group RT of LTR retrotransposons in Anopheles gambiae."; RL Repbase Reports 3(9), 174-174 (2003). XX DR [1] (Consensus) XX CC GYPSY16-LTR_AG is a long terminal repeat of GYPSY16_AG CC (its internal portion is deposited as GYPSY16-I_AG). 
XX SQ Sequence 750 BP; 297 A; 127 C; 137 G; 189 T; 0 other; tgtagcatag gttcgacacc cgacactatt agacttagca aattagaatt agattagaat 60 agcaaagaca caattattag tataacatcc gctataatta tctgcagtaa attcattaca 120 tcctagacac aaaccgcact aaggctagaa tgaaacagct aatgatagaa tgaaatagca 180 tgataagtgg cgtgatcgta ttcaccaaca cactcaaaag aaatagtttt tacaagtata 240 aaattttaga ttaaacatta tcagaacata aattactgta attagatgca cacaaatatt 300 agacataaaa gtgcaatatg ttctaatccg aaccaactag ataaataggg aaaatttaat 360 agagttcatc caattgcgca gtaggataat acacatggaa ggacgatgtt acctaatgta 420 cgcaaaaagt cacgtacgca taatgcagga gatagagcag gataataaat tgtaatcatg 480 cataacggag cggataagct aaatatgccc ataagcgcaa tgtaattatt tcgtataaaa 540 ggaagctcgc gcgtagctag aggagagttg ttgtatcagt aaaatcacgc agtccagtcc 600 gccgcgcaaa atttagttcg caaatcatta gaagccaacc acgctattaa gtttaagtaa 660 agtggcaaaa ataaagtgag tttatataaa agtgcaaata caatccgtct ttagtcagga 720 acggcactaa gtgatgacaa gttcactaca 750 // ID GYPSY22-I_AG repbase; DNA; ANG; 4383 BP. XX AC . XX DT 05-FEB-2004 (Rel. 9.01, Created) DT 05-FEB-2004 (Rel. 9.01, Last updated, Version 1) XX DE GYPSY22-I_AG is an internal portion of retrotransposon GYPSY22_AG DE - a consensus sequence. XX KW Gypsy; LTR Retrotransposon; Transposable Element; 5-bp TSD; KW AP protease; GYPSY22-I_AG; GYPSY22-LTR_AG; GYPSY22_AG; KW Gypsy clade; RNase-H; gag; integrase; mag lineage; KW reverse transcriptase. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-4383 RA Tubio M.J., Costas C.J. and Naveira F.H.; RT "GYPSY22_AG, a member of the mag lineage of the Ty3/gypsy group RT of LTR retrotransposons in Anopheles gambiae."; RL Repbase Reports 4(1), 9-9 (2004). 
XX DR [1] (Consensus) XX CC GYPSY22_AG is a family of Gypsy-like LTR retrotransposons CC that, according to the amino acid sequence of its reverse CC transcriptase, is CC phylogenetically grouped with representatives of the mag CC lineage of other organisms. CC GYPSY18_AG, GYPSY19_AG, GYPSY20_AG, GYPSY21_AG, GYPSY23_AG, CC GYPSY24_AG, CC GYPSY25_AG, GYPSY26_AG, GYPSY27_AG and GYPSY28_AG are other CC members of this same lineage in Anopheles gambiae. CC The GYPSY22-I_AG consensus was reconstructed after multiple CC alignment of 3 copies. CC The consensus encodes the 1430-aa GYPSY22_AGp gag-pol like CC protein CC (pos. 68-4357). CC The sequence of the LTRs flanking GYPSY22-I is deposited as CC GYPSY22-LTR_AG. XX FH Key Location/Qualifiers FT CDS 68..4357 FT /product="GYPSY22_AGp" FT /translation="MDPASKKTEIPAASVTTRAAAKSASTGTPTADSSSNL FT PTGVLATDNSATAKPISSNTSTRGPVTAHSAAVKPKASSSTTSSTSVPGSS FT VGEGTALSVSVRGITTAVTNMSTMFSFEPFDPTNCKIQRWLERLQIAFKIH FT RVSEEDKRDYLLHYMGGATYDVLCNKLKNAEPQTKTFQEIVSILQEHFNPN FT PLEILENFKFANRKQAENETLSTYLMELEKLAQTCNFGDYLDKALRNQFVF FT GLQNRAIQSRLLEVRDLTLAKAKDIAFSMEMSNRGADEIHGAGAAYPVQHI FT STTSKKKSKPTTVQRKTACYRCGNEEHFADKCRHRNAICNYCKKMGHLDKV FT CRTKLQRAGVHTLEYEPDLPACDDDVVDVLNLRAVQNLAGKFLLEMEISDR FT KLIFEVDTGSPVSLISNRDRLNCFPNTLMKKSNVKLKSYCNGVINVLGEIE FT VRAKIKNLEIPLPLLVTKSDRNPLLGRNWMRTIKLDLNKFMHTSQEVSYCE FT KENFTPCSILEALLKKYSEVFEAGIGKIEGLQASLTLRKETKPIFIKARPV FT AFAVRDAVTNEINKLVEENVLEKVDHSEWATPIVPVKKSGGNVRLCGDYKI FT TVNPNLLVDEHPLPTVEELFTNIAGGEKFSKLDLSQAYLQLEVSPDCRDIL FT TLATHKGLYRPTRLMYGVASAPAIFQRLIEQILQDIPGVTAFIDDIRITGP FT NDEIHLKRLEEVLKRLRKYNLKVNKAKCEFFADQIEYCGYLVDKHGIHKLH FT TKIKAIQDMPAPKSVDELRSFLGLVNYYGRFFPNLSTVNYPLNNLLKDGIP FT YVWDEHCQKAFAQVKREMQTERVLVHYDPNLPLILATDASPYGVGAVLSHK FT FADGTERPLQYASQTLNKTQQRYSQIDKEAYAIIFGVHKFHQYLYGRKFTL FT VTDNKPLSQIFSESKGLPTMSAMRMQHYAAFLQGFDYNIRHRKSVNHCNAD FT ALSRLPLPSEDSDRRIEDSDLVEINVIETLPLTVPELAKATAVDPNVEELL FT RALRTGKIILPKHCFGIDQNEFDLQSDCIMRGSRVYIPPLLREKVLQELHS FT SHFGISRIKSLARSYCWWPNIDKEIENIVNNCQPCQETRANPPKVPIHCWE 
FT KAEGPFQRVHVDYAGPFMGSYFFILVDAFSKWPEVRVVNNMTTETTINACR FT EIFSTFGIPMVLVSDNGTQFSSTEFTRFLKLNGVIHKFSAPYHPATNGQAE FT RFIQTLKSKLKAVKCNRTEIPEVLSNILLSYRKIIHPSTGFSPSLLVFGRQ FT IRSRIDIMIPTSNSNNREIIKTKDFEIGQRVAAREYIKNNKWEFGKVTARL FT GKLHYEIELDDGRTWRRHIDQMRAVGISMQKPINRNTSWQGRETTSHNLES FT ETNTIPKNSSTPELMTDCGSRSIVQAQPSNPPPMSFPSEAPCSAEIPPEAK FT LRRSARSIKTPQRLLL" XX SQ Sequence 4383 BP; 1462 A; 876 C; 919 G; 1126 T; 0 other; aattggcgac gaggaagtac gaaaaccacg ttattatacc taagcaaact aagcttgtgt 60 tgttttcatg gatccggcgt caaagaagac cgaaattcct gctgcttctg tcaccacacg 120 ggctgctgct aaatctgcca gtaccggaac acctacggcc gattcttctt caaaccttcc 180 tactggcgtc cttgctacgg acaattctgc tactgctaag cctatttctt cgaacacttc 240 tactagaggc cctgttacgg cccattctgc tgcggtaaaa ccaaaggcat ccagttctac 300 tacatccagc acatcggtgc ccggctcaag tgtgggtgaa ggtaccgctc tgtctgtttc 360 ggtccggggg ataactactg ctgttacgaa tatgtctaca atgttctcat ttgagccgtt 420 tgatccaaca aactgcaaaa tccagagatg gttggaaagg ttgcaaattg ccttcaaaat 480 ccatcgagtt tccgaagaag acaaacgtga ttaccttctc cattatatgg gtggtgctac 540 ttatgacgtg ttgtgcaaca agttaaagaa tgcagaaccg caaactaaaa cgttccagga 600 aattgtttcc attcttcaag agcatttcaa cccaaatcct ttagaaattt tggaaaattt 660 caagtttgca aatcggaagc aagctgaaaa cgaaacactg tctacgtatc taatggaatt 720 ggagaagcta gcacaaacat gcaattttgg ggattacctc gacaaagcct tgagaaacca 780 atttgtattc ggactccaaa accgtgcgat ccaatcgcgg ttgctcgagg tgcgcgactt 840 aacattggct aaagcgaagg acatagcttt tagtatggaa atgtcaaatc gaggcgcaga 900 cgaaatacac ggcgccggtg cagcgtatcc agttcagcac atcagcacca cgagcaagaa 960 gaagagcaag ccaacaacgg ttcaaaggaa aacagcttgc tatcggtgtg gaaacgaaga 1020 gcattttgcg gataagtgtc gacaccggaa tgcaatttgc aactactgca agaaaatggg 1080 acacttggat aaagtgtgtc gtacaaaact acaacgagca ggagtacaca cactggagta 1140 cgagcctgat cttcctgcct gtgacgatga cgttgtggat gtgctcaacc tgagagcagt 1200 gcaaaatctg gcgggtaagt ttttgctgga aatggaaatc tctgatagaa aacttatttt 1260 cgaggtggac acgggttctc cagtatcact aataagtaac agggatagat taaattgttt 1320 tcctaacaca ttgatgaaga aaagcaatgt 
aaaattgaaa agttactgca acggtgttat 1380 caacgttctt ggagagatcg aagtgagagc aaaaattaaa aatttagaaa ttcctttgcc 1440 attgcttgtc acaaaatccg acagaaatcc tttgcttgga cgtaactgga tgagaactat 1500 aaagttagat ttgaataaat ttatgcatac cagtcaagaa gtttcatatt gtgaaaagga 1560 aaatttcact ccatgctcaa tactagaagc tctactaaaa aaatattcgg aagtatttga 1620 ggcaggaata ggcaaaattg aagggttaca agcttcatta actcttcgca aagaaacaaa 1680 acctattttt atcaaagcgc ggcctgtcgc atttgctgtc cgagatgctg taaccaatga 1740 aatcaataag ttagttgagg aaaacgttct agagaaagtg gatcattctg aatgggctac 1800 acctattgtg ccagtaaaaa aatcaggggg taatgtaaga ttatgtggtg actataaaat 1860 cacggttaat ccaaacctat tagtagacga gcacccacta ccgacagttg aggaattgtt 1920 cacaaatatt gcaggcgggg aaaaattttc taaattagac ctttcccaag cctatctgca 1980 attggaggta agccccgatt gtcgggatat tttaacatta gcaacccaca aaggattgta 2040 ccgccctacc agactaatgt acggggtggc atcggcacca gcaatatttc agaggttaat 2100 cgaacagatt ctacaggaca ttccaggagt aactgctttt attgatgata ttagaattac 2160 agggccaaac gatgaaatcc atttaaaaag attagaagaa gtactaaaaa gattacgaaa 2220 gtacaatcta aaagtaaaca aagctaaatg cgagtttttc gcagatcaaa tagaatattg 2280 tggataccta gttgataaac atggcataca caaactacat acaaaaatta aggccataca 2340 ggacatgccg gctccgaaat ccgtagatga attaagatct tttctaggtt tggtaaatta 2400 ttacggacgg ttcttcccaa atttaagtac tgtaaactat cctttaaaca atcttctaaa 2460 agatggaata ccatacgttt gggacgagca ttgccaaaag gcattcgctc aggtaaaaag 2520 ggaaatgcaa acggagagag tactggtaca ctatgatcct aaccttcctc tcatattagc 2580 tacagatgcc tcaccctacg gggtaggtgc cgtccttagc cataaatttg ccgatggaac 2640 cgaaaggcca ttacaatacg catcgcaaac tttaaacaaa actcaacagc gatattctca 2700 gatcgacaaa gaagcctacg caatcatttt cggagtgcat aaatttcacc aatatttata 2760 tggacggaaa tttactctgg taacggacaa caaaccctta tcacagatat tttcagaatc 2820 caaaggatta ccaacgatgt cggcaatgcg aatgcagcac tatgcagctt ttttacaagg 2880 gtttgattat aacattcgtc accggaaatc tgtaaaccac tgtaacgccg atgcattgtc 2940 aaggttgcct ttaccgtcag aggattctga tagaagaata gaggattccg atttagttga 3000 gatcaacgta atcgaaacat tgcctttaac agttcctgaa 
ttggcaaaag ccacagctgt 3060 agacccaaat gttgaagaac tactgcgtgc cctcagaacg ggtaaaataa ttttaccgaa 3120 acactgtttt ggtattgatc aaaatgaatt cgatttacaa agtgattgta ttatgcgggg 3180 gagtagagtt tacattccgc ctttattaag agaaaaggta ttacaggagc ttcactcttc 3240 tcattttggg atctctagaa tcaaatcttt ggcaagaagt tattgttggt ggcccaacat 3300 agacaaagaa attgaaaata ttgtaaataa ctgccaaccc tgtcaagaga caagagcaaa 3360 cccaccaaaa gtaccgatac attgctggga gaaagcagaa gggccttttc agagggtgca 3420 tgtggattat gctggtcctt tcatgggaag ctactttttt attttagtgg atgctttctc 3480 taaatggcct gaggtccgtg tcgtcaacaa catgactacc gaaaccacaa ttaatgcctg 3540 cagggaaata ttcagtacat tcggaattcc tatggttttg gttagcgaca atggcactca 3600 attttcgtca acggaattta ccaggttttt aaagctaaac ggggtgattc ataaatttag 3660 tgcaccttac catccagcga caaatggaca ggccgagcgt tttattcaaa cactaaaatc 3720 aaagctgaag gcagtaaagt gtaatcgtac agaaattcca gaagtgttga gcaatattct 3780 attgtcttac agaaaaataa ttcatcctag cactggtttt tcgccctctt tgttagtatt 3840 tggaaggcaa atacgcagcc gtattgacat aatgattcca acttcaaact cgaacaatag 3900 agaaatcata aaaactaaag attttgaaat aggacaaaga gtagctgcaa gggagtacat 3960 taaaaataac aaatgggaat tcggaaaagt aacagcaaga ttaggtaaac tgcattatga 4020 aatagagtta gatgatggac gaacttggag acgtcacatc gatcaaatgc gcgctgtggg 4080 tattagtatg caaaaaccca taaataggaa cacttcttgg caaggcagag agactacttc 4140 gcataatttg gaatcagaaa ctaataccat cccgaaaaat agcagtacac cagaactaat 4200 gacagattgt ggatctaggt caatagtaca agcgcaacca tccaaccctc caccaatgag 4260 ttttccatca gaggctccat gttcagcgga aataccaccg gaagcaaaac tgcggcgatc 4320 agcaaggagt atcaaaacac cacaaaggct actactgtaa ataactattt cggagggaag 4380 agc 4383 // ID Clu-13_AG repbase; DNA; ANG; 950 BP. XX AC . XX DT 03-SEP-2010 (Rel. 15.09, Created) DT 03-SEP-2010 (Rel. 15.09, Last updated, Version -1) XX DE Putative non-autonomous DNA transposon: consensus. XX KW DNA transposon; Transposable Element; nonautonomous; Clu-13_AG. 
XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-950 RA Fernandez Medina R.D., Struchiner C.J. and Ribeiro J.M.C.; RT "Transposable elements from Anopheles gambiae."; RL Repbase Reports 10(9), 1437-1437 (2010). XX DR [1] (Consensus) XX CC TA TSD. XX SQ Sequence 950 BP; 308 A; 162 C; 180 G; 300 T; 0 other; taggggaaag cggggcaaaa tgggcatgtg gggcaaaatg ggcaccctct atttaagcat 60 tatttgacta caaagatact ttccaatgct caaaatgtat tcattagagt gttctatgaa 120 caccatgtaa gtttcataac ccttacataa caaataacca agaaaaatgc aaaataaggt 180 ttagcagcat attgatgtaa tttttgccac ttcgaaaata agcttaaacc agtgtcacag 240 ggatgcgttg aagttttaca tgatatattt ttaaagatat gatatttcta tacaatttgt 300 ctgaagaaag caaggcgatt gaatgcgtat ttttactaat ataacaaaaa atacaaaaat 360 gcttcacgtg gggcaaaatg ggcagttacg cttggggcaa aatgggcaga tgcttttgac 420 atacggcgct tggtgaggct atggtgttgt atttgtcgcc tcaataatga taacaggcac 480 agcttcaaaa gattcgattt tgtagattta aacgtgtaaa aactgtttta attttgataa 540 catatttttg gaagaatttt aagggctttt ctgaccagca aaacaaaaga tagaacgaaa 600 aatacatttc acgaaacgtg cgcaaaagca tagaccatta cctaagcgca gttgttgtgg 660 cccatattgc cccagtgttt tgacgtttca cagtttgttt acatatgccc attttgcccc 720 agggctatgc ccattttacc ccgcgtgtca aaatagcttc tcgtaaaaca tcaactttta 780 ttttacatta ttttcatgtt tttaagttgt tttcattcta tgtggcaatt ttaatccata 840 gataaatggt ggaggcatac aataatgaaa gaaaaattgt gttttgtggc atattcaaag 900 gaatattcag taaactgctt aacatgccca ttttgccccg ctttccccta 950 // ID GYPSY70-I_AG repbase; DNA; ANG; 4858 BP. XX AC . XX DT 09-JUL-2004 (Rel. 9.06, Created) DT 09-JUL-2004 (Rel. 9.06, Last updated, Version 1) XX DE GYPSY70-I_AG is an internal portion of retrotransposon GYPSY70_AG DE - a consensus sequence. 
XX KW Gypsy; LTR Retrotransposon; Transposable Element; 4-bp TSD; gag; KW AP protease; CsRn1 lineage; GYPSY70-I_AG; GYPSY70-LTR_AG; KW Gypsy clade; RNase-H; integrase; GYPSY70_AG; KW reverse transcriptase. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-4858 RA Tubio M.J., Costas C.J. and Naveira F.H.; RT "GYPSY70_AG, a member of the CsRn1 lineage of the Ty3/gypsy group RT of LTR retrotransposons in Anopheles gambiae."; RL Repbase Reports 4(6), 179-179 (2004). XX DR [1] (Consensus) XX CC GYPSY70_AG is a family of Gypsy-like LTR retrotransposons that, CC according to the amino acid sequence of its reverse CC transcriptase, RNase and Integrase is phylogenetically grouped CC with representatives of the CsRn1 lineage of other organisms. CC GYPSY48_AG, GYPSY49_AG, GYPSY50_AG, GYPSY51_AG, GYPSY52_AG, CC and GYPSY53_AG are other members of this same lineage in CC Anopheles gambiae. The GYPSY70-I_AG consensus was CC reconstructed after multiple alignment of 4 copies. The CC consensus encodes the 315-aa GYPSY70_AG1p gag-like polyprotein CC (pos. 694-1638) and the 1064-aa GYPSY70_AG2p pol-like CC polyprotein (pos. 1642-4833). The sequence of the LTRs CC flanking GYPSY70-I_AG is deposited as GYPSY70-LTR_AG. 
XX FH Key Location/Qualifiers FT CDS 694..1638 FT /product="GYPSY70_AG1p" FT /translation="MMQHSPERQPAPQAATHDAPPVNTERTGTSQATNAMT FT AAETASDPRVEAAQAVRLSVPEMDLHNLPAYFCALEHWFAATGITAKMDHK FT RYHVLMAQIPLRVYNEIQPIIENVPATERYNYIKRNILQHFGESQRSRLHR FT LLYGMDLGDRKPSQLLAEMHRASSDTLASTLLTDLWINKLPPHVQSAVVAA FT PGSVTEKAAVADTMVECLSASNNASVHHAVAGVRTTPNDFEQRISRQVDEL FT TQQLNDFITECRNRDQRQSRPRPRPPVSSRVGTEPSEGECYYHRRYGTAAR FT TCRQPCSFPAPVSQRVGQPSSSA" FT CDS 1642..4833 FT /product="GYPSY70_AG2p" FT /translation="GYAGDIGSQVATISPVSSRLMVIDRRTNQRYLIDTGA FT DVSVLPKPANYTPPTPSTMRLFAANGTPIMVFGESLRTLDFSLRRPFVWNF FT IIADVSSAIIGADFLRHYHLLVDLRQRCLVDAQTNLRVPGLPDTTRQTAVK FT VCDANSPMADLLNEFPTLITNSPGVRMQSEVVHRIETTGPPTFARSRRLPP FT DKYQAAKAEFDSLVQLGICRPSSSSWASPLHMVKKADGSWRPCGDYRSLNA FT RTTPDRYPLPYLQDFTMQLGGKTVFSKVDLQKAYHQISIHPEDIPKTAIIT FT PFGLFEYVTMPFGLRNAAQTFQRLIHDVLRGLNFVFPYIDDIIVASKTPEE FT HREHLRLLFARLTQHGLTINLAKCEFAQPEISFLGHRVTSEGILPLEEKVD FT TIRQFPKPNTVMELKRFLAMINFYRRFIPHALRAQGPLLEMIPGNKRRDKS FT SLTWTPATDTAFEDCKRQLAQATMLVHPVPSAELSLWCDASDFAAGAALHQ FT VIDGQMQPLGFFSRKFDNAQRRYSTYDRELAAVYLAVRYFRHQLEGRSFHI FT YTDHKPLVYAFRQSLDKASPRQARHLDFIGQFTTDIRHVEGQENVTADLLS FT RIEPIQSSTSIDYEKLAEDQTRDPELADILSGKTRTDLVLQRVPIPGSSMA FT LYCDCPAGIIRPYVTKPFRQQLLRAVHQMSHPGAKTTTKLMTERFVWLNIR FT RDTRDFVRHCLACQATKVQRHTRSPLGRYPVPDARFAHINLDLVGPFPISN FT GHRYCLTIIDRFTRWPEAIPIPDITSTTVAAALLSGWIARFGVPSFITTDQ FT GRQFESTLFAELNQLLGIKHLRTTAYHPQANGLIERWHRTLKAAICTKNTA FT HWTDHLPIILLGLRTAYKDDIKASPAELVYGSTLKIPAEFFNSSPMTSLPD FT TTEFTKSLKYAMNTIRPTQTAWHDKATPFVHSDLRTCSHVFVRNDTVRPAL FT TPPYQGPYKVLRRSDKSFEVLINERATNISIDRLKPCYSLQQPEPVTQLTP FT PSAAEYTPPPPALPPPASPPSAPPPPASPPPTEPSPPNDFSTGVTRSQRRV FT IIPVRYR" XX SQ Sequence 4858 BP; 1140 A; 1534 C; 1152 G; 1032 T; 0 other; actggtgacc ccgacgtgat tcccgagatt ccccgatcga cagattcatc atcggcccgc 60 tttcgccttt ccacgctacg cggttcccgt tcgcccaggt ccgtcaacgc cgcgagtaaa 120 atcggtacgt gtagtgtgtg tatcaagcgt cgccgagtgt cgtgcctttt tttttcgcaa 180 ggacaaaaac cgtgttcgat cgtgtttccg cgcgtgcata aaagtgtcgc gagtgcccta 240 cacgcccgcg 
tgtgcataat cgtcgcgagt gccctacacg cccgtgtgtg cgtaatcgtc 300 gcgtgtgtcc ttcacgcccg tgtgtgcgta atcgtcgcgt gtgtccttca cgcccgtgtg 360 tgcgtaagtg tcgcgagtgt cccgtcccga gtgcattgtt ctccatacgc accgacgtgt 420 ttcccttgca tcgtgaaccc gcatcgctac agaggatcga gccgcgggtt acatcgccgt 480 ccccaaagag aaggaagagc aacgatccca tcgttccagc ccggacaaaa gaaagcgctc 540 cttgtaaagc cgttgactgc cgccattttg cacgttcgac gccattgaca gccgccattt 600 ttgtctcgct gaagtaagcc caactcccga ctaccacgtc acttcttttg ttctcgcttc 660 tcgtgaaacc acgtgctcgc aagcgaataa aagatgatgc aacatagtcc agaacgtcaa 720 cccgctcccc aagcggcaac gcacgatgca cctccggtga acacggagag aactggcacg 780 tcccaagcaa caaatgccat gacagcagca gaaacggcaa gtgatcctcg agtggaagca 840 gcacaggccg ttcgactgag tgtacccgaa atggacctgc acaatttgcc tgcgtacttt 900 tgtgcactgg agcattggtt cgccgcgacc ggaatcaccg ccaaaatgga ccataagcgc 960 taccacgtat tgatggccca aattcctctt cgagtgtata acgagatcca gccgataatc 1020 gagaatgtac ctgcaacgga acggtacaat tacatcaagc ggaacatcct ccagcatttc 1080 ggggaatccc agcgtagccg cctccatcgt ctcctatacg gcatggactt aggggaccgc 1140 aagccatccc agctgctagc tgaaatgcac cgagcctcca gcgacacgtt ggccagtacg 1200 ctcctcaccg acctttggat caacaagctt ccgccacatg tgcaatcggc cgtcgttgct 1260 gcacccggaa gtgtaactga gaaagctgcc gtcgccgata ctatggtgga gtgcctgtcc 1320 gcatccaaca acgccagtgt gcaccatgcc gtagccgggg tgcgcaccac acccaacgat 1380 ttcgagcaac gcatttcgcg ccaggtggac gaactcacac agcaacttaa cgacttcatc 1440 acagagtgtc gtaaccgtga ccaacgccaa agtcgaccgc gcccacgtcc accagtgtca 1500 tctcgtgtcg gtacagaacc atctgaagga gagtgctact accatcgacg ttacgggaca 1560 gccgcccgga cctgccgcca gccctgttcg tttccagccc ctgtcagcca gcgcgtcggc 1620 cagccgtcgt cttcagcctg aggatacgct ggagatatcg gatcgcaggt ggctaccatc 1680 tcacctgtga gcagtcggtt gatggtgatc gaccgacgaa ccaaccagcg ttacctgatc 1740 gatactggtg ccgatgtctc cgtgttaccg aagccagcca actacacccc cccgacgccg 1800 tccaccatgc gactgtttgc agccaacggt actccgatca tggtatttgg tgagtccctc 1860 cgtacacttg attttagttt acgccgcccc tttgtgtgga attttattat cgcggacgta 1920 agctctgcta tcattggagc ggactttctt 
cgccattacc atcttttagt agaccttcga 1980 cagcgctgtt tggtggatgc ccaaaccaac ctacgtgtgc caggacttcc ggataccact 2040 cgacaaaccg ccgtcaaggt gtgcgacgca aattcaccga tggccgacct gctgaacgag 2100 tttcctacac tcatcacaaa ctcaccgggt gtgcggatgc aatccgaggt ggtacatcgt 2160 atcgaaacaa cgggtcctcc aactttcgcc cgttcgcgtc gtcttccacc cgacaagtac 2220 caagcagcca aagcagaatt cgactcactc gtgcagctag gaatttgcag accgtccagt 2280 agcagttggg ccagcccgct gcatatggtt aaaaaggctg atggttcctg gcgtccgtgc 2340 ggtgattatc ggtccctcaa cgcgcgtacc acccctgacc ggtatccatt gccttatcta 2400 caagacttca cgatgcaact gggaggaaaa accgttttct caaaggtgga tctacagaaa 2460 gcctaccacc agatctccat ccatccagag gacattccga agacagcgat catcactcct 2520 ttcggcctat ttgagtacgt gacgatgccc ttcggattac gcaatgcggc gcagacgttc 2580 caacggctaa tccacgatgt gctacgcggc ttgaactttg tgtttcccta tatcgacgac 2640 attatagtgg cctcgaagac accagaagaa catcgcgaac acctgcgact gctatttgct 2700 cgcctcacgc agcacggttt gaccatcaac cttgcgaagt gcgagtttgc ccagcctgag 2760 atctccttcc tgggacaccg tgttacgtct gaaggcatcc taccactgga agagaaagtc 2820 gacaccatac gtcagttccc gaagccaaac actgtcatgg agcttaagcg atttttggcg 2880 atgatcaact tctacaggcg cttcatcccc cacgcgctgc gagcacaagg accgctcctg 2940 gagatgattc ccggcaacaa acggcgagat aagtcatcgc tgacgtggac cccagctacc 3000 gacacggcct ttgaggactg caaacggcaa cttgcacagg ctaccatgct ggtacaccct 3060 gtaccatccg ccgaactatc actgtggtgt gatgcttccg atttcgctgc cggcgccgcc 3120 ttacatcagg tcatcgacgg gcagatgcag ccgctcggtt tcttctcccg aaagttcgac 3180 aacgcacagc gccgttattc cacgtacgat cgcgaactag cagctgttta cttggccgtc 3240 cggtattttc gtcatcagtt ggaaggacgt tctttccata tttacacgga tcataaacct 3300 ctggtttatg ctttccgcca gtcgctggat aaggcctctc ctagacaagc acggcacctt 3360 gactttattg gccagttcac caccgacatc cgacatgttg aaggccagga aaacgtcact 3420 gccgacttgc tgtcccgaat cgaacccatc cagtcatcaa catccatcga ctacgagaaa 3480 ttggccgaag accagacccg agatcccgag ctcgccgaca tcctcagcgg gaagacaagg 3540 acggacctgg tccttcaacg agttccaata cctggcagca gcatggcttt gtactgcgat 3600 tgccccgccg gcatcattcg tccgtatgtc acgaaaccgt 
ttcgtcaaca actactacgc 3660 gcggttcatc agatgagtca tcccggagcg aaaactacca cgaagctgat gacggagaga 3720 ttcgtctggc tgaacatccg ccgagatacg cgagatttcg tccgccactg cttagcgtgt 3780 caggctacga aggtgcaacg ccatacccgc agccctctag gacgctaccc ggtaccggat 3840 gctaggttcg cacacatcaa cctggacctg gttggaccgt tcccgatcag caatggacat 3900 cgttactgcc tcaccatcat cgatcgcttc acacgctggc cagaagctat tcccatccct 3960 gacataacat caacgactgt agccgcagcc ctactcagtg gatggatagc tcgctttggc 4020 gttccatcgt tcatcaccac ggatcaaggc aggcagttcg agtcgacgct tttcgcggag 4080 ctgaaccagt tactgggcat caaacatcta cgcacaacgg cttatcatcc gcaggccaac 4140 ggcttgatag aacgatggca ccgtacgctg aaggcagcta tttgtaccaa aaacaccgct 4200 cattggaccg accaccttcc catcatactc cttggattgc gcacagcgta caaagacgac 4260 atcaaagctt cgccagcaga gctggtttat ggaagcacgc taaagatccc agcggagttt 4320 ttcaatagca gcccgatgac cagcctgcca gacaccaccg agttcaccaa gtcgcttaag 4380 tatgccatga acaccattcg cccaacacaa acggcctggc atgacaaagc aacacccttc 4440 gtccattcgg atcttcggac atgcagccac gtcttcgtcc ggaatgacac cgtccgccca 4500 gcccttaccc cgccgtacca agggccttac aaggtgctta ggcgtagtga caagtcattc 4560 gaagtcctga tcaacgagag ggcaactaac atctctattg accgcctgaa accgtgctac 4620 tcactccagc aacccgagcc agtgacacag cttacgccac catctgcagc agagtatacg 4680 ccaccaccac cagcgctacc accaccagct tcaccaccat cagcgccacc accaccagct 4740 tcaccaccac cgacagagcc gtcacctccg aacgacttct cgaccggagt tacgcgttca 4800 cagcgacgtg tcatcatccc cgttcgttac cggtaactcc agtctagcag gggagcac 4858 // ID RETRO5_AG_LTR repbase; DNA; ANG; 221 BP. XX AC . XX DT 06-FEB-2003 (Rel. 8.01, Created) DT 06-FEB-2003 (Rel. 8.01, Last updated, Version 1) XX DE Anopheles gambiae long terminal repeat from RETRO5_AG DE retrotransposon - a consensus. XX KW LTR Retrotransposon; Transposable Element; AGM1; KW Long terminal repeat; RETRO5_AG_I; RETRO5_AG_LTR; KW retrotransposon. 
XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-221 RA Jurka J. and Drazkiewicz A.; RT "RETRO5_AG_LTR: LTR retrotransposon from Anopheles gambiae."; RL Repbase Reports 2(12), 13-13 (2002). XX DR [1] (Consensus) XX CC Related to Moose LTR retrotransposon AGM1 from Anopheles gambiae. CC 5 bp target site duplication. XX SQ Sequence 221 BP; 67 A; 37 C; 50 G; 67 T; 0 other; tgttaggatt cattcataat ttaaattaac ggaatgagga actaccgttg tgcgggtttt 60 tgttagttgg gaacggatag gattgtgtcc gtcaggaacg acaggttcta gagaagggta 120 tataagaagt ctgaattttg tatcgtcctc tttttgcact gcaatttcgc tcaaataaaa 180 acacaacgta ggaaaacaga cccaatcttt tggtcccaac a 221 // ID R7Ag2 repbase; DNA; ANG; 6590 BP. XX AC AB090821; XX DT 14-SEP-2005 (Rel. 10.09, Created) DT 24-SEP-2010 (Rel. 15.1, Last updated, Version 2) XX DE Anopheles gambiae retrotransposon R7Ag2 DNA, complete sequence. XX KW R1; Non-LTR Retrotransposon; Transposable Element; R7Ag2. XX NM R7Ag2. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-6590 RA Kojima K.K. and Fujiwara H.; RT "Evolution of target specificity in R1 clade non-LTR RT retrotransposons."; RL Mol Biol Evol 20(3), 351-361 (2003). XX DR EMBL/GenBank/DDBJ; AB090821; Positions 1 6590. 
XX FH Key Location/Qualifiers FT CDS 1762..2610 FT /product="R7Ag2_1p" FT /translation="MLQQQQQQQRQPQRQAVVGTQQQQQRRQQQQHQQRSN FT ATQAQRREQLRNEQRRPARLRQDQIIFEPAEGTSYKVLYEKIRLNPRLQEE FT NKGVHQGYRTTRDFLRLELKKDTDAASLLQRIQQEVGDLAAGRIVTEMAEV FT LITGIDMLAKKEDVERGLQRALERTAVAATTSLWERRDGTQRARVRLPRRD FT TDLLLDKRIVVGHSVCLVRSAPKQQQSAVRCFRCLERGHTTADCAGEDRSS FT LCLHCGAADHRAASCTSDPKCIVCGGPHRIAAPMCKGPPSQC" FT CDS 2526..6110 FT /product="R7Ag2_2p" FT /translation="MHFGPEVHCLRRPTSHRRPHVQRTAITMLNILQFNAN FT HCENAQDLALHVITTESLDVLLLSEPYCVPRNNGNWVTDESKTVAIVVNGN FT RLPIQRIRHRQTLGVVAADVGGTTIVSCYVSPQTGVPEFRSIMEKIDLIVR FT GCSRVLLAGDFNAMHLDWGSSRTCPKGLELLQLADNLGLVLLNKADCLPTF FT KGNRADTNRFPPSRPDVTFASSVISRLDPRDDSARGWRVPDVATLSDHRYV FT QYEVGESSPPTRDRAARRGQRPARVSKAGTRWKTSQFDSQLFGKALAMTGF FT ARQVNSVESLVESLTSVCDETMSRVFPTQDHTGRPAYWWTPAIQAMIDNLS FT RKEQMTMRTIPPEEQLQTELLAARESLRKAIRLSKNEAFDRFLRSIREDVT FT GIFFRKVFHWFQGARSAPERDPAELRRIVDALFPVHPPVEWPDLGVGNMAP FT LRSIGLTELDQIAASMHPRKAPGLDGVPNAALTVALRQQPEPFRRVFQECL FT DMSCFPQPWKKQRLVLLPKPGKSPGEPSSFRPICLLDNTGKALERLLLNRL FT NEYIEDPESPQLSEQQFGFRRGRSTLQAIQQVVDAGRRALSLGRTNNRDRR FT CLMVVALDVRNAFNTASWQSIAEALQAKGVPVQLCRILQDYFADRELEYDT FT ADGPVTRRVSAGVTQGSILGPTLWNIMYDGVLAVELPEGASIVGFADDLAI FT LAAGTIPEHAAAIAEEAVAAVNNWMVQHKLSLAPEKTELLMISSKRSGYRN FT IPVNICGVEVRSKRSIRYLGVMLHDHLSWRPHVEMVADKALRVVRALRGIM FT RNHSGPQVSKRKLLAAVAASIIRYGAPVWTEATDLQWCRRILDRVQRLLAQ FT GITSAFHSTSCEVAVVLAGELPYHLLAKEDARCYNRQQSSPDSSREAIRQE FT EKETSLQLWQQQWDDVAANNTSRYLRWAHRQVPDVRLWTGRKHGEVDFYLS FT QVLSGHAFVHEFLHVFGFAPSPDCPRCAGSVESVAHVMFECPRFADVRAEF FT LQGVGEHNLGSRLLESAEWWDRIQQAARRILSVLQEDWREEQQTLAAAEAA FT QPDPASSLPEDMAEAERRLLRRREVRNRSAQRRRQQQRQQRLGDFELVPAV FT LARAAANEAAEPTGEVEEEEVSPPVPPIPPRSRRLPPSPRTTEMRRRRRNY FT MQLQYRRRRRDGELGDVPQGRQRRGRIPTSAAELER" XX SQ Sequence 6590 BP; 1616 A; 1777 C; 2009 G; 1188 T; 0 other; agttgcaact gagagttcaa accgaacaga cgtgcctcca cgcagttccg tgagcaagtg 60 aaaaaattgc aagtaaaatg caataaaacc gtatgaattg tgcatattcg tgtagatgaa 120 tgcaaagtga aacactgtgt gtgaaatttg gatcaaaaac agtggataga 
aacgttagaa 180 aagtgtttgt aatattctgt gacgtcacac cataagtttg tatggagaaa actttgtgcg 240 tcaaataaac aggttactac atatatttcc ggaatcaaaa gtgatcaggc gatagatctc 300 gccgggatct ttacagtgac ggttttcctg tggaaaaata ttaaatggaa agtggtaaaa 360 agtgcatcag aaagttattt aattcggaat ttatcgcgtg ggtaaaatgg cgtctgcccg 420 tgaaaatttt tattttcggg ttgggccagt gtgtcaaaag tgacaaggct tggtaaccgc 480 ctgacaccgt tcggccccgc cattttcgcc gtcgttgaag tgtcaaactg aattagtgaa 540 acattacaag aacaaattca aatggtccgt tgaaagtgca gtaaaagaag aaagttttcg 600 tgaaagtttt caaaaaaatt caaaaaaaaa ggcgttgcga gtgaaattgt ggtgaagtgt 660 gtgaaatatt tccgaccgtt aatttcaaaa ccaaccgttg gtgaactttt cggggccagt 720 tttgaaatta acggccaagt gcgcggaaag tttcacggcc aagacggaaa gggccgttta 780 tagtgtggtg tgtgagttga actatacacg gacatacgtt ccggccgaaa tatagtgagt 840 ggttgaagtg ttaaagcggt aggcacacgt gctgaaacaa atttgaaacg acggttggaa 900 gcagttcgac agcgttacat agcgagcgga tttgcagcgg tacgcccaag ttgcagcggt 960 acggcggagg tgatatgggc aacgacatct gttggcaata acaacagccg agacagaaac 1020 cagctccccg ttcggttggc taacgcggtg ccttctattg cccgcaagca gaaaaataaa 1080 ctgctgccac ctgcggtaga cgagtaaaat ggttgcaaaa ctctaaaata aatggatcgc 1140 aacttaaggc ctcggcagcc gtctgtcgat tgtagcttgg aagcgattcc aaagaagacc 1200 acagtttctg cgaaagtcga cactggacgc acgtccatga cggcgaagat gaacgaggag 1260 aaagaaatcc gcctgcacct catgctgcag gcggagaagg ccgaaaaggc ggaacttatg 1320 aagacggtgg caagtcttca agccacaata gagggccttc agaagcagct ggcggaggaa 1380 cagcaggcca ggctttccgc cgatgcggag ctaagaagga acgggatgca atgctggccg 1440 agatccgaga cctggggcag taaatgcgcc gggagctatc ctggaaactc ggccaacagc 1500 aacagttgca gctgcaatcg caacctggtc cgtcagggac agcggcggta aagctcccgg 1560 accaccaggc gccaacggca gaaggcattc ggcaagcgga gcagctcagc ttcgctgagg 1620 tcgtccgccg caaatttcgc ggcatggcta aggggaaact ccggcagccc cctcagcagc 1680 atcagcagca gcagcaggaa cagcagcagc tgcaacaaca gcagcagcag cggtacccac 1740 agcgacaggc cggtcgctgg catgctgcag caacaacagc agcagcagcg gcagccacag 1800 cgacaggcgg tcgtcggcac gcagcagcag caacagcgtc ggcagcagca acaacaccag 1860 
cagcggtcaa atgccacgca ggcgcagcgc cgagaacagc tgcggaacga acaacgacgt 1920 ccagcgcgcc ttcggcagga ccaaatcatc ttcgagccag ccgaaggtac atcctacaag 1980 gtgctgtacg agaagatacg cttgaacccg cgcctacagg aggaaaacaa gggagtccac 2040 cagggctacc gtacgactcg ggacttcctc cgactcgagt tgaagaagga tacagacgcc 2100 gcttcgttgt tgcagcggat ccaacaggaa gttggcgact tggccgctgg ccggatcgta 2160 acagagatgg ccgaagtctt gatcacgggt atcgacatgc tggccaagaa ggaggatgtg 2220 gaacgcggtc tgcagcgggc gctggagcgc acagcagttg cagcaactac atctttgtgg 2280 gagcgtcgcg acgggacgca gcgtgcccgt gtccgactgc cacggaggga cactgacctc 2340 ctactggata agcgcatcgt ggtcgggcat tcggtgtgcc tggtgcgcag tgccccgaaa 2400 cagcagcaaa gcgcggttcg ctgtttccgc tgcttggagc gcggccatac cacagcagac 2460 tgcgctggcg aggatcgatc cagcttgtgc ctgcactgcg gagccgcgga tcatcgcgcg 2520 gcgtcatgca cttcggaccc gaagtgcatt gtctgcggcg gcccacatcg catcgccgcc 2580 cccatgtgca aaggaccgcc atcacaatgt tgaacatcct gcagttcaac gcaaatcact 2640 gtgagaacgc ccaggacctg gcgttgcacg tgataaccac cgagagtctc gacgttctgc 2700 tcttgtccga gccctattgc gtaccgcgca ataacggcaa ttgggtgacg gacgagagta 2760 agacggttgc catcgtcgtc aacggcaacc ggctgccgat acagcgtatc aggcaccgtc 2820 agaccctcgg tgtggttgct gccgacgtgg gaggaacaac aatcgtgagc tgctacgtgt 2880 cgccgcagac gggagtcccg gagttccgga gcattatgga gaagattgac ttgatcgtcc 2940 gcgggtgcag ccgggttctc ctggccggcg atttcaacgc catgcacctt gactggggaa 3000 gcagcaggac ttgcccgaag ggcttggagc ttctccagtt ggcggacaac ctcgggctgg 3060 tgctcctgaa caaggcggac tgcctaccta cgttcaaggg gaaccgagcg gacacaaatc 3120 ggttcccgcc cagccgaccg gacgttacat tcgccagcag tgtgatcagc cgcctcgacc 3180 cgcgcgatga cagcgcccga ggctggcgcg tgcctgacgt ggcaacgctg agtgatcacc 3240 gatacgtcca gtacgaggtt ggcgagagtt caccaccaac gagggatcgg gcagcacggc 3300 gtggacagcg cccggcccgt gtaagcaaag ccggtacgcg ctggaagacc agtcagtttg 3360 actcccagct tttcgggaaa gcgctggcta tgactgggtt cgcccgtcaa gttaacagcg 3420 tcgaaagctt ggtcgagtcg ctgaccagcg tctgcgacga gacgatgtcg cgggtcttcc 3480 caacgcagga ccacacaggt cggccagctt actggtggac tccggcgatc caggcaatga 3540 tagacaacct 
ctccagaaag gagcagatga cgatgaggac aatcccgccg gaagagcagc 3600 tccaaacgga actcttagct gccagagaga gccttcggaa ggctattcgt ttgagtaaga 3660 acgaggcgtt cgaccggttc ttgcggtcga tcagggagga cgtaactgga atttttttcc 3720 ggaaagtctt ccactggttc cagggagccc gctcagcccc ggaacgtgat ccagcagaac 3780 tgcggcgaat tgtcgacgcc ttgttcccgg ttcatccgcc ggtcgagtgg cctgatctcg 3840 gagttggcaa catggcgccg cttcgatcga tcggcctgac cgagctggac caaatagcag 3900 ccagcatgca cccgcggaag gctcctgggc ttgatggagt gccgaacgcc gctcttacag 3960 tcgctctcag gcagcaaccg gagcccttcc ggcgggtgtt tcaggagtgc ctggacatgt 4020 cctgctttcc acagccgtgg aaaaagcagc gactggtgct cctgccaaag ccaggcaaat 4080 caccgggcga gccgtcgtcc ttccgcccga tttgcctgct ggacaacacg ggcaaggctt 4140 tggagcggct gctattaaac cggctcaacg agtacatcga agatcctgag agtccgcaac 4200 tgtctgagca gcagttcgga ttccggcgag ggcgatcgac tctgcaggcc atccagcagg 4260 tcgtggacgc gggacggagg gcgttgtcct taggcagaac caacaaccgt gaccggcgct 4320 gcctcatggt tgttgccttg gacgtccgca acgcgttcaa tacggccagc tggcagagca 4380 tcgccgaggc gctccaggcg aagggtgtcc cagtgcagtt gtgccgcata ctgcaggact 4440 acttcgccga cagggagctc gagtacgaca cggcggacgg gccggttacc cgccgagtat 4500 cggcaggtgt tacacagggg tccatattgg gccccacact gtggaacatc atgtacgacg 4560 gcgtgctagc cgtagagctc cctgagggcg cctccatcgt gggattcgcc gacgatcttg 4620 cgattctggc agcgggaaca atcccagaac acgccgctgc aatagcggag gaagcagtag 4680 cagcggtcaa caactggatg gtgcagcata aactttctct ggcgccggag aagacggagc 4740 tgctgatgat ctccagtaag cgcagcggat atcgtaacat cccggttaac atctgcgggg 4800 tggaagtgcg ctcgaagcga tcgatccgtt acttgggggt catgctacac gaccacctat 4860 cgtggcgccc acacgtcgag atggtcgcgg acaaggccct ccgtgtggtg cgagcattgc 4920 gcggtatcat gcgcaaccac agcggccccc aagtgagcaa gcggaagctg ctcgccgcag 4980 tggccgcatc cattatccgc tacggcgcac ccgtctggac ggaagccacg gacttgcagt 5040 ggtgcaggcg gatattggat cgcgtgcagc gcctcctggc ccagggcatc acgagcgcct 5100 tccactccac gagctgcgag gtagcggtag ttttagccgg agaactgccc taccacctcc 5160 tggcgaagga agacgcccgc tgctacaatc gacagcagtc cagcccggac agcagtcgag 5220 aggcgattcg ccaggaggag 
aaggagacat cgctgcagct gtggcagcaa cagtgggacg 5280 acgtggcggc aaacaacacc agccgctact tacgttgggc ccaccgacaa gtgccagacg 5340 tgcgcctgtg gacgggacgg aagcacggag aggtagattt ctacctctcg caggtactta 5400 gtggacacgc cttcgtccac gagttcctgc acgtgttcgg gttcgctccg tccccggatt 5460 gtcccaggtg cgcagggtcg gtcgagtcgg tggcccacgt aatgttcgag tgtccacgtt 5520 tcgcggatgt ccgggcggag ttcctgcaag gcgtcggcga acacaacctc ggcagccgcc 5580 tgttggagag tgcggagtgg tgggaccgca tccagcaagc ggctcggcgg atcctctccg 5640 ttctgcagga ggactggcgg gaggagcagc aaaccttggc agcagctgag gctgctcagc 5700 cagatcctgc atccagcctg cccgaggaca tggcagaggc agaacggaga ctgctgcgac 5760 gtcgcgaggt gcgcaaccgc agcgcccagc ggaggcggca acagcaacgg cagcagcgac 5820 tcggcgactt tgagctggta ccagccgtgt tggcacgcgc agcggccaac gaagcagcgg 5880 agccgacagg ggaggtcgag gaggaggagg tcagccctcc agtaccacca atccctccgc 5940 gcagccggcg attgccaccc tccccacgaa caacggagat gcggcgacgt cggcgcaact 6000 atatgcagct gcagtatcgg aggaggcgtc gggacggaga gctcggagat gtcccgcagg 6060 gtcgccagcg acgtgggcga ataccaacat ccgcagcgga gctggagcga tgacggctca 6120 acaggcgaca gagggagaga caacggcggc tggagcaacg gcaagcggag gtcgaggctc 6180 agcctccccc gtggcgagct gccgaatgag ctgccgagct ctcagctggc cgaaagaaga 6240 agcagcctca cagacgtcga gagagcggcc ataacatcgg caaatacttc cgcacgttga 6300 ggaatattgg aatatgattt ggagacacct gctaggaaac ggaaaacgtt aaaggcttac 6360 ggaacgacat ttttgttttg cgaaaaagga tatcttcctc tgatcatttg gggaagtaca 6420 aaaactaagt gaactaaaat actatagttt aaataaagag aaaggcccaa atgagcgaaa 6480 ttcccggcgg ggtggagtcc attagtggta accccccgct ggtaggagcg ggttttcttg 6540 gaagcagtta gtaacaccaa taaagataac ccaatgaatt aaaaaaaaaa 6590 // ID CR1-4_AG repbase; DNA; ANG; 4411 BP. XX AC . XX DT 12-MAR-2003 (Rel. 8.02, Created) DT 19-MAY-2005 (Rel. 10.06, Last updated, Version 2) XX DE CR1-4_AG is a CR1-like non-LTR retrotransposon - a consensus DE sequence. XX KW CR1; Non-LTR Retrotransposon; Transposable Element; KW AP endonuclease; CR1 clade; CR1-4_AG; DNA/RNA-binding; PHD finger; KW reverse transcriptase. XX NM CR1-4_AG. 
XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-4411 RA Kapitonov V.V. and Jurka J.; RT "CR1-4_AG, a subfamily of CR1-like non-LTR retrotransposons from RT African malaria mosquito."; RL Repbase Reports 3(2), 15-15 (2003). XX DR [1] (Consensus) XX CC CR1-4_AG is a young family of CR1-like non-LTR retrotransposons. CC The CR1-4_AG consensus sequence was reconstructed based on CC multiple alignment of ~50 copies identified in the CC sequenced portion of the genome. Given the ~1% divergence CC of these copies from the consensus sequence, transposition of CC CR1-4_AG occurred less than 1 million years ago. CC The 3' terminus of CR1-4_AG is composed of the TAAA CC microsatellite. CC CR1-4_AG encodes two proteins: a 349-aa CR1-4_AG-ORF1p CC (positions 363-1409) and a 965-aa CR1-4_AG-ORF2p (positions CC 1430-4324). CR1-4_AG-ORF1p is a DNA/RNA-binding protein composed CC of the PHD domain (positions 3-40). CR1-4_AG-ORF2p is composed of CC the AP endonuclease and reverse transcriptase domains. 
XX FH Key Location/Qualifiers FT CDS 1430..4324 FT /product="CR1-4_AG-ORF2p" FT /translation="MEPVLNTFNIFYQNVRGLRTKTSECFANTAIADWDVI FT VLTETWLDDSFPSELLFDNNRFNTFRTDRSAANSNKCRGGGVLVAINANYA FT SSLCSTNTSTIECLWVRVKVLNVSLIIGSFYLPPDQSANMDTINAFCNSLH FT LTREKYKNDFFILFGDFNQPNLKWDINGKFPTLNLMLTRLSPTSQALLDEL FT SFEGLRQLNTVLNHNNNMLDLVFANDKVTDYMRPIELCIESIVEPDGHHPA FT LLTYFTLPQYSVPSSKPPRQADFNFRRTNFTDLVSALNQINWDSIADHDDI FT NDSVAEFSSQMNELYEQFIPRFNVRAHPPWTNSALRLAKRRRSRALKKLHR FT LKNSTNQINFARASKIYKQLNRTAYANYVRKIEINIKRHPTSFWKFAKDKE FT SCGRLPSSMQFEGNTITGDEEFCNAFASYFSSVYTNNSSVPSNSTSALSFI FT NDEVNLCTPLINDDEVESAISLLKLSYAPGPDNIPSAILINCKAALIPILT FT KLFNKSLQSKCFPRLWKSSWMFPVYKKSDKSNVCNYRGISMLCACSKLFEK FT IMSRHMLQAFSPLISNVQHGFMPKRSIETNLIYLLNFCHSYIDKGLQVDVI FT YTDFCAAFDKVNHFLLLSKLSKYGVHTNVVEWLRSYLTDRCINVKIGTSLS FT ATFHNLSGVPQGSILGPLLFIIFINDVVFAIPHVKLLLYADDLKMFLPVKY FT SDDCEMLQDSLNYFSAWCFNNEMLLNVSKCSCITFSKKKNPIIYNYKINED FT SVPRFSQVRDLGVILDSKLSLSSHYQTIVTKALKLLGFVLRVSADFKDPFS FT LKTLYCSLVRPILEFASVVWCPHQITYIDKIEKIQKKITRVMFHRLPWSNQ FT IPRPSYNVRCLLFGLETLQHRRTTAQITFMHKLLIGDFDAPDILNFICFST FT PSRGLRSRELLRSPFRSTGFGANDPLLKMIDVYNRLGLSADFNQSVSQLRQ FT HIQVSSRAII" FT CDS 363..1409 FT /product="CR1-4_AG-ORF1p" FT /translation="MDCAICSTTINKDPVVCIGNLPFSECNSAFHPECIKL FT AATCVKEVARNRGLCWMCEKCRDSRSDLFSSISCLMNTLKDELKNAIRSEL FT DQRISQLDPNRVMPQEREKIAPTVSTISLTDKTFHTSTDTPMSPTPTPVKQ FT NSLPHSQSRMLSDNEHINPTQAILHTGTANDHINTDTIQFIPAPEPKVWMF FT VTRIAPTVTEENMKMFILGRLKCTDCSVKCVIPRGRVTSSLKYVSFKIGIP FT SEFGELAFSPSTWPCGFVYRQFEFHQRTQKQFTPTLPVSCFPASNNSTARS FT FSTTNFMHNDVNCINVIPPTHTTQHSPSPSHHLKNANSPETHLTQNNSSGS FT TFLSQH" XX SQ Sequence 4411 BP; 1287 A; 967 C; 779 G; 1377 T; 1 other; cggggagaga gttggccact gaagcgtttg gacgtgtttc tgcgttgctc ctgtgtactg 60 tctgtttttg gtgtgatttt ggtttgaaat cgttcgtttt tgagtttgtg aagctgttcc 120 tgtcgcaagt tttgctgtgt tgctgtgttg aagcattttg ttcctgcatc gtacgagaat 180 ttctacgaga acgcgtgcct gaatttgtat tttctgcgtg atttgtgttt catcggcgac 240 agcggttcag tgttattgac cctgcttaat acactcagag agtaattcgc accagctcat 300 
caataagtgt cctctccgtt cactctcact gctcacctaa gtgatcgata ctctctctca 360 taatggactg tgcaatctgc tctactacta tcaacaaaga tccggttgtt tgtattggca 420 atcttccttt ttcggagtgc aattctgcct ttcatccgga atgcattaaa cttgccgcca 480 cttgtgttaa ggaggtggca cgcaatcgtg gtctctgctg gatgtgcgag aaatgccgtg 540 actcgagatc agatttattc tcatcaattt cgtgcttgat gaatacactg aaagatgagt 600 tgaaaaatgc aatacggagt gagcttgatc aaaggatatc tcagctggat ccaaatagag 660 tgatgccgca agagcgtgag aaaatcgctc cgactgtgag taccatctct ctcaccgata 720 aaacattcca cacatccacc gatactccta tgtcaccaac accaactccc gtcaaacaga 780 actcactacc acactctcaa tcgcgtatgc tctctgataa tgaacacata aatccaacac 840 aagcaattct tcacaccggc actgccaatg atcacatcaa cacagacacc atacaattca 900 ttcctgcacc tgagccaaaa gtttggatgt ttgtaactag gattgcaccc actgtaactg 960 aggagaacat gaaaatgttc attctaggaa ggttaaaatg cactgactgt tcggtgaagt 1020 gtgtcatacc aagaggtcgc gttacaagct cactaaagta tgtgtccttt aagatcggca 1080 ttccatcgga gtttggcgag ctcgctttct ctccttcaac ttggccatgc ggatttgttt 1140 accgccagtt tgaatttcac caacgtacac agaaacaatt cacaccaaca ctcccggtat 1200 cttgctttcc tgcttcaaac aattcaactg ccagaagttt ctcaactact aattttatgc 1260 ataatgatgt caactgcata aatgtgatcc caccaacaca cacaacacaa catagtccat 1320 caccgagcca tcatcttaaa aatgcaaact cacctgaaac tcacctgact caaaayaatt 1380 cctccggttc gactttttta agccaacatt aagtcattca actgtaccta tggagccagt 1440 actaaacaca ttcaatatat tttatcaaaa cgtgagaggc ttacgtacta aaacctctga 1500 atgttttgct aatactgcaa tcgccgactg ggatgttatc gttcttacgg aaacgtggct 1560 agatgacagc tttccatctg agcttttgtt tgataacaac cgctttaata cattccgtac 1620 ggatcgttct gcagcaaaca gtaacaaatg tagaggtggt ggtgtcctcg ttgcgatcaa 1680 tgcaaattat gcttcctctc tgtgctcgac aaacacatct acaattgaat gcctttgggt 1740 tcgagtaaaa gttcttaatg tttcgctaat catcggatca ttttatttgc ctcctgatca 1800 atcagccaac atggacacca taaatgcatt ctgtaattca ttgcacttaa cgagagagaa 1860 atataagaat gactttttta ttctgttcgg ggacttcaat caacctaatc ttaagtggga 1920 tatcaacggc aaatttccga cactgaatct tatgcttacc cgactatccc ctaccagtca 1980 agctctactc gatgaactaa 
gctttgaagg tcttcgccaa cttaacactg ttctaaacca 2040 caataacaac atgttagatc tagtgtttgc gaacgacaaa gttactgatt acatgaggcc 2100 aatcgaatta tgcatcgaaa gtattgttga acctgatgga catcatccag ctttacttac 2160 atacttcacg ctcccgcaat acagtgtgcc gtcatctaaa cccccacgtc aagcagattt 2220 caattttaga cgaactaatt tcactgatct cgtttctgca cttaaccaaa tcaactggga 2280 ctctatcgct gaccatgacg acatcaacga cagtgtggct gaattttctt ctcaaatgaa 2340 tgaactgtat gaacaattca tcccacggtt taatgttcgt gctcatccac catggaccaa 2400 ttctgcacta cgtttggcta aacgtcgaag atcacgggcc cttaaaaaac tgcatcgact 2460 aaaaaacagc actaatcaaa taaattttgc tcgcgcctca aaaatatata agcagctgaa 2520 ccgaaccgcc tacgccaact acgtcaggaa aattgaaatc aatatcaaaa gacatcccac 2580 atcattctgg aaatttgcta aagataagga atcctgtgga cgacttcctt cctcgatgca 2640 gtttgaggga aacaccatta ccggagatga agagttttgc aatgcatttg cctcatattt 2700 ctcttctgtg tacaccaaca attcctcggt accgtctaac tcaacatcgg ctttatcctt 2760 cataaatgat gaggttaatt tatgcacacc actaatcaac gatgatgagg tggaatctgc 2820 tatttcgttg ttgaagcttt cctacgcacc tgggcctgac aatattccca gcgcaattct 2880 cattaactgc aaggctgctc tcattcccat actgaccaaa ctattcaaca aatccttgca 2940 atcgaaatgc ttcccgcgtc tttggaaatc atcgtggatg tttcctgttt ataaaaaatc 3000 tgacaaaagt aatgtgtgca attatagagg aatttcaatg ttatgtgcgt gcagtaaact 3060 gtttgaaaag ataatgtctc gtcatatgct ccaagccttt tcaccattaa tttctaatgt 3120 tcaacatggc tttatgccga aacgatcgat tgagaccaat ctaatatact tacttaactt 3180 ttgccactcc tatattgaca agggcttaca ggttgacgtc atttatactg acttctgcgc 3240 tgcttttgac aaggtcaacc actttctatt gctatctaaa ttatcaaagt atggtgttca 3300 cacaaatgtt gttgagtggt taagaagtta tttaactgat cgttgcatta acgttaagat 3360 aggaactagt ttatctgcca cattccataa tttatctggt gttcctcaag ggagtatatt 3420 aggaccttta ctatttatta tatttattaa cgatgtcgtg tttgctattc ctcatgttaa 3480 attgttatta tatgctgacg atctaaaaat gttcctacct gttaaatatt ctgacgactg 3540 tgaaatgtta caagactcgt taaactattt ttctgcatgg tgttttaata atgaaatgtt 3600 acttaatgtt agcaaatgta gttgtataac attctcaaag aaaaaaaatc ctattattta 3660 taactacaaa atcaatgaag acagcgttcc 
acgtttttct caggttcggg acttaggagt 3720 tattttagac agtaagctta gtttatctag ccattatcaa actatcgtca ctaaagctct 3780 taaactgtta ggatttgtct tacgtgtctc ggccgacttt aaagatcctt tcagtttaaa 3840 aacattatac tgttctttag tccgtcccat cctggaattt gccagtgtag tttggtgtcc 3900 ccatcaaatc acatacatag ataaaattga aaaaattcag aaaaaaatca cacgtgttat 3960 gtttcatcgt cttccctggt caaaccaaat tccacgtcct tcgtataatg ttcggtgttt 4020 actatttggc ttagagactt tacagcatag gagaactacg gctcagataa cttttatgca 4080 taaattatta attggagatt ttgatgcacc tgacatttta aattttattt gcttttctac 4140 tccctctaga ggtcttagaa gtagagagct actaagaagc ccttttagat cgactggttt 4200 tggtgccaat gatccactgc tcaaaatgat tgatgtgtat aataggttag ggttatcagc 4260 agatttcaat caatccgtta gtcagctgcg tcagcatatt caagttagtt ctagggccat 4320 aatttaaact tgtactctgt aagcattaag ttatttaggc atactgcccg ataactttga 4380 tttaaataaa taaataaata aataaataaa t 4411 // ID GYPSY40-LTR_AG repbase; DNA; ANG; 167 BP. XX AC . XX DT 16-APR-2004 (Rel. 9.03, Created) DT 16-APR-2004 (Rel. 9.03, Last updated, Version 1) XX DE GYPSY40-LTR_AG is an LTR of retrotransposon GYPSY40_AG - a DE consensus sequence. XX KW Gypsy; LTR Retrotransposon; Transposable Element; KW 4-bp TSD GYPSY40_AG; GYPSY lineage; GYPSY40-I_AG; GYPSY40-LTR_AG; KW Gypsy clade. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-167 RA Tubio M.J., Costas C.J. and Naveira F.H.; RT "GYPSY40_AG, a member of the Gypsy lineage of the Ty3/gypsy group RT of LTR retrotransposons in Anopheles gambiae."; RL Repbase Reports 4(3), 73-73 (2004). XX DR [1] (Consensus) XX CC GYPSY40-LTR is a long terminal repeat of GYPSY40_AG (its internal CC portion is deposited as GYPSY40-I_AG). 
XX SQ Sequence 167 BP; 53 A; 43 C; 34 G; 37 T; 0 other; agttatatgc gtggaacaca aatacaggtg cgacatcact gcaacagcct cgcacccatg 60 ccgatacgat aggagccgat tccgtcgcgt cagatcggta tcagcatcca agcgataagc 120 acgcaataat cattttaaat aaaagctgtg gttccatact cataact 167 // ID MSAT2_AG repbase; DNA; ANG; 106 BP. XX AC . XX DT 28-FEB-2009 (Rel. 14.02, Created) DT 01-MAR-2009 (Rel. 14.02, Last updated, Version 1) XX DE Mini-satellite type DNA - a consensus sequence. XX KW MSAT; Satellite; Simple Repeat; Nonautonomous; MSAT2_AG. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-106 RA Jurka J.; RT "Minisatellites from African malaria mosquito."; RL Repbase Reports 9(2), 638-638 (2009). XX DR [1] (Consensus) XX SQ Sequence 106 BP; 42 A; 16 C; 12 G; 36 T; 0 other; tttaaagatt catgaatctc taaagattca cgaatcttca aagattcatg aatctttaaa 60 gattcacgaa tcttcaaaga ttcattaatc tgtaaagatt catgaa 106 // ID HATN4_AG repbase; DNA; ANG; 762 BP. XX AC . XX DT 03-APR-2003 (Rel. 8.03, Created) DT 03-APR-2003 (Rel. 8.03, Last updated, Version 1) XX DE HATN4_AG is a hAT-like nonautonomous DNA transposon - a consensus DE sequence. XX KW hAT; DNA transposon; Transposable Element; Nonautonomous; KW 8-bp TSD; HAT1_AG; HATN4_AG; nonautonomous DNA transposon; KW hAT superfamily. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-762 RA Kapitonov V.V., Pavlicek A. and Jurka J.; RT "HATN4_AG: a family of nonautonomous hAT-like DNA transposons RT from African malaria mosquito."; RL Repbase Reports 3(3), 59-59 (2003). XX DR [1] (Consensus) XX CC HATN4_AG is a family of nonautonomous DNA transposons that CC belongs CC to the hAT superfamily. HATN4_AG copies are ~5% divergent from CC the CC consensus sequence. 
HATN4_AG has 15-bp terminal inverted repeats. CC HATN4_AG has imperfect 11-bp terminal inverted repeats (3 CC mismatches). CC The nonautonomous HATN4_AG elements are related to the HAT1_AG CC autonomous transposon. These elements share the ~85% identical CC 450-bp 3' and 120-bp 5' termini, respectively. CC The genome harbors ~100 HATN4_AG elements. XX SQ Sequence 762 BP; 265 A; 178 C; 125 G; 192 T; 2 other; tagagttgtg cctcaagtac cagaactgta cgattcattc aattcatcct aaagaacgaa 60 cgatctacgt tcatctaatc aacgttcatc gattctttat aaatcataat gcactgagtt 120 ggtcaccaaa ataacgtcct aaataacgcc cgccaaaata acgaacgtta tagtacgcca 180 ggtcgttcat tttccccata caaatatgaa cgtgaaccgt atgaccagga cgttcacgaa 240 aagaacgacc tatccgtcga aataaagaac gtcctagtac gctaggtcgt tcattcccgt 300 catacaaata tgaacgtgaa ccgtaatacc aggaccatgt aatgaacgtc ctacacgtcg 360 aaatgaagaa cgtcctagta caccaggtcg ttcattttcc caatacaaat atgaacgtga 420 accgtatgac cmggacattc ackaaaagaa cgacctatcc gtcgaaataa agaacgtcct 480 agtacgctag gtcgttcatt cccgccatac aaatatgaat gtgaaccgta ataccagtac 540 gttcatgaaa agaacgtcct accaatcgta ataaggatct tcctggtctg cctaataatt 600 ttcaagatgc aaaacaattt acaatttgca atacaataaa acaatttacc atattttctt 660 caactgaaat gaacgaatga atcgcgattc acgttcagtt catctgaatg aaagaactag 720 tacgttcacg ttcatggaaa tgaaccgaat tgcccaagcc ta 762 // ID GYPSY68-I_AG repbase; DNA; ANG; 5536 BP. XX AC . XX DT 09-JUL-2004 (Rel. 9.06, Created) DT 09-JUL-2004 (Rel. 9.06, Last updated, Version 1) XX DE GYPSY68-I_AG is an internal portion of retrotransposon GYPSY68_AG DE - a consensus sequence. XX KW Gypsy; LTR Retrotransposon; Transposable Element; 5-bp TSD gag; KW AP protease; GYPSY68-I_AG; GYPSY68-LTR_AG; Gypsy clade; RNase-H; KW integrase GYPSY68_AG; mag lineage; reverse transcriptase. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-5536 RA Tubio M.J., Costas C.J. 
and Naveira F.H.; RT "GYPSY68_AG, a member of the mag lineage of the Ty3/gypsy group RT of LTR retrotransposons in Anopheles gambiae."; RL Repbase Reports 4(6), 175-175 (2004). XX DR [1] (Consensus) XX CC GYPSY68_AG is a family of Gypsy-like LTR retrotransposons that, CC according to the amino acid sequence of its reverse CC transcriptase, RNase and integrase domains, is phylogenetically CC grouped with representatives of the mag lineage of other CC organisms. GYPSY19_AG, GYPSY20_AG, GYPSY21_AG, GYPSY22_AG, CC GYPSY23_AG, GYPSY24_AG, GYPSY25_AG, GYPSY26_AG, GYPSY27_AG, CC GYPSY28_AG, GYPSY55_AG, GYPSY56_AG, GYPSY57_AG, GYPSY58_AG, CC GYPSY59_AG, GYPSY60_AG, GYPSY61_AG, GYPSY62_AG, GYPSY63_AG, CC GYPSY64_AG, GYPSY65_AG, GYPSY66_AG, GYPSY67_AG and GYPSY69_AG CC are other members of this same lineage in Anopheles gambiae. CC The GYPSY68-I_AG consensus was reconstructed after multiple CC alignment of 4 copies. The consensus encodes the 345-aa CC GYPSY68_AG1p gag-like polyprotein (pos. 958-1992) and the CC 1024-aa GYPSY68_AG2p pol-like polyprotein (pos. 2430-5501). CC The sequence of the LTRs flanking GYPSY68-I_AG is deposited as CC GYPSY68-LTR_AG. 
XX FH Key Location/Qualifiers FT CDS 958..1992 FT /product="GYPSY68_AG1p" FT /translation="MAGWTVENNLECALEVRSESKEKDYKGDFLAGSWPLK FT PFSEELPIHERKAEWARFKGQYERIVMCKGKVNSEMKLIALKVFAGSYLLN FT IIEMQERRVKSGEGDVYTETVKGVDDFFNGICDEGKERMKLREMKMEADET FT FGDWVLRLEAQVKFCGLPEEQKTEELVQAVLRRSVPQIAEKLFEMADLFGN FT ELDKLVKHGKHLDFVRMEKREIEQKLAKSTTKEFTDSDNYVAAVTSWRTKR FT ERYEPYEQQQNKRLSGMHSNGFRNTYNKGATKCSKCDKIHGWGKCGASRSK FT CFRCGAIGHLAICCRTRLGSNRFPNVVKKQTEDINQVVIGDPAAERYQKKK FT NK" FT CDS 2430..5501 FT /product="GYPSY68_AG2p" FT /translation="MIGNVPIIFLLDTGASVNTVTAESWWQIRTHAQSSVR FT EWQPHPETTLRSYANSAVLDVECSFKAIICAGGTNRTMAKFFVVKGAEISL FT LSFNTAVSLGLVCIGPGKHYMLRNKTQETTVKDTFPKLNIGAVKFRIDKTV FT TPKQIIRYNIPKAFERAVNERLEEMELNGIIEYVDSEDAEISFVSPMVLVP FT KGNKDFRIVIDYREANKAIIRDPYPMPSLERIWTDIPNNDGKILFSKIDLK FT DAYFHIELHSEVRHLTAFMTMNGLMRFKRLPFGLSCAPEIFQKEMEKVFRN FT CKNVLIYLDDILMYGKSLKELKEIENMVREVIKKNGLTINEEKSCYGQEKV FT NFLGLTLDGEGILPMREKMTAIQNFDRPKDAGELRSFLGMLTFIGPFIRDF FT SHKTKPLRDLIKRDKFKWEEMHQNTFEELKKVAMEDLIKRGYFNEKDRIIL FT YTDASPWGLGAILAQEKLSTSEMRIIACASKGLTETESRYPQLHREALAIV FT WAMERFAYYLLGRKFLLRSDSEALKFMIKPAKQKDIGKRIMSRAEGWFIRL FT DHYDFDFEHVPGNENIADTASRLCRSKEDYQFGVAKEPHELCLVTACVNQV FT NETLLALTTEEVGKELEKDVILNKVIAWLDKENEWPQEIARYKAFQRELYV FT ENGFLFKQEKMVLPHILRNRALTLAHRSHPGMSTMKNVLRTGLWWPGMDKE FT IEAFVRSCPECQLVKTTNTAVPIELTELPENPWDYVSMDISSTTDNVKTLV FT LTDNYSRFLIAVPLERTDANSIQKALNRIFLTYYVPKKLKADNGPPFNAVD FT FKRWLADVWGIKLINSTPLNPTENGLVERGMQGINKIAAIARLEKKCWKQA FT LAEYVADYNSWPHHVTKIAPAELMFGRAIRHRLPNPKTDTRQANDDELRDR FT DKMAKFQRNQREDTRRGARPLSIKPGDTVLVRNQKADKTDSSYKKELHEVL FT KICGAGRVTVKEKKSGKIYDRNVKHIKKFFMRKAEEGLTSGSSSSEEVAGL FT VKERQARLIKKPKRLIEY" XX SQ Sequence 5536 BP; 1925 A; 859 C; 1354 G; 1398 T; 0 other; ttggcgatcc tgccatgatg aataaaagcg ctcctaggtt gaagaaaata aggagtgata 60 aaaaaaaaaa ggagaggaag gataaaaaca ggcgatataa aggataaacc gagtggtagc 120 ctggtgcatc tcatggcccc tccaaaaagg tgtaagcccg agtggtagcc aagttcagct 180 catggcccct cgaagaatgt gtaagccaga gtggtagccg agtttagctc atggcccctc 240 gaagaaaatg tatacgcgag 
tggtagccga gttcagctca tggcccctcg aagaatgtgt 300 aagccagagt ggtagccgag tttagctcat ggcccctcca agaaagtgta agccagagtg 360 gtagccgagt ttagctcatg gcccctcgaa gaaaatgtat acgggggtgg tagccgagtt 420 tagctcatgg cccctcgaag aaagtgtaag ccagagtggt agccgagttt agctcatggc 480 ccctcgaagg aaatgtatac gcgggtggta gccgaattca gctcggctga aaaaaaaaaa 540 aaaaaaaaaa aacataaagc ggtatatcgt tttttttttg tttcgcgttc ttcttcttct 600 tctttttctt atttatttca gccagcagaa cccggctgac cggtctggta cagctgcccg 660 atattcaaca ataacaattc tatggaatac aatggatggt cggtcactgg tttgcttaga 720 tggattcaac cgaacccggc tgttgtccgg cgtgatggcg tgccgtccaa gatttggcaa 780 aaacaattcc atggaattca atgccatggt aaacaagtgt gattgtataa gtgacgatcg 840 cacacatagc attgcaacgg gtgggctggt ggcaacctga gctatcagga tcagtcgctt 900 ttggtgtagc aagcggacag catcgtcaca gaacaagcac ggaagtggtt tagaaagatg 960 gcaggttgga cagttgagaa caatttggag tgtgccctgg aagtgcgtag tgagagcaag 1020 gaaaaagatt acaaggggga ttttttggcc ggttcttggc cgcttaaacc attttcggaa 1080 gagcttccta tccacgagcg gaaagcagag tgggcgcgtt ttaaaggaca atatgagaga 1140 atagtgatgt gtaaaggaaa agtgaattcg gaaatgaagt tgatagcttt gaaagtattt 1200 gctggaagct atctgctaaa cattatagaa atgcaggaaa gacgagtgaa aagtggtgaa 1260 ggggacgtgt atacagaaac agtgaaaggg gtagacgatt ttttcaacgg catatgtgat 1320 gagggaaaag aaaggatgaa gctgagagaa atgaaaatgg aagctgatga aacatttggt 1380 gattgggtgt taagattaga agctcaagtg aaattctgtg gtctgccaga agaacagaaa 1440 actgaagagt tagtgcaagc tgtgttgcga cgttcggttc cccagattgc cgaaaaattg 1500 tttgaaatgg cagatttatt tgggaatgaa ttggacaaat tagtaaagca tggtaaacac 1560 cttgatttcg taagaatgga aaagagagag atagaacaaa agcttgccaa aagcacgaca 1620 aaagaattta cagatagcga taattatgta gcggcagtga cttcgtggag aacgaaaagg 1680 gaacgatatg agccgtacga acaacaacaa aacaagagac tatcgggaat gcattcaaac 1740 ggttttcgga acacgtataa taagggagca acgaaatgct caaagtgtga caaaatacat 1800 ggctggggga agtgtggagc atccagatca aaatgcttcc ggtgcggtgc cataggacat 1860 ctggcaattt gttgcagaac cagattggga tcaaatcgat ttccaaatgt ggtgaagaag 1920 cagaccgagg acattaatca ggtggtaata ggagaccccg 
ctgcagaacg gtaccaaaaa 1980 aaaaaaaata aataaataaa taaataaata aataaataaa taaaaaaaaa ttaaataaaa 2040 taaataaaaa aaaaaataaa aaaaaataaa aataaaaata acaacaaaaa aaaaaaaaaa 2100 cttattatca tgctatatat gtatatgtgt atgtgtgtat gtgtgtatgt gtgtatgtat 2160 gtgtgtatgt atgtatgtgt gtatatgtgt atgtatatat atatatgtat atatgtatgt 2220 atatgtgtgt atatatatat atatatatat ttatatatat atatagcaag ctgaaattgc 2280 tcgccgtcaa aagttcatga gtgtgtgggc aaaagtgtat ttttgattca cttatcttat 2340 ttatttcgtt ttattttaat ttactttaac agttttccga cgaacgagga aggcatgcag 2400 taaccgacga tccaaggcgc attaactgca tgatcggaaa tgtgcctatc attttcttac 2460 tagatactgg ggcatcggtg aacactgtta ctgctgagag ttggtggcaa ataaggacac 2520 acgcccagtc gagtgtgcgt gaatggcagc cacaccctga aacaacgctg cggagttatg 2580 cgaatagtgc agttttggac gtagaatgct catttaaagc aataatatgt gcaggtggaa 2640 ctaatagaac tatggcaaaa ttttttgtgg taaagggagc tgaaatatcc ctgttaagtt 2700 ttaacacagc agtcagctta ggtttagtat gcataggacc tggaaagcat tatatgttgc 2760 gtaacaaaac acaagagacg acagtaaagg atacatttcc taaattgaat attggtgcgg 2820 tgaaatttag aatagacaaa acagtcacac ccaaacaaat aatacgttat aatataccta 2880 aagcatttga gagagccgtg aatgagagac tagaggaaat ggagcttaac ggtatcattg 2940 agtatgtaga cagcgaggat gcggaaatat cttttgtgtc acctatggtg ctagtgccga 3000 aaggaaataa ggactttaga attgttattg attatagaga agcgaataag gctatcatac 3060 gagatccgta ccccatgcca tcattagaaa gaatttggac tgatatcccc aacaacgatg 3120 gaaagatttt gttttctaag atagatttga aagacgcata ctttcacata gaattacata 3180 gtgaggtgcg ccatttgact gcgtttatga cgatgaatgg attaatgcgg ttcaagagat 3240 tacctttcgg gctatcttgt gcaccggaga tttttcaaaa agagatggag aaagttttca 3300 gaaattgcaa aaatgttctt atatacctag atgatatttt gatgtacggt aaatcgttga 3360 aagaactaaa ggaaatagaa aatatggtta gggaagttat taaaaaaaat ggacttacca 3420 ttaacgaaga aaaatcgtgt tatggtcagg aaaaagtgaa tttcttagga ttaacactag 3480 atggagaagg aatattaccg atgagggaaa aaatgacagc cattcaaaat tttgacagac 3540 ctaaggacgc aggtgaactt agaagttttt taggaatgtt gacttttatt ggcccattta 3600 tcagagactt ttcacataaa actaagccgt tgagggattt gataaaaaga 
gataaattta 3660 aatgggaaga gatgcatcaa aatacgtttg aagaattgaa aaaggttgct atggaagacc 3720 ttatcaaaag gggttatttc aatgaaaagg ataggatcat actatatact gatgcatcac 3780 cttggggctt gggagcaatt ttagcccaag aaaaattgag cacgtccgaa atgcgtatta 3840 tcgcatgtgc atccaaagga ttgacggaga cggaaagccg ctatccgcaa ctgcataggg 3900 aggctttggc catagtttgg gctatggaac gatttgctta ctatctatta ggaaggaaat 3960 ttttgttgcg atcagatagt gaagcgttga aatttatgat taaaccggct aaacaaaaag 4020 atataggaaa gaggattatg tccagggcag aaggatggtt catcagactg gaccattatg 4080 attttgattt cgaacacgtt ccgggaaacg agaacatagc cgatactgcg tctaggctgt 4140 gcagatctaa agaggattac caatttgggg ttgctaagga gcctcacgaa ctatgtttgg 4200 ttactgcgtg cgtaaaccag gttaatgaaa cactattggc gttaaccacg gaagaagtag 4260 gaaaggaatt agaaaaagat gttatattaa acaaggttat tgcttggtta gataaagaga 4320 atgaatggcc acaagaaatt gcaaggtata aagcgtttca gagggagctg tatgtggaaa 4380 acggtttcct atttaaacaa gaaaagatgg tcctcccgca tatacttagg aaccgagcgt 4440 taacattagc tcatagaagc caccctggca tgtccacgat gaaaaatgtt ctgagaacgg 4500 gactttggtg gcccggtatg gacaaagaga ttgaagcttt tgttagaagc tgcccagaat 4560 gtcagctagt gaaaacgaca aatacggcgg tacctataga attaacggag ctaccggaga 4620 atccatggga ttatgtgtca atggatattt catctaccac agataatgtt aaaacattgg 4680 tgcttacaga taattactcg cgatttctga tagccgtccc tctagagcga acggatgcga 4740 atagtataca aaaagcattg aacaggatat ttttaacata ctatgtacct aagaaattaa 4800 aagcagataa tggacccccg tttaatgcgg tagatttcaa aagatggtta gcagacgtat 4860 gggggataaa actcataaat agcacgcctc taaatcctac cgaaaatggg ttagttgagc 4920 gtggaatgca aggtataaat aagattgcag ccattgcacg gttagaaaaa aaatgctgga 4980 aacaggccct tgcggaatat gtagcggatt ataactcatg gccgcaccat gtaactaaaa 5040 tagcaccagc cgaactaatg tttggtagag ctattaggca taggttacca aacccaaaaa 5100 cggatactag acaagcgaac gatgatgaat tacgagacag ggataaaatg gcgaaatttc 5160 agcgcaatca aagagaggat acaaggcgtg gagctaggcc tcttagcatt aaacctgggg 5220 atacggtttt agttagaaat caaaaagcag ataagacaga ctcatcgtac aaaaaagaat 5280 tgcatgaggt acttaaaatt tgtggtgctg gcagggtaac ggtgaaggaa aaaaaatcag 
5340 gaaaaattta cgatcgtaat gtaaagcaca taaaaaaatt tttcatgcga aaagcggaag 5400 aagggttgac aagtggtagt agtagttcgg aggaagtagc agggctagtg aaagaacgac 5460 aggctaggct gattaaaaaa ccaaaacgat taatcgaata ttaaagaatt tgtagcctta 5520 ttcctcaagg ggagga 5536 // ID GYPSY67-LTR_AG repbase; DNA; ANG; 236 BP. XX AC . XX DT 09-JUL-2004 (Rel. 9.06, Created) DT 09-JUL-2004 (Rel. 9.06, Last updated, Version 1) XX DE GYPSY67-LTR_AG is an LTR of retrotransposon GYPSY67_AG - a DE consensus sequence. XX KW Gypsy; LTR Retrotransposon; Transposable Element; KW 5-bp TSD GYPSY67_AG; GYPSY67-I_AG; GYPSY67-LTR_AG; Gypsy clade; KW mag lineage. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-236 RA Tubio M.J., Costas C.J. and Naveira F.H.; RT "GYPSY67_AG, a member of the mag lineage of the Ty3/gypsy group RT of LTR retrotransposons in Anopheles gambiae."; RL Repbase Reports 4(6), 174-174 (2004). XX DR [1] (Consensus) XX CC GYPSY67-LTR is a long terminal repeat of GYPSY67_AG (its CC internal portion is deposited as GYPSY67-I_AG). XX SQ Sequence 236 BP; 54 A; 56 C; 52 G; 74 T; 0 other; tgttatggtg aatattttgc cttagtgagc caacctgtat gtatcacggc tctcactgat 60 gagtgcacgc atctcgtgag atgccgttag gttcgggctt atgacgtgat ccccgaagga 120 tcactctctc tccggcgcgc ctccaattag gatagtgaat aaacgttttt atatgtgtta 180 acttattcgc gtgcattcgt ccacctcttt taaatgcatg acgacagacc ttaaca 236 // ID Copia-6_AG-LTR repbase; DNA; ANG; 148 BP. XX AC . XX DT 01-SEP-2010 (Rel. 15.09, Created) DT 01-SEP-2010 (Rel. 15.09, Last updated, Version 1) XX DE Copia-6_AG_LTR. XX KW Copia; LTR Retrotransposon; Transposable Element; Copia-6_AG-LTR. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-148 RA Fernandez Medina R.D., Struchiner C.J. 
and Ribeiro J.M.C.; RT "Transposable elements from Anopheles gambiae."; RL Repbase Reports 10(9), 1457-1457 (2010). XX DR [1] (Consensus) XX CC LTR consensus sequences from element Copia-6_AG. The LTRs are 149 CC nucleotides long and present a very high degree of identity CC (p-dist=0.0033, ds=0.0038). In two of these sequences the 3' and CC 5' LTRs are identical (Table 2), indicating that they have CC inserted recently and have not had time to accumulate mutations CC between the LTRs. XX SQ Sequence 148 BP; 45 A; 21 C; 25 G; 57 T; 0 other; tattggaatt atagtaagcc gtatttgggt tatgagtaat ttaatcatag taacctatga 60 atttgtgtag ttataagttt tggtatgaaa tacaattcat tctgtttcta accgtacacc 120 aagctggttg acttcactac ttccaata 148 // ID COPIA2-LTR_AG repbase; DNA; ANG; 207 BP. XX AC . XX DT 03-APR-2003 (Rel. 8.03, Created) DT 03-APR-2003 (Rel. 8.03, Last updated, Version 1) XX DE COPIA2-LTR_AG is a long terminal repeat of the COPIA2_AG LTR DE retrotransposon - a consensus sequence. XX KW Copia; LTR Retrotransposon; Transposable Element; 5-bp TSD; KW COPIA2-I_AG; COPIA2-LTR_AG; COPIA2_AG; Copia clade; Salto7; KW reverse transcriptase. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-207 RA Pavlicek A., Kapitonov V.V. and Jurka J.; RT "COPIA2_AG, a family of copia-like LTR retrotransposons from RT African malaria mosquito."; RL Direct Submission to Repbase Update (31-MAR-2003). XX DR [1] (Consensus) XX CC COPIA2-LTR_AG is a long terminal repeat of the COPIA2_AG LTR CC retrotransposon. There are ~7 copies of COPIA2-LTR_AG in CC the genome. 
XX SQ Sequence 207 BP; 57 A; 56 C; 37 G; 57 T; 0 other; tgttgagctt ctcggattga tgcgacaagt gcctatcagg caacaccgaa tgattgccaa 60 tcgggccaca ctttgcacac cacacacacg attgaactct gaataaagat cattcctgca 120 ttaagcgtac aagcgaacac acgtcttttc atttggtaaa acagtccact cggtatttcc 180 actttgcgtt tctcttccag ttcaaca 207 // ID GYPSY3-LTR_AG repbase; DNA; ANG; 141 BP. XX AC . XX DT 02-MAY-2003 (Rel. 8.04, Created) DT 02-MAY-2003 (Rel. 8.04, Last updated, Version 1) XX DE GYPSY3-LTR_AG is an LTR of the GYPSY3_AG LTR retrotransposon - a DE consensus sequence. XX KW Gypsy; LTR Retrotransposon; Transposable Element; 5-bp TSD; KW GYPSY3-I_AG; GYPSY3-LTR_AG; GYPSY3_AG; Gypsy clade; KW RETRO56_AG_LTR. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-141 RA Jurka J. and Drazkiewicz A.; RT "RETRO56_AG_LTR: LTR retrotransposon from Anopheles gambiae."; RL Repbase Reports 2(12), 12-12 (2002). XX RN [2] RP 1-141 RA Pavlicek A., Kapitonov V.V., Drazkiewicz A. and Jurka J.; RT "GYPSY3_AG, a family of LTR retrotransposons from African malaria RT mosquito."; RL Direct Submission to Repbase Update (30-APR-2003). XX DR [2] (Consensus) XX CC GYPSY3-LTR is a long terminal repeat of GYPSY3_AG (its internal CC portion is deposited as GYPSY3-I_AG). LTRs appear to be flanked CC by 5-bp direct repeats. XX SQ Sequence 141 BP; 49 A; 27 C; 26 G; 39 T; 0 other; tgttgcatag taacacgcat aacgcagtaa cattgcaaga ctcgatcaga gtacacattg 60 agtgaataaa gacgattcca ttctgaacta aggaataaag cagttgtgtt tttctcaaga 120 tatattccct gcgacatatc a 141 // ID CR1-8_AG repbase; DNA; ANG; 1899 BP. XX AC . XX DT 28-FEB-2009 (Rel. 14.02, Created) DT 28-FEB-2009 (Rel. 14.02, Last updated, Version 2) XX DE CR1-like non-LTR retrotransposon - a consensus sequence. XX KW CR1; Non-LTR Retrotransposon; Transposable Element; CR1-8_AG. 
XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-1899 RA Jurka J.; RT "CR1-like non-LTR retrotransposons from African malaria RT mosquito."; RL Repbase Reports 9(2), 634-634 (2009). XX DR [1] (Consensus) XX FH Key Location/Qualifiers FT CDS 129..1739 FT /product="CR1-8_AG_1p" FT /translation="MPSPIVDQTVISRATSYTPLNMIDYTIPTFDIRTVEK FT CLKSLKPSTAPGPDGIPSYIISKCYQSLAPVLTNIYNTSLSSGIFPRLWKT FT SWMVPIFKKGDRSCASNYRGITSLCAVSKAFELLLYGPMLSATSNYIMDNQ FT HGFVPKRSTLTNLTEFVSFCKRNLDSGGQVDAIYTDLKAAFDLISHDILIA FT KLQKLGFSEQIVKWLHSYLTERSYKIKVNDHTSREVLGTSGVPQGSNLGPL FT LFILYINDVGQLLAEHRFLLFADDVKLFAPINDTNDCINLQHAINLFSIWC FT RDNAMVLCIEKCRVLSFYRSRSCTRFNYQIDNISVERTDTFRDLGIILDTR FT LTFNDHLENIVSRGNQLLGWIIRSTQGFRNPMTIKTIYCAYVRSVLEYGSI FT VSDPCTEQWSNRIEAIQRKATRYAVRLLPWRQGDVLPSYHSRCLLLGIQSL FT KKRREIAKCLFISGLLNHHIDAPTLLATIEFNAPSRNLRTRQFISVPRYRT FT RFGQSDPMSAMCRTFINNFDLFDFNIPQNVFRDRLRTQLT*" XX SQ Sequence 1899 BP; 547 A; 455 C; 343 G; 554 T; 0 other; tatatatctt caatcgaaag gcatccccga aaattttggc gcttcatgga cagcagacga 60 cgctccaagc acaatcatgg gcacgtcacc cgaagatatt tgcaacattt tcgctgaacg 120 attttccgat gccttctcca attgtagatc aaacggtgat ctcaagggcc acatcttaca 180 cacctctaaa catgattgat tacaccattc ctacgtttga tatacgcaca gtagaaaagt 240 gcctcaaaag ccttaaaccc tctacagctc ctggtcccga tggcattccg agttacatca 300 tcagtaaatg ctatcagtcc ctagcacctg tattaacaaa catatacaac acgtcactga 360 gttccggtat tttcccacgc ttatggaaga cctcgtggat ggtaccaata tttaaaaaag 420 gggatcgctc atgtgcttca aactatcgcg gtatcacatc actttgtgcg gtatcaaaag 480 cttttgagct tctgctttac ggccctatgc tctccgctac atcgaattac attatggata 540 accaacacgg gttcgtaccc aaacgttcaa cattaacgaa cctcactgaa ttcgttagtt 600 tctgcaaacg taatctcgac tccggaggtc aagtggatgc tatttacacc gatctcaaag 660 ctgctttcga cctcatctca cacgacattt taatcgccaa attgcagaag ttagggtttt 720 cggaacaaat agttaagtgg ttgcattcct acctcacaga acgctcatac aagatcaagg 780 
taaacgatca tacatctaga gaggtgcttg gcacctcagg agtgcctcaa ggcagcaatt 840 tgggcccgct gctcttcatt ctttatataa acgatgtagg gcaattgcta gcggaacacc 900 gttttcttct gtttgccgat gacgttaagc tgttcgctcc tatcaacgat accaatgact 960 gcatcaatct tcaacacgcc atcaacctgt ttagcatatg gtgccgtgat aatgcaatgg 1020 tgctatgtat cgaaaaatgt cgtgttttgt ctttttaccg ttctcggtct tgcacacgat 1080 tcaactacca gatcgataac atatccgttg aacgcaccga tacattccgc gaccttggga 1140 tcattctcga cacacgttta acatttaacg atcacctaga gaatatagta tccagaggga 1200 accaactact gggttggatt attcgatcaa cgcaaggttt ccgtaatcct atgaccataa 1260 aaaccatcta ctgtgcctac gttcgatcag tgttagaata tggcagtatt gttagcgacc 1320 cctgtaccga acaatggagt aatcgaatcg aggccatcca aagaaaagcg actagatatg 1380 ctgtaagact cttaccatgg cgacagggcg atgtgttacc gtcataccat tcccggtgtc 1440 tgcttttagg gattcagtcg ttgaaaaagc gccgtgagat agctaaatgc ctttttatat 1500 caggactact taaccaccat attgatgctc ctactctatt agctacgata gaatttaacg 1560 cgccatctag gaatctacgt acacgtcaat tcatatccgt ccctcgttat agaactcgtt 1620 ttggccagtc tgatcctatg tcagctatgt gtcgaacttt cataaataac tttgatttat 1680 ttgatttcaa cataccacag aatgttttta gagatagact caggacacaa cttacgtaat 1740 aattttattt tcaaaataca ttttctattc ttgtgcacct gttctcccct atcgatttat 1800 gtaccttaca tagtctataa gcactaattt aagtacattc atgtaggatc ttcaagatcc 1860 cgatggatta atcaataaac atattaaaca tattaaaca 1899 // ID GYPSY49-LTR_AG repbase; DNA; ANG; 316 BP. XX AC . XX DT 16-APR-2004 (Rel. 9.03, Created) DT 16-APR-2004 (Rel. 9.03, Last updated, Version 1) XX DE GYPSY49-LTR_AG is an LTR of retrotransposon GYPSY49_AG - a DE consensus sequence. XX KW Gypsy; LTR Retrotransposon; Transposable Element; KW 4-bp TSD GYPSY49_AG; CsRn1 lineage; GYPSY49-I_AG; GYPSY49-LTR_AG; KW Gypsy clade. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-316 RA Tubio M.J., Costas C.J. 
and Naveira F.H.; RT "GYPSY49_AG, a member of the CsRn1 lineage of the Ty3/gypsy group RT of LTR retrotransposons in Anopheles gambiae."; RL Repbase Reports 4(3), 91-91 (2004). XX DR [1] (Consensus) XX CC GYPSY49-LTR is a long terminal repeat of GYPSY49_AG (its internal CC portion is deposited as GYPSY49-I_AG). XX SQ Sequence 316 BP; 79 A; 73 C; 85 G; 79 T; 0 other; tgtggtgttc ggcaccactg gtggtgttgc gctcctggct gcgtttccga attcaccact 60 ttcgagcgtc acactgtggt aagattaagt tagtgtaggt aggtagcgat gtaatacgaa 120 gtataggata cgaaggatag gacacgaagg ataggacacg aataggagcg atcggatgcg 180 ccgatcgtgg tcctcgagct tgcgagcgat cggatgcgcc gatcgtcgct ctctcatctt 240 tcgaccgcca atcaatacag ttcactcatc gcaattgtca cgttaaataa aataattgta 300 tcgcgcaagc cctaca 316 // ID CR1-7_AG repbase; DNA; ANG; 3274 BP. XX AC . XX DT 28-FEB-2009 (Rel. 14.02, Created) DT 28-FEB-2009 (Rel. 14.02, Last updated, Version 2) XX DE CR1-like non-LTR retrotransposon - a consensus sequence. XX KW CR1; Non-LTR Retrotransposon; Transposable Element; CR1-7_AG. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-3274 RA Jurka J.; RT "CR1-like non-LTR retrotransposons from African malaria RT mosquito."; RL Repbase Reports 9(2), 633-633 (2009). 
XX DR [1] (Consensus) XX FH Key Location/Qualifiers FT CDS 594..3113 FT /product="CR1-7_AG_1p" FT /translation="MYVPPDMANNTSVIHTICSTIDDVACKANEMCLLFGD FT FNLSNLNWIPDETNPLMLQVDYESSRITDNCRLLLDCMNSNGLYQINKIHN FT ASNNYLDLIFVCDDILQQCSPPAITPFPIVAIDCYHPPIELCLPNGLAKAE FT HHTTSHQESGMLDYSRTNFLILNDLLQSSDWSFLDGNPDVNHALDMFNRKF FT TEHLSACCPIKRPRQGPPWGNAQLRRLKRNKSSCYRFYQNSGTQQSRLRFI FT AAHNVYRSYNRQLHSRFLIRIQFSLRRKPKKFWRYVDTKRKNSSLPAIVSF FT NGTTSCTKEETCNLFADRFMSSFATEAPGFSLDAALNNVPSDAVDVNVRDI FT NISRDSVLKALKQVKPSYNPGPDGIPTAVLAKCSEVLAGPLARIFTLSLNQ FT RMFPAAWKSSYMFPVHKKGDKSLVENYRGITTLPAGAKVFEVVVQNSLLNC FT CRSLISTRQHGFFPRRSVTTNLVEFVSHCHAAFASGAQMDVIYTDLKAAFD FT RVNHRLLLAKLARLGFSTPLVEWFESYLTNRRYRVKVECMTSREFVSCSGV FT PQGSNLGPLLFSLFFNDVTTVVRESECLLYADDLRLFFPVRCFGDCLVLQA FT SIDSFSDWCLNNDLQISVDKCLSMSFHRSNCPIVFNYCISGTQLQRHSAVK FT DLGVTLDRKLDFHVHHNEIIDRANRMLGFIRRQSREFSDPHCLVALYKSLV FT RSILEYSSVVWCPSSSLWSNKIESVQKKFTRLVLRFTPWRNVAARPSYHAR FT CLIFGLESLATRRETAQILFMKKIILGEIDSPDLLARVNFHVPARILRNFR FT LLRVDQTRRAYSDNEPLTAMAIRFNAHYDSFDWNSLS*" XX SQ Sequence 3274 BP; 908 A; 768 C; 641 G; 955 T; 2 other; attcgatcac cacagctgtt ggatcacaca atactttcag cacacactcc acgcacttca 60 cccacaccgc aaacccacta atatccaccg ataaacaatt cacacacatc acgcacactc 120 gcacgcacgc atttgcacct gatgcaggtg caggcwactt aacgatcgtg aacccaacac 180 taactattgc tacagtacag atcctacact cacacaaacg cacgacggtc atgtcattcc 240 tgcaggggtt gcctccttaa ataatactga cgatttgtta tttttatatc aaaatgtaag 300 aggactaaaa actaaatgtc atgaggtatt cgattatgtt tctactactc attacgacgt 360 cattatcctg actgaaacat ggttggacta ttaactcgtc acagttgttt ccggaccatt 420 atatcgtctt cagaaaggac cgtaatgaca raaatagcag gaaaaccaaa ggtggtgggg 480 ttctcatagc tgtcgcctca catttgcgag tactcttatg ttctacatca gacactgtcg 540 agcaactttg gttaaagatt tacagcatta agaaaacgta tgttatcggc gcaatgtatg 600 ttcctcctga catggcaaat aacacgtcgg ttattcacac aatctgttca actattgacg 660 acgtagcgtg taaagctaat gaaatgtgtt tgttattcgg ggacttcaat ttgtcaaacc 720 tgaattggat acccgatgaa acaaatccac ttatgctcca agttgactac gaaagctcga 780 ggatcacgga 
caactgtcgc cttcttcttg attgtatgaa tagcaacgga ctgtaccaga 840 ttaacaaaat tcataatgcg tcgaataact acctcgattt gatctttgta tgcgacgata 900 tcctgcaaca atgctctcca ccggcaataa cacccttccc catagttgct attgattgct 960 atcatccgcc tatagagcta tgcctgccga acggcttagc caaagcagaa catcacacaa 1020 catcacatca agaatcaggc atgctcgact acagtcgcac aaactttttg attctaaatg 1080 atctactgca atcctctgac tggtcattcc ttgatggaaa cccggacgta aaccatgcac 1140 tcgatatgtt caatcggaaa tttacggaac atttatctgc ctgttgccct ataaaacgtc 1200 cacgacaggg tccaccgtgg ggtaatgctc aattacgccg tttaaaacga aacaaatctt 1260 cctgctatcg gttttaccaa aacagtggaa cacaacaatc gagattgcgg ttcatcgctg 1320 cacacaatgt atatagaagc tataatagac aacttcactc tcgttttttg attcgtatac 1380 agttcagctt aagaagaaaa ccgaaaaaat tttggaggta tgtcgacaca aaacggaaaa 1440 acagctccct gccagctata gtttccttta atggaacaac ttcgtgtacc aaagaggaaa 1500 cctgcaattt atttgccgat cgtttcatga gttcgtttgc cactgaagct cctggcttct 1560 cattggatgc tgctctcaac aacgtgccct ccgacgctgt cgatgttaat gtgcgtgaca 1620 taaacatctc cagagattct gtgctaaagg cgctcaagca ggtcaaacca tcctacaacc 1680 ctggacctga tggtattcca accgcagttc ttgccaaatg cagcgaagtt ttggccggac 1740 ctctggctcg tatcttcaca ctctctctca atcagcgtat gtttccagca gcatggaaat 1800 cgtcctacat gtttcctgtg cataaaaagg gggataagag cttagtggaa aactaccgtg 1860 gcattactac tttgcctgca ggtgctaagg tttttgaagt cgtcgtccag aactcgctct 1920 tgaactgttg ccgctcgctg atatcaacac gccaacacgg gttcttccct cgccgcagtg 1980 tgaccaccaa tttggtggag tttgtatctc actgccatgc tgcattcgct tcgggagccc 2040 agatggatgt catctatacc gaccttaagg ctgcctttga tcgggtgaat catcgcttgc 2100 tgttggcgaa gcttgctcgt cttggattct ctaccccttt agtggagtgg ttcgaatctt 2160 acctcaccaa tcgtcgatat cgcgtcaaag ttgaatgtat gacatctagg gaattcgtga 2220 gttgttcagg cgtgccacag ggtagtaacc ttggaccctt gctgttttct ctatttttca 2280 atgacgttac tactgtagta agggaatctg aatgtttatt gtatgcggac gatcttagac 2340 ttttttttcc tgtgaggtgt tttggtgatt gtcttgtcct gcaagcatcg atcgactctt 2400 tttccgactg gtgtctgaac aacgacctac aaatttctgt tgataaatgt ttgtccatgt 2460 cgtttcaccg ttctaattgt 
ccaatagtgt ttaactactg catctccggt acacaattac 2520 aacgacacag tgcagtaaaa gacttaggag ttacccttga ccgtaaactt gactttcatg 2580 tgcaccataa cgaaattatt gatagagcga acagaatgct aggatttata cgcaggcagt 2640 cgagagaatt ctccgaccca cattgtcttg tcgcactcta caaatcacta gtcagatcca 2700 tcttagaata ctcctcagta gtttggtgcc catcgtcgtc gttatggtcg aataagattg 2760 agtcagtgca gaaaaaattc acccgtttgg tcttacggtt cacaccctgg cgtaacgtag 2820 cagcgagacc tagttatcat gcccggtgtt taatttttgg actcgagtct ctcgctactc 2880 gtcgcgaaac ggcgcaaatc ttgtttatga aaaaaatcat cctaggagag atagattcac 2940 cggacctttt agctagagtt aatttccacg tacctgctcg tattttaaga aactttaggc 3000 tactaagagt tgaccaaact agacgtgcat atagcgataa tgaacctttg actgcgatgg 3060 ctatcagatt taatgcacat tatgactctt ttgactggaa ttccctaagt taacctgttt 3120 ttcttgctgt gatactctgt gttatttatt ttgtgctgat gttacattgt accgtgatgc 3180 tttatgttat tataagttta gtttataaga tcagtgataa ggccaatgat ggccaatcac 3240 ttaacattta caaataaaca aataaacaaa taaa 3274 // ID MARINERN3_AG repbase; DNA; ANG; 358 BP. XX AC . XX DT 29-JAN-2003 (Rel. 8, Created) DT 11-FEB-2003 (Rel. 8.01, Last updated, Version 2) XX DE MARINERN3_AG is a nonautonomous DNA transposon - a consensus DE sequence. XX KW Mariner/Tc1; DNA transposon; Transposable Element; Nonautonomous; KW MARINERN3_AG; nonautonomous DNA transposon; KW mariner/Tc1 superfamily. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-358 RA Kapitonov V.V. and Jurka J.; RT "MARINERN3_AG: a family of nonautonomous mariner/Tc1-like DNA RT transposons from African malaria mosquito."; RL Repbase Reports 3(1), 6-6 (2003). XX DR [1] (Consensus) XX CC MARINERN3_AG copies are ~98% identical to the consensus sequence. CC They are flanked by the TA target site duplications. CC This element has perfect 23-bp terminal inverted repeats. CC Classification: a nonautonomous Mariner/Tc1-like CC DNA transposon. 
XX SQ Sequence 358 BP; 106 A; 70 C; 61 G; 121 T; 0 other; cagtggagcg ccgtttatcc gggcttctcg ggacttgacc tcgaccggat aagcgaataa 60 cacggataat gagtcaaatg atatatttta tcatcaattt ctgattagtt ctttaaaaaa 120 ttcctatttt ttgataaaaa taacccagtt tttattaact tcatgttttt cccatgagaa 180 tatttgaaat ttgccatgaa ttatagtgtt ttttgttatg ataatgtgct cttataaccg 240 cttttttaaa taccctcctc ataacgcaat gtcacctttg tattgaactg tcatttctca 300 acagcacgga taaacggaaa gccggataaa aggtacccgg ataaacggcg ctccactg 358 // ID GYPSY23-I_AG repbase; DNA; ANG; 4394 BP. XX AC . XX DT 05-FEB-2004 (Rel. 9.01, Created) DT 05-FEB-2004 (Rel. 9.01, Last updated, Version 1) XX DE GYPSY23-I_AG is an internal portion of retrotransposon GYPSY23_AG DE - a consensus sequence. XX KW Gypsy; LTR Retrotransposon; Transposable Element; 5-bp TSD; KW AP protease; GYPSY23-I_AG; GYPSY23-LTR_AG; GYPSY23_AG; KW Gypsy clade; RNase-H; gag; integrase; mag lineage; KW reverse transcriptase. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-4394 RA Tubio M.J., Costas C.J. and Naveira F.H.; RT "GYPSY23_AG, a member of the mag lineage of the Ty3/gypsy group RT of LTR retrotransposons in Anopheles gambiae."; RL Repbase Reports 4(1), 11-11 (2004). XX DR [1] (Consensus) XX CC GYPSY23_AG is a family of Gypsy-like LTR retrotransposons CC that, according to the amino acid sequence of its reverse CC transcriptase, is CC phylogenetically grouped with representatives of the mag CC lineage of other organisms. CC GYPSY18_AG, GYPSY19_AG, GYPSY20_AG, GYPSY21_AG, GYPSY22_AG, CC GYPSY24_AG, CC GYPSY25_AG, GYPSY26_AG, GYPSY27_AG and GYPSY28_AG are other CC members of this same lineage in Anopheles gambiae. CC The GYPSY23-I_AG consensus was reconstructed after multiple CC alignment of 4 copies. CC The consensus encodes the 1358-aa GYPSY23_AGp gag-pol like CC protein CC (pos. 280-4353). 
CC The sequence of the LTRs flanking GYPSY23-I_AG is deposited as CC GYPSY23-LTR_AG. XX FH Key Location/Qualifiers FT CDS 280..4353 FT /product="GYPSY23_AGp" FT /translation="MQPQTVASAGSSTPTASSSSGTNAKMANMCVFEPFNP FT VSSSFDRWMERLKIWFRINQIGDGDKKDYLLHYMGGPTYDVLCNKLQNADP FT YTKSFDEIVALLRNHFNPAPLEILENFKFTSRKQLENESLSEYLMELEKLA FT KSCNFDSYLDKALRNQFVFGIRNRGIQSRLLEVRDLTLSKAKDIAFGMEMS FT LRGTEEMHGTSPRCEVQQVTANTKKISSTNNGQQRKCYRCGDTNHMANRCQ FT HKQTVCSACGKRGHLQKVCLSRHNNRRQENTHYLEENDPKDVLHVSTVQNH FT AGKFLLNLRVNQGVLTFEVDTGSPVSLINIKDKQKHLKDIEILQTDLRLVS FT YSDNDIGVLGKLLVTVVVEGRKLVLPLYVTKSNKHPLLGRDWLRALNLDFN FT RIFKSGTHTVSYCDRDDECKYSALNALLQKYPSVFSKEIGKVKGIQASLTV FT REHTKPVYIKARQVPFALRDAVDKEINQFVNDGVWERVDHSEWATPVVVVK FT KAGGKVRLCGDYKITLNPNLMVDEHPLPTIEELFVTVAGGKTFSKIDLSQA FT YLQLEVRPEDRKFLTLSTHRGLFQPSRLMYGVASAPAIFQRLMEEVLQGIE FT GVTVFIDDIRVTGPDSETHLQRLESVLQRLDKYNLRVNRDKCDFFAKQIEY FT CGYMVDKDGIHKVRNKIDAIQNMPIPKNRDQVRSYVGLINYYGRFFPNLST FT TLYPLNNLLKEDVPFQWTKECESSFKAVKKEMQSDRFLVHYDPSLPVTLAT FT DASPYGVGAVLSHQYPDGTERPIQYASQTLNRTQQKYSQIDKEAYSIIFGI FT RKFHQYLYGRRFILITDNKPISQIFSETKGLPTMSTIRMQHYAAFLQGYDY FT IVRHRRSSEHCNADAMSRLPTCTTDPMNEIEEPDFIEVNAIETLPLTVDEL FT SSATIADDTVRELLRALRMGKSIDAKFRFGIDQNEFSLQKDCLLRGTRVYV FT PPALRKNVLKELHSTHFGICRIKSLARSYCWWEGIDKDIENVVKDCQSCQV FT SKANPPKTSFHCWNTPNEPFQRVHADYAGPFMGYYYLILIDAYSKWPSVYV FT VNNTTTDTTIRVCREFFSTFGIPSVFVSDNGPQFTSADFTKFLKLNGIVHK FT LIAPYHPATNGQAERFVQTMKSKLKSLNCDRSQVHSEICNILLNYRKMIHP FT ATGFSPSMMVFGRQIRSRLDLMIPSDDPERSEIQNKIRELTVGSKVAAREY FT LHNNKWSFGTIKERLGKLHYLVQLGDGRIWKRHIDQLRSVGDNLPISTEDS FT ILLRAEEPNSHTEDLTVTSDLPSTQTPLTDTQSSTTNVSDSAHPTIQTVPV FT RSKVIQSSTSTECATFEGTSGSSGLTADSGLRRSTRTIKPPQRLDL" XX SQ Sequence 4394 BP; 1349 A; 902 C; 949 G; 1194 T; 0 other; atttggcgac gagaaaaggg aattatacgt tttcagttac ttaaacagtt ccctacgttg 60 gtgagaagga gcatttaccg ttattacttc tatcggtgct acgaatacgg tgaattttga 120 tgctgttggc ggtactacgt ctggcgttgc tactgctgcc ggagcgtcag ctggattgat 180 tacgggacga gctgctacca tttttttgct actaccggag atactactgc tgctcctact 240 actcttactg 
ttgctgtgac tgctactgct tgtatatcga tgcaaccaca aaccgtggcg 300 agcgctggat cttccacacc tactgcttct tcttcttccg gtaccaacgc aaaaatggct 360 aatatgtgtg ttttcgagcc tttcaacccc gtgtcatcat cgttcgatag atggatggaa 420 cggctaaaaa tttggtttag aatcaaccaa attggagatg gcgataaaaa ggactacctt 480 ctacattaca tgggcggccc tacgtacgac gtattatgca acaagctgca aaacgccgat 540 ccctacacga agtcattcga cgaaatcgtt gcactcctga gaaatcactt caaccctgca 600 ccgctagaaa tcctagaaaa cttcaagttt actagccgta agcaattgga gaacgaatca 660 ttgagcgaat acttaatgga gttagagaag ctggctaaaa gctgcaattt tgattcgtac 720 ttagacaaag cgctgagaaa tcaatttgtc tttggcattc ggaatcgtgg aatacagtca 780 cgattgctgg aagtacgcga tcttacacta tccaaagcca aggacattgc gtttggaatg 840 gaaatgtcat tgcgtgggac cgaggaaatg catggaacca gcccgagatg tgaagtgcaa 900 caagttacgg ccaatacaaa gaaaatatca tcgacaaaca acggtcaaca gcgaaagtgc 960 tacagatgcg gcgatacaaa tcacatggct aatagatgcc agcacaagca aacggtttgc 1020 agcgcttgtg ggaaaagagg acatctacaa aaagtgtgtt tgtctcgcca caacaacagg 1080 cgtcaggaga acacacatta tttggaggaa aacgacccaa aggatgttct tcatgtgagc 1140 acggtccaga atcatgctgg taagtttttg ttgaatctga gggtcaatca aggtgtacta 1200 acattcgagg tcgacactgg atcgccggta tcattgataa acataaagga caaacaaaaa 1260 catctcaaag acatagaaat tttgcaaacg gacttaagac tagtaagcta ttcggacaat 1320 gacataggtg tgttgggaaa attattggtc acagtagtgg tagagggaag aaaactcgta 1380 ttgccactgt atgtaacgaa gtccaacaaa caccccttgc tgggacgtga ttggttacgt 1440 gcattaaatt tagattttaa tcgcattttc aaatctggca cacacactgt ttcatactgc 1500 gatagggatg atgaatgcaa gtacagtgca ttaaatgcct tacttcagaa atatccatca 1560 gtattcagta aggaaattgg aaaagtaaaa ggaattcaag cttcactaac agtacgggaa 1620 catacaaaac ctgtgtatat taaagcaaga caggtgccgt ttgcgcttcg tgatgcagtt 1680 gataaagaaa ttaatcaatt tgtaaatgat ggtgtatggg agcgcgtaga ccattctgag 1740 tgggcgactc ctgtcgttgt ggtaaaaaaa gctggtggta aggtacggtt atgcggagac 1800 tataaaatca ccttaaaccc aaatttaatg gtggatgagc accctctccc tacaattgag 1860 gaactttttg tgactgtcgc tgggggtaag acattctcta aaatagatct atcccaagca 1920 tacttacaac ttgaagtacg accagaggat 
agaaaatttc tcacccttag tacacataga 1980 ggattgttcc agccatcaag gctcatgtat ggtgttgcgt ctgccccagc tattttccag 2040 cgtctgatgg aggaggtgtt acaaggaata gaaggagtga cagtttttat agatgatatc 2100 cgtgttactg ggccagatag tgaaacccac ttacagaggt tggaatcggt attgcaaagg 2160 ttagacaagt acaatttgcg agtgaatagg gataaatgcg attttttcgc caagcaaatc 2220 gagtattgcg gctatatggt tgataaagat ggcattcaca aagttcgtaa taaaatagac 2280 gctattcaga atatgcccat tcctaagaac agggatcaag tacgatcata cgtaggtttg 2340 ataaactatt atggaagatt ctttccaaat cttagcacga ctttataccc acttaataac 2400 ttattaaaag aagatgttcc attccaatgg acaaaggaat gcgaaagttc atttaaagcc 2460 gttaaaaaag aaatgcaatc tgatcgattt ctcgttcatt atgatccttc actaccagtg 2520 actttagcaa cggacgcttc gccttacgga gtgggagcgg tccttagtca tcaataccca 2580 gatggtacgg aacggccaat tcaatacgcg tcccaaactc tcaatagaac acaacaaaag 2640 tattcgcaaa tcgataaaga agcgtactcg atcatttttg gtattcgaaa atttcatcaa 2700 tacctttatg gtcgtagatt tattttaata accgataaca aaccgatcag tcaaatattc 2760 tcggaaacta agggacttcc tactatgtca actatacgta tgcagcatta tgcagcattt 2820 cttcaaggtt atgattacat cgtgcgacat cgtcgttcat cagaacactg caatgccgac 2880 gccatgtctc ggttaccgac atgcacgact gatcccatga atgaaataga agaacctgat 2940 tttattgaag tcaatgccat cgaaacatta cccctcactg ttgatgagtt aagttcagct 3000 acaattgcag acgatactgt ccgtgaattg cttcgagcct taagaatggg aaagagcatt 3060 gatgctaaat tccgatttgg catagatcag aatgaattta gtttacagaa ggattgctta 3120 ctccgcggta ctcgtgttta tgtacctcct gctctacgta aaaatgtttt aaaagaactt 3180 cactcgacac atttcgggat ttgcagaatt aagtctctag ccaggagcta ttgttggtgg 3240 gagggcattg ataaagacat cgaaaatgtg gttaaggatt gtcaatcctg ccaagtgtcg 3300 aaagctaatc ctcctaaaac atcattccac tgttggaata ctccaaatga accatttcaa 3360 agagttcacg cagattacgc aggaccgttt atgggatatt actatctgat tttaatcgat 3420 gcctattcta agtggccttc ggtttatgta gtaaacaaca caactactga tacaacaata 3480 cgagtgtgca gggagttttt cagtactttt ggaatcccat ctgtgtttgt tagtgacaat 3540 gggccccaat tcacttcagc tgattttaca aaatttttaa aactaaatgg aatagtacat 3600 aagcttattg ctccatacca tccggccact aatggacaag 
ctgaacggtt cgtacaaaca 3660 atgaagtcca aattaaaatc tttaaattgc gatcgttctc aagtccatag tgaaatatgc 3720 aatatccttt taaactatcg caaaatgatt catcctgcca ctggtttctc tccttcaatg 3780 atggtgtttg gtcgacagat tcgttcaaga ttggatctta tgattccatc ggatgaccct 3840 gagaggagtg aaattcagaa taaaattcga gagttgacag ttggctcaaa agttgctgct 3900 cgagaatacc tacataacaa caagtggagc ttcggcacta taaaagaacg gttgggtaaa 3960 ttacattatt tagtacaact tggagacggt cgaatttgga aacggcacat tgatcaactg 4020 cgtagtgtgg gcgataatct tcccatatct accgaggatt caattttatt gcgcgcagaa 4080 gagcccaata gccacactga agacttgact gtgacctctg acttaccctc gacccagacc 4140 ccgttaacgg acacacagtc ttcaacaacc aacgtttctg actctgcaca tccgactata 4200 caaactgtcc ctgtgcgttc aaaggtgatc cagtcgtcga catctactga gtgtgccacc 4260 tttgagggga cttccggttc gtcaggcttg acagcggatt ctggacttcg tcgatcaaca 4320 agaaccatca aacctcctca acgattggat ctgtagcaga gcggaagtag gtgctttatt 4380 tccgggggaa gagc 4394 // ID COPIA1-I_AG repbase; DNA; ANG; 4129 BP. XX AC . XX DT 03-APR-2003 (Rel. 8.03, Created) DT 28-JUL-2005 (Rel. 10.08, Last updated, Version 2) XX DE COPIA1-I_AG is an internal portion of the COPIA1_AG LTR DE retrotransposon - a consensus sequence. XX KW Copia; LTR Retrotransposon; Transposable Element; 5-bp TSD; KW COPIA1-I_AG; COPIA1-LTR_AG; COPIA1_AG; Copia clade; KW reverse transcriptase. XX NM COPIA1-I_AG. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-4129 RA Kapitonov V.V., Pavlicek A. and Jurka J.; RT "COPIA1_AG, a family of copia-like LTR retrotransposons from RT African malaria mosquito."; RL Repbase Reports 3(3), 49-49 (2003). XX DR [1] (Consensus) XX CC COPIA1_AG is a young family of Copia-like LTR retrotransposons. CC COPIA1-I_AG, an internal portion of COPIA1_AG is flanked by 99% CC identical COPIA1-LTR_AG LTRs. The COPIA1-I_AG consensus sequence CC was reconstructed based on multiple alignment of 4 copies. 
CC The consensus sequence encodes the 1323-aa COPIA1-I_AGp protein CC (positions 151-4119). XX FH Key Location/Qualifiers FT CDS 151..4119 FT /product="COPIA1-I_AGp" FT /translation="MSESHVTIEKLNDQNYAIWKFKMELLLAREKVLTVVK FT DSKPASPDAAWIANDERARALIGLSLDDSQLIHVMQTSSSKDMWDALKGYH FT ERSSLSSKIHVMRKMFATKMTEGGDISNHLKELCSLRLRLIALGEEMKDPS FT FVALMLSSLPKSFDGLIVALESRPDEDLTVDYVKGKLLDEGRRRADGADED FT KALLSGGKNXTKFWKDRKLTTNKEKQCHYCKKNGHIRKDCRKWAADKRSKL FT DGESVNVANEDNREEVCLFIGEXNETGPWCFDSGATSHMTNDTSILKLIDK FT SKQSSISLANGDSIKSAGVGXCKLFSMDGNGKRKKVSLDXVCHVPSLTTNL FT LSVSKITDNGFEXXFDRXGCRVLKGKQVLLIGERKGGLYYLKQTEQAMLVD FT KNHEASCIHLWHRRFGHRDIEAIMKIARNNLGSGLNINRCHVKSICGSCCE FT GKMSRDPFPNSSSSRTSGVGELIHTDLGGPFEVSTARGSRYFMTMVDDFSR FT YTIIYLXQNKCETENRIREYCXMMKTQFGHYPKVIRSDGGGEYRSNSLKEF FT FVDHGIVHQQTAPYSPQQNGVAERKNRYLVEMMRCMLAESNMDKVFWGEAI FT TTANYLQNRLPSSLLESTPYEMWHGKNPSYEHLRVFGSEAFVHIPKEKRRK FT LDKKAEKLVFVGYADNQKAYRFVNLETKTITISRDAKFLEQCEIEKIGTKP FT KPTTSGGVVVLPLGSTPSLCRAEETTTRENIVQMEASAESSCIRESNMNDT FT VDXLDVTPYNSASDGELSDEPGAIEMHQSVRRSXRTTKGIAPVRFREESYM FT AGSSEQNEEPRNLKEVFVCAAREKWISAMENELKSHEENGTWDALVELPAG FT RKVVGCRWIFKLKRNAAGQVIKHKARLVAQGYSQQFGEDYDQVFAPVTSHT FT TFRLMLAIASKTQMKLRHLDIKTAYLYGDLDQELFMRQPPGYEXKGKEHLV FT CRLKKSIYGLKQSARCWNQKLHGVLLEIGFQQSAADQCLYIKTEDGKRVYI FT LVYVDDMIVGCVDETLIDSVYHALTEHFEMTDLGPVSYFLGMEVKXEKGNY FT SVSLEGYIEKLIRKFGLSEAKTAKTPMDEGFLKQQDSSSILKDSTQYRSLV FT GALLYISVCTRPDIAVSTGILGRNVSNPTESCWVAAKRVVRYLKATKHFKL FT TFNKAGSDLIGYSDADWAGDTITRKSTSGYVFFYASGAVSWASRKQTSIAL FT SSMESEYISLSEATQEQMWLTRLMKDLGEHIENPVKIFEDNQSCICFVNSD FT RTNRRSKHIETKEHFVKQQCESRKMMLEYCPTEEMVADILTKPLGATKQRK FT FTEMLGLHGTR" XX SQ Sequence 4129 BP; 1357 A; 731 C; 960 G; 1064 T; 17 other; ggttatgggc ccagaagtac tgtcgaacaa gtgttaagtt tgtgttatat tgaagtaata 60 atctatcgaa aattttaatc ttgaagaaat actgaaaact tatcgaaaat tttaatcttg 120 aagaaacact gaaaacttaa atctagaaac atgtctgaat cacacgtcac cattgaaaaa 180 ttaaacgatc aaaattacgc aatatggaaa ttcaagatgg aacttttgtt agcaagggaa 240 aaggtgctga ctgtcgtgaa agattcgaaa ccagcaagtc 
ccgacgctgc atggattgcg 300 aatgatgaac gtgctagggc actgatcggt ctgtcgttgg acgacagcca actcatccat 360 gtcatgcaaa cgagttcatc gaaagatatg tgggatgccc taaaaggcta tcatgagcgt 420 tcatctttgt ccagcaaaat acacgtcatg cgaaaaatgt ttgccacaaa aatgactgaa 480 ggtggagaca tttctaacca tctcaaagaa ctatgttctc tgcgacttcg tttaattgcg 540 ctgggagaag aaatgaaaga tccatccttt gtcgcgttaa tgttgtccag tttgccaaaa 600 tcctttgatg gtttgatcgt ggctttggaa agtaggcctg atgaagatct tacggtggat 660 tatgtaaaag gcaaattgtt ggatgaagga agacgtcgag cagatggtgc agatgaagat 720 aaagcgttac tatctggagg aaagaayawc acgaaatttt ggaaggacag gaaactaaca 780 accaacaagg aaaaacagtg ccattattgc aagaagaatg ggcacataag aaaagactgt 840 agaaaatggg ctgcagacaa aagaagtaaa ctagatggtg aaagcgtcaa cgttgctaat 900 gaagacaatc gagaggaagt atgtttgttc attggagaar gaaacgaaac tggaccatgg 960 tgtttcgatt ctggtgcaac ttctcatatg acgaacgata cgtctatttt gaaattaata 1020 gataaatcga agcaatcctc gatttcatta gcgaacggag attccatcaa gtcagctggt 1080 gtcggaarct gcaaattgtt ttccatggat ggaaacggaa aacgcaagaa agtttccttg 1140 gacaakgtgt gtcatgtacc atctttgacg acaaacctat tatctgtaag taaaattacc 1200 gataatggat tcgaartgyt tttcgatagg twtggatgtc gtgtcctgaa aggaaaacaa 1260 gtattgctga ttggtgaacg taaaggtggt ctgtattatc taaaacagac tgaacaagcc 1320 atgttggtag ataaaaacca tgaagcttcc tgtatacacc tatggcatcg tcgatttggc 1380 catcgtgaca tagaagccat aatgaagatt gcgcggaaca atttgggaag cggcttgaac 1440 atcaaccgat gtcatgtgaa atccatttgt ggatcatgct gtgaagggaa gatgagccgt 1500 gatcctttcc caaattcttc atcttcaagg acatctggtg ttggcgaact gatacatacg 1560 gacttgggag gaccgtttga agtatcaaca gcccgaggaa gccgatattt tatgactatg 1620 gttgatgatt ttagtcggta tacaattatc tacctgktgc aaaacaagtg tgagacagaa 1680 aaccggatca gagaatattg crccatgatg aaaacacagt ttggacacta tccgaaagtc 1740 atcagatcgg atggtggtgg tgaatacagg agcaattctt taaaggaatt ttttgtagat 1800 cacggsatcg tgcatcaaca aactgcccca tattctccac aacaaaacgg cgtggctgaa 1860 cgtaagaacc ggtacctcgt tgaaatgatg agatgcatgt tggcagaatc gaacatggac 1920 aaggtgttct ggggtgaagc gatcaccact gccaattatt tacaaaatcg cttgccatcc 
1980 tccttactgg aatcgacacc ttacgaaatg tggcacggaa agaatccttc gtatgaacat 2040 cttcgagtat ttggttcaga agctttcgta cacattccta aagaaaaacg gcgtaagttg 2100 gataaaaagg ctgaaaagtt ggtattcgtc ggatacgcgg acaatcagaa ggcctatcga 2160 ttcgtaaacc tggagacgaa aacaattacc attagtcgtg acgcaaaatt tttagaacaa 2220 tgcgagattg agaaaattgg aacaaaaccg aaaccaacga catcaggagg agtagtggta 2280 ctgccacttg gatcaactcc ttcattatgt cgcgcagaag aaactaccac gagagaaaac 2340 atcgttcaaa tggaggcttc tgctgaatcc tcctgcatta gggaatcgaa catgaacgat 2400 acygtagatg awcttgatgt tacaccatac aacagtgcat ctgatggcga actatcagat 2460 gaaccagggg ctattgaaat gcatcaaagt gtacgtaggt ccaygcgaac aacaaaaggc 2520 atcgcacctg ttcgattcag ggaggaaagt tatatggcgg gatcttctga acaaaacgaa 2580 gaacccagaa atttgaaaga agttttygtc tgtgcagcgc gcgaaaaatg gatatcggca 2640 atggaaaatg aactgaaatc acacgaagaa aacggaacat gggatgcatt ggtagagcta 2700 cctgctggca gaaaggttgt tggttgccgc tggattttta agttgaaaag aaatgcagct 2760 ggacaagtaa tcaaacacaa agctcgtcta gtggcgcaag gctattccca gcaatttggc 2820 gaagactatg atcaagtatt tgctccggtc acaagccata caacatttcg tttgatgctc 2880 gctatagctt ctaaaacaca gatgaaatta cggcatttag atattaaaac ggcctattta 2940 tatggtgatc tagatcagga gctttttatg cgacaaccac ctggatacga gayaaaaggc 3000 aaagagcatt tggtttgtcg attgaaaaag agtatttatg gtctgaaaca atcagctcga 3060 tgttggaacc agaaactgca cggtgttctg ctagagattg gcttccaaca aagtgctgct 3120 gatcagtgtc tgtacattaa aactgaagat ggaaaaagag tctacatttt agtgtacgtg 3180 gatgatatga tagtcggttg tgtggacgag actctcattg attctgtgta tcacgcttta 3240 accgaacatt tcgaaatgac ggacctggga ccagttagtt actttctggg aatggaggtt 3300 aaatrtgaaa aaggtaacta cagcgttagc ctcgaaggtt acattgaaaa attgattcgt 3360 aagttcggat tgagcgaagc aaaaactgcg aaaacaccga tggatgaagg atttttgaag 3420 cagcaagact caagctctat tttgaaagac tctactcaat atagaagtct agttggtgct 3480 cttctataca tatcggtgtg tacgcgacca gatattgctg taagtacggg aatacttggt 3540 cgtaatgtta gtaatcctac tgaatcatgc tgggttgcgg ctaagcgtgt tgtaagatat 3600 ttaaaagcaa ctaaacattt taagctcact ttcaacaaag ctggtagcga tttgattggt 3660 
tattctgatg ctgactgggc aggtgatact ataacaagaa aatcgacttc cggatatgtg 3720 tttttctatg ctagtggagc tgtgtcatgg gccagtcgca aacaaaccag cattgcatta 3780 tcatcgatgg aatcagaata tatttcctta agtgaagcta ctcaagaaca aatgtggctt 3840 actcgattga tgaaagactt aggagaacat attgaaaacc ccgttaaaat ctttgaggat 3900 aaccagagtt gcatttgttt cgtcaactct gatagaacca atcgtcgatc gaaacacatt 3960 gaaacaaaag aacactttgt caaacaacag tgtgaatcta gaaaaatgat gcttgaatat 4020 tgtcccacgg aagagatggt tgcggacatt ctaacgaaac cactaggagc aacaaaacaa 4080 agaaaattta cggagatgtt agggcttcat ggcacacgtt gaggaggag 4129 // ID GYPSY25-LTR_AG repbase; DNA; ANG; 234 BP. XX AC . XX DT 05-FEB-2004 (Rel. 9.01, Created) DT 05-FEB-2004 (Rel. 9.01, Last updated, Version 1) XX DE GYPSY25-LTR_AG is an LTR of retrotransposon GYPSY25_AG - a DE consensus sequence. XX KW Gypsy; LTR Retrotransposon; Transposable Element; 5-bp TSD; KW GYPSY25-I_AG; GYPSY25-LTR_AG; GYPSY25_AG; Gypsy clade; KW mag lineage. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-234 RA Tubio M.J., Costas C.J. and Naveira F.H.; RT "GYPSY25_AG, a member of the mag lineage of the Ty3/gypsy group RT of LTR retrotransposons in Anopheles gambiae."; RL Repbase Reports 4(1), 16-16 (2004). XX DR [1] (Consensus) XX CC GYPSY25-LTR_AG is a long terminal repeat of GYPSY25_AG (its CC internal CC portion is deposited as GYPSY25-I_AG). XX SQ Sequence 234 BP; 72 A; 45 C; 69 G; 48 T; 0 other; tgttatggcc caacctgcct aacggctaag ggcactcacc gaagacaggc aatggtcttc 60 gggtgatgtg tgagtgtagc aggtcgggcc agagagagag ccgagcggca aagtgcatta 120 ttatactgta ggctatcgca cgggcaacaa gaggagatag ataaaggcat attccatata 180 ttcagaaaga cgaaagtgta ctgtattcga tagaagggtg gcgatatcac aaca 234 // ID CR1-5_AG repbase; DNA; ANG; 4525 BP. XX AC . XX DT 12-MAR-2003 (Rel. 8.02, Created) DT 19-MAY-2005 (Rel. 
10.06, Last updated, Version 2) XX DE CR1-5_AG is a CR1-like non-LTR retrotransposon - a consensus DE sequence. XX KW CR1; Non-LTR Retrotransposon; Transposable Element; KW AP endonuclease; CR1 clade; CR1-5_AG; DNA/RNA-binding; PHD finger; KW reverse transcriptase. XX NM CR1-5_AG. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-4525 RA Kapitonov V.V. and Jurka J.; RT "CR1-5_AG, a family of CR1-like non-LTR retrotransposons from RT African malaria mosquito."; RL Repbase Reports 3(2), 16-16 (2003). XX DR [1] (Consensus) XX CC CR1-5_AG is a young family of CR1-like non-LTR retrotransposons. CC The CR1-5_AG consensus sequence was reconstructed based on CC multiple alignment of ~50 copies identified in the CC sequenced portion of the genome. Given the ~2% divergence CC of these copies from the consensus sequence, transposition of CC CR1-5_AG occurred less than 1 million years ago. CC The 3' terminus of CR1-5_AG is composed of the ATAAAC CC microsatellite. CC CR1-5_AG encodes two protein sequences: a 378-aa CR1-5_AG-ORF1p CC (positions 228-1361) and a 968-aa CR1-5_AG-ORF2p (positions CC 1492-4395). CR1-5_AG-ORF1p is a DNA/RNA-binding protein composed CC of the PHD domain (aa positions 5-60). CR1-5_AG-ORF2p is composed CC of the AP endonuclease (aa positions 1-250) and reverse CC transcriptase (aa positions 520-750) domains.
XX FH Key Location/Qualifiers FT CDS 1492..4395 FT /product="CR1-5_AG-ORF2p" FT /translation="MQQAPTDRAACITVYYQNVRGLRSKADEFRLSVLETE FT YDVLVLTETWLDPCIPTALLLGDEYRVYRCDRDATNSSHSRGGGVLIACNV FT SLLSYVLPTLPQLLELTCVCIQLCDHRLFIAAAYLPPNHSMNEEKISALID FT FVANICTTLGPRDRFMLIGDFNQSTLSWTSASEENGAAFDFYEPHARSARS FT VQFVDGLHQSGLYQLNFNTNSSGRILDLIYANWPAASTCSSIRVCEYPLTT FT IDEHHPPLDFDLDNVAPVTIATAIDADKRLNYARVDLPKLERLILSFDNSF FT NCSDYSTIDDATEAFCEFMRSAICECTPVKIPRRGPPWADRTLNALKKEKR FT KAYSDQRAERNSTSRLYYNRVHSLYRRYNRSCHRSYLKQTARSLCKYPRRF FT WSYMDKKRKSAGLPSIIRYDGDSACSLPEMCKLFALRFKDNFASQTTGPED FT VADALSNTPVGALLPMLPVITVDTITSAIKRVKSSYTPGPDGIPAVILKWC FT ASALAPSLMKIFKESLRCGTFPATWKSSWMTPIYKKGCKNDAVNYRGITSL FT SVCAKVFEMLIYEPLLASACNYISVNQHGFVPRRSTTTNLLEFVSKCHKSI FT DNGLQLDAIYTDIKAAFDSVSHSILLAKLDLLGLPNPMIMWLRSYLTDRQY FT SVKLGPYMSSPVHASSGVPQGSNLGPLLFLLFINDATLILPADNHLLYADD FT AKIFRVIREPEDHARLQTSLHEFQCWCNRNALSLCTHKCEVITFSRSRCPS FT LYEYALDGQSLARKQCVKDLGVLLDTKLSFKDQLDHVVATSNRMLGLVINM FT TRELNDIPCTKALYCSLIRSLMEYANIVWWPTAARPLARLESIQRKISRFA FT LRSWNQRLDYRTRCLLLGLPTLSERIRKARLSFITGLLDGRIDSPSLLAAI FT NLYVPARPLRTRAMLALDDRRTQFGSSDPFLLMCRAFNAVSDAFEPRISPT FT EFNDRVSVLNLVP" FT CDS 228..1361 FT /product="CR1-5_AG-ORF1p" FT /translation="MEDVCYACSEALGPVEGSITCLYCEGTYHLACTKVPI FT SVIDEVKRMASLHWSCIGCTNAIGNPRSKAIKGMGMQVGFQAALTAAVDAM FT KASLVPPVIQEIRDGFAGIATAHSASLQHSNQLLPDALPNGKRRRLFRDAV FT ASSDAVFVDSAAISNVTNTNNTHRIDRPSLPPIITGTNTTSASIPTVSQTP FT RTDYLWLHLSRLAASVTIEQVVSLVCLQLDTADAIAFSLLKAGTVPSPLSA FT VSFKVRIPAALRSKALNAASWPVGLGVREFISLPPRSLHSRTQHTNTYTPI FT PQQPRTQQTPAPLQPRTQHSPAPQQPCTQHNNPHNNPYSPAHIHEPARTQT FT EQSPTDMNCTLPSPTLPSTSRSKTLQTTLVQFFPK" XX SQ Sequence 4525 BP; 1158 A; 1175 C; 947 G; 1245 T; 0 other; tcagctttga cagttgtcaa catagcgtgt ctactgctat cgctccgcat ttatccgcta 60 ttttattggt gtaatctggt tttgctataa tcaacacaag ttttagtgtg tccataagtg 120 tagtctcgtg accctgccat cgtgttcctg acgaaaacca ccgtgttcgc gatatatttg 180 gcttttattt gtggctcctt cgacgttcat ctgcagcggc atctaaaatg gaggatgtgt 240 gttatgcatg ttctgaagca ctgggcccgg ttgaagggtc 
aattacgtgc ctatattgtg 300 agggtacata ccatctggcg tgcactaagg tgcctatatc ggtcatcgac gaagtgaaac 360 ggatggcttc cctgcactgg agttgcattg ggtgtactaa cgccatcggg aatccgcgta 420 gcaaggcgat caagggcatg ggtatgcagg ttgggtttca ggcagccctc acagctgctg 480 ttgatgccat gaaggctagc ttggtcccac cggtaataca ggagattcgg gatggctttg 540 ccggaatcgc cacggcccat tcggcatcct tgcaacatag caatcaattg ctcccagatg 600 cactaccaaa cggaaaaaga cgaagattat ttcgtgatgc agttgcatcc tcagatgccg 660 tgtttgttga cagtgcagca atctcaaacg tcacaaatac aaacaacact caccggatcg 720 atcgtccttc actaccacca ataatcacag gtactaatac aacctccgct tcaataccaa 780 ctgtttcaca gacaccaaga accgattatt tatggctgca tctttcaagg ttggctgcgt 840 ccgtcaccat agagcaagtt gtctcgttgg tgtgtttgca gttggacact gcagacgcaa 900 ttgcttttag tctgctaaaa gcaggaacgg ttcctagccc attgagtgct gtatcgttca 960 aggtcagaat ccctgctgcc cttcgtagta aggcacttaa cgcagcctcc tggccggtcg 1020 ggctcggtgt gcgtgagttc atctcgctcc cgccgcgatc tctacattca cgaacacaac 1080 acacgaacac ttacacaccg ataccgcaac aaccccgtac tcagcagaca ccggcaccac 1140 tacagcctcg cactcaacac tcaccggcac cacagcagcc ttgtacccaa cacaataatc 1200 cccacaataa tccatactca cctgcacata tccatgaacc cgcacgaaca caaaccgaac 1260 aatcaccaac cgatatgaac tgcactctcc cttcacccac tcttccgtcc acgtcacgtt 1320 caaaaacact tcaaactaca ctggttcaat tttttccgaa gtaaccgcac tacagcagcg 1380 cacccatatt ttgcgcatac ggctcgagca aatccttcga caacaaacac aatacactca 1440 ccttctacga taagctcgct caattgcaac accacagcgc ctactaacac tatgcaacaa 1500 gcacccactg atcgagcggc ttgtatcact gtgtactatc agaatgtgcg tggcctgcgt 1560 tctaaagccg atgagttccg tttgtcggtt cttgagacgg aatacgatgt attggtactc 1620 actgaaactt ggctcgatcc ctgtattccc accgctctcc ttctaggaga tgagtatcgt 1680 gtttaccgat gcgaccggga tgccactaac agctctcact ctcgtggggg tggtgtttta 1740 attgcatgca acgtttctct tctctcttat gtgcttccta cgttgccgca actgttagaa 1800 ctcacttgcg tatgtattca gttatgcgac caccgtttgt ttattgccgc tgcctacctg 1860 cctcctaacc atagcatgaa tgaggagaaa ataagtgcat tgatcgactt cgttgctaat 1920 atctgcacta ctcttggtcc gagagacagg ttcatgctta ttggtgattt caaccagtct 
1980 acgctatcct ggacatcggc gtcagaggaa aatggagctg cttttgattt ctatgagcct 2040 catgcacgct ccgcgcgcag tgtccagttc gttgatggtt tgcatcagag tggactatac 2100 cagcttaact tcaatactaa ctcatcgggg cgaatacttg acctaattta cgcaaattgg 2160 ccagctgcat ctacatgttc ttcaatacgt gtctgtgaat atccccttac tactatcgac 2220 gaacaccacc ctccgcttga ctttgatttg gacaacgtag ccccagttac gatcgctaca 2280 gccattgatg ctgataaaag gctgaattat gctcgtgttg accttccaaa attggagcga 2340 ttgattttat cttttgacaa ttcttttaac tgttcggact attcaaccat tgacgatgcc 2400 acagaggctt tttgtgagtt catgagatct gctatatgtg aatgcactcc tgttaaaatt 2460 cctcgtcgtg gaccaccatg ggctgatcgc acactaaatg cattaaaaaa agaaaaaaga 2520 aaagcctatt ctgatcaacg agcagaacga aacagcacgt cacgtcttta ttacaacaga 2580 gttcattctc tctatcgccg atataataga tcctgtcacc gaagttattt gaaacaaact 2640 gcacgttccc tctgcaagta ccctaggcgc ttttggagtt atatggacaa aaaacgaaaa 2700 tctgctgggc taccaagcat tatacgttat gacggtgaca gtgcctgttc cctgccagaa 2760 atgtgtaaat tgtttgcctt gcgcttcaag gacaacttcg cttctcaaac tactggtcca 2820 gaagatgtgg ccgatgctct ctctaacaca cctgttggtg cactgctccc tatgctacca 2880 gtcatcactg tcgatacgat cacctctgcc atcaaacgtg tcaaatcctc atatacccct 2940 ggcccagatg gtataccggc cgttatccta aagtggtgtg cttctgcgct tgctccctcg 3000 ctgatgaaga tttttaagga gtccctgaga tgtggaacct ttcctgccac ttggaaatct 3060 tcctggatga cacccatcta caagaagggc tgtaagaatg atgctgtaaa ctatcgtggc 3120 ataacctcac tcagcgtgtg tgcaaaggtg ttcgaaatgc taatctacga gccattgttg 3180 gcctcggcct gcaactatat tagtgtgaat caacatggtt ttgtacccag gcgatctaca 3240 acgactaatc tactagaatt tgttagcaaa tgccataaat ctatcgataa cggtttacaa 3300 ctagatgcta tttatacgga tatcaaggca gcattcgaca gtgtatcgca ctccatcttg 3360 ctagcgaaac tcgacttact tggtctccca aaccctatga taatgtggct tagatcgtac 3420 cttacagatc gtcaatattc tgtgaagtta ggcccgtaca tgtcaagtcc agtgcatgca 3480 tcttctggag tcccgcaggg cagcaacctg ggtcctttgc tgttccttct tttcatcaac 3540 gacgcgacat tgatccttcc ggctgataat catctactgt acgcagatga cgcgaaaatt 3600 tttcgtgtta ttcgtgaacc agaagaccac gcccgactgc aaacttcctt gcatgagttc 3660 
cagtgttggt gcaaccgtaa tgctttatcc ctatgcacac ataaatgtga agtcatcact 3720 ttcagtcgtt ctcgctgccc atcattgtat gaatatgcgc ttgatggaca gtccttggcg 3780 cgaaaacaat gtgttaagga tctaggtgtt ttgctcgata caaaactatc atttaaggat 3840 cagctggatc acgtagtagc caccagcaat agaatgctag gactagttat caatatgact 3900 cgcgagctta atgatatacc ttgcaccaag gcgctctatt gctcacttat ccggtcgttg 3960 atggaatatg caaatatcgt atggtggcca actgcagcgc gtccgttagc tcgattggaa 4020 tcaatccagc gcaaaatttc acgatttgca cttcgctcat ggaaccaaag gctcgattac 4080 cggactaggt gtttactact cgggctaccc accctaagtg agcgaatacg aaaagccagg 4140 ttgtcgttca tcacgggact tctcgacggc cgtattgact ctccgtcact actggctgcc 4200 atcaacctgt acgttccggc caggccgctc cggactcggg caatgttggc cctcgacgac 4260 cgtagaacgc aatttggctc ctctgacccg ttcctactca tgtgccgtgc tttcaacgca 4320 gtcagcgacg cttttgagcc gaggatttcg ccaactgagt ttaatgatcg tgtttctgtg 4380 ttaaatttgg ttccatagtg cacattttta tgttctatgt tattgtaatc cattgtaaag 4440 aactcattgt aaaacattgc taaaatggtt cgagagggca ttattgtcca tcgatagaca 4500 aataaacata aacataaaca taaac 4525 // ID Mariner2_AG repbase; DNA; ANG; 2000 BP. XX AC . XX DT 28-FEB-2009 (Rel. 14.02, Created) DT 02-MAR-2009 (Rel. 14.02, Last updated, Version 1) XX DE Mariner DNA transposon - a consensus sequence. XX KW Mariner/Tc1; DNA transposon; Transposable Element; Mariner2_AG. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-2000 RA Jurka J.; RT "Putative mariner/Tc1-like DNA transposons from African malaria RT mosquito."; RL Repbase Reports 9(2), 646-646 (2009). XX DR [1] (Consensus) XX CC TA TSD (included). 
XX FH Key Location/Qualifiers FT CDS 624..1625 FT /product="Mariner2_AG_1p" FT /translation="MGRSAHSTQQQRLDIKRLSAAGYSQRKIAEILGRSKT FT FVYNALHSTGTKIPTGRPRKTSARDDARMTRLCKADPFKSARAIRDELQLS FT VSDRTVQRRLFENNLVGRNPRKVPLLRRCHVQARLQFAREHYDWAGENLNK FT WRNVLWSDESKVNLVGSDGKRFVRRPKNTAYRPQYTLKTVKHGGGNIMVWA FT CFSWYGVGPIFWIKDIMDQHRYLNIIQTVMLPHAEWEMPLKWQFMHDNDPK FT HTAKAVKKWFVDQKIDVMNWPAQSPDLNPIENLWKIVKAKLPPAGSRTKEK FT LWKHIENAWYSIPPSTCKSLVESMPKRMRAVIRNNGHATKY*" XX SQ Sequence 2000 BP; 606 A; 388 C; 426 G; 569 T; 11 other; tacagtgtcg gacaaatcra taggaccact tatcaattta ttattcactt atcatatttc 60 atgwyttaaa gtgaatcaat ctttaccaaa ctttcatgaa agcwytatta gaatgtttac 120 tttaagaatg tttttaaagt ttggtgaaaa gttaagaaac aaaaaagtta tgataagaaa 180 tgtaaaaata acccaaaaat catgcgccaa aataatagga ccagttcaga aaattttaaa 240 ataattatca tttattattg ttaaaacaca gtttttgatg ttattgagtt gtcggataac 300 ttaacttaaa ttagtgaact aatgcgtatt catttttttt gtagaaacgc ttagcgcagt 360 ggcaattcgc tgtgtcttta attaacatgc tctaatttac taatgcaaga atattacggc 420 actcgcttaa gggttttgag ctccctttgc actgagagac gaaatagtgg tgagtacctt 480 cgtctayggg tataaaagcg cgtcacagcg ctggatcgct tacactgtgt tctaagccgt 540 tgcctagaac acttcacacg atccattagc tcgtgtattt ttgggtgcac tacttcaagc 600 gcttgattga aggaattcca acgatgggcc gtagtgcaca ctccacacaa caacagcgtt 660 trgatatcaa acgtttgtcc gctgctgggt actcccagcg taaaatcgcg gaaatacttg 720 gccgatccaa aacattcgtg tataacgcgc tgcatagcac gggcacgaaa ataccaacgg 780 gtcggccacg caagacatcc gctagggatg atgctcggat gacgcgattg tgcaaggctg 840 atcctttcaa gtcggccagg gcaattaggg acgagctgca actttccgta agtgatcgga 900 cagtgcagcg tcgcctattt gaaaacaacc tggtcggccg gaatccccgc aaagttccgt 960 tgcttcgacg gtgccacgtt caagcccggt tgcagtttgc tcgagagcac tacgactggg 1020 caggtgaaaa cctgaacaag tggcgaaatg tactgtggtc ggacgagtcc aaagtgaact 1080 tggtaggttc ggatgggaaa cgctttgtgc gtcgtccaaa gaataccgcc taccggccac 1140 agtacacttt aaaaaccgtg aagcacggcg gaggcaacat aatggtttgg gcatgtttct 1200 cttggtacgg tgtcggtcct atcttttgga taaaagatat catggaccaa caccgatact 1260 taaacatcat 
ccaaactgtc atgytgcctc atgcagagtg ggagatgcca ttgaaatggc 1320 aattcatgca tgacaacgat ccaaaacata cygcaaaagc ggttaaaaaa tggtttgttg 1380 atcaaaaaat tgacgtcatg aattggcctg cacaatcgcc tgacctgaat ccgatcgaga 1440 acctgtggaa gattgttaaa gctaagcttc caccggcagg ttctcgaact aaagagaaac 1500 tgtggaagca tatagagaat gcttggtaca gcattccgcc atcgacttgc aaaagtttgg 1560 tggaatccat gcctaaaagg atgcgagctg tcatccggaa caatgggcat gctaccaaat 1620 attaaatgct accaagttag cccaggataa ggctgttttg ttattttctt tagtttttca 1680 aatgtacggg ctggccacat tgcgaattaa aagccggagc tgtaattatt tataggcaat 1740 aatgataaat ggttaaaata ataaaatttt ctatactggt cctattattt tgtcgcatga 1800 tttttgggkt tttttyaaca tttcttataa taactttttt gtttgttaac ttatcaacaa 1860 actttcaaag cagactttaa gccaacatcc gaaaagtgct ttcatgaaaa tttggtaaat 1920 attgattcac tttaggtaat gaaatgtgat aagtgaataa aacattatta agtggtccta 1980 ttgatttgtc cgacactgta 2000 // ID BEL7-I_AG repbase; DNA; ANG; 2613 BP. XX AC . XX DT 03-APR-2003 (Rel. 8.03, Created) DT 03-APR-2003 (Rel. 8.03, Last updated, Version 1) XX DE BEL7-I_AG is an internal portion of the BEL7_AG LTR DE retrotransposon - a consensus sequence. XX KW BEL; LTR Retrotransposon; Transposable Element; 5-bp TSD; KW BEL7-I_AG; BEL7-LTR_AG; BEL7_AG; Bel clade; PHD domain; KW peptidase A16. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-2613 RA Kapitonov V.V., Pavlicek A. and Jurka J.; RT "BEL7_AG, a nonautonomous family of Bel/Pao-like LTR RT retrotransposons from African malaria mosquito."; RL Repbase Reports 3(3), 43-43 (2003). XX DR [1] (Consensus) XX CC BEL7_AG is a young nonautonomous family of Bel/Pao-like LTR CC retrotransposons. CC BEL7-I_AG, an internal portion of BEL7_AG, is flanked by CC BEL7-LTR_AG LTRs.
The BEL7-I_AG consensus sequence was reconstructed based on CC multiple alignment of 11 copies; they are less than 1% divergent CC from CC the consensus sequence. CC The consensus sequence encodes (pos. 393-2552) a 720-aa BEL7_AGp CC protein composed of the PHD and peptidase A16 domains (pos. 18-62 CC and 111-260, respectively). CC The reverse transcriptase and integrase domains are not present; CC therefore, this family is a nonautonomous one. XX FH Key Location/Qualifiers FT CDS 393..2552 FT /product="BEL7_AGp" FT /translation="MTTKRNLTPFVDNDNGSCRLCSRIDDPNMVQCDECDR FT WFHMACAKLSRLPKADEPFLCIKCTKDYAKIKAPASTGSDQNTAIAALIEA FT LKGSSLDTNTYLKRLTFDQLPDFDGKAKEWLKFKRAYEETTKQAKYTNVEN FT MTRLQHALKGEAYKCVHRLFLEPDNVPEIITKLEEQFGRAELVYDELLKDV FT QNIRVENQHKIPDLSDAIQDMITNIKAINMPMYLQDHRLVNELACKLPTDR FT HLKWIEYKSTNIKPGVLPSLEDFGKWLLPQAKVLKELPKRPERPKHSMNVH FT QSSAHNTKRPFNHNHFGNAQNNSPATNTYARNAIDYTAQNCTAARERGPHP FT CEETHHRLHFTADAEENCHQEQKILYQIVPVVLRNEDKTLKTFALLDSGSS FT FTLIEEETANQLQLEGPTEPITMTWTQNLSIREKESRRVSCLVKGEKEKKE FT RVLEKMRTVKNLQLPRQSINCNVLREKCPHLRGIDISDYESARPTILIGLD FT HSHLLIPLGRKMGRSDEPMAIKTKLGWTVFGMNVKGHEAPHNQPMIHYDAD FT STMISKEIDRGKEIIQKPFVETVPNRAWSEDQRIAEPDEEDRGISIPVLHH FT HEAVVPDSSMAPLPLCGTHPHNPSPLHTGVDHLEPPDITTQRFVEREVRQP FT CPGESAAVLMEMRINPQPYVHPPLTNEEDEILSRRGIGSLTTKTLRHRGRP FT HRRRALLRPAIGLASNAGSQ" XX SQ Sequence 2613 BP; 818 A; 650 C; 606 G; 539 T; 0 other; catttgtatg gtagagagat ttgcccgtgg tatgaaagct tccgcaaatt ctctagcctt 60 ctacattctc accgaagaga aattgcttcg ctcctagacg aacctatcct cgctccagag 120 gaagcactca cactaggagg aagacagcat cgcttccgat aggattgctt ccgctctcca 180 cctacgcttt tgcactcacg catacggctc cactaacacg tgtgcagctc cagcgagctg 240 cgtatgcgat agcgtgagag agcactacga gagagctcca agggagagcc ctttgcgcac 300 cgttgaacca gtgcgacatc gcacacccac accactgata aaaatctgtg cactatacac 360 gcacacgaac tcatcggcac attagctacg tcatgactac gaaaagaaac cttaccccgt 420 tcgtcgataa cgacaatggt agctgccgct tatgcagccg gattgatgat cccaatatgg 480 tccaatgcga cgaatgcgat cgctggttcc atatggcatg cgccaaactg agccggctgc 
540 ctaaagcgga cgaaccattc ttgtgtatta aatgcacaaa agactatgct aagattaagg 600 ctccagcgtc gactggctca gaccaaaaca cggcaattgc agccctaatc gaagcgctta 660 aaggtagcag cttagataca aacacatatc tcaaacgctt gacctttgat cagctacctg 720 atttcgatgg caaggccaaa gaatggttga aattcaagcg tgcgtatgag gaaacgacta 780 agcaggctaa atatacaaat gttgaaaaca tgacccggct tcaacacgcc cttaagggag 840 aggcgtataa atgcgtgcac cgtctatttc tcgagccaga taacgtgccg gaaataatta 900 cgaagctcga agagcaattc ggcagagcag aactcgtgta cgacgagctg cttaaagacg 960 tgcaaaacat acgtgttgaa aatcagcaca aaatacctga tttgtccgac gcgattcagg 1020 acatgatcac taacataaag gcaataaata tgcctatgta tttacaagac catcgcctcg 1080 taaatgagtt ggcttgtaaa ctgcctactg ataggcatct aaaatggatc gaatataaat 1140 ccaccaatat caaaccgggt gttcttccat cgctggaaga ttttggcaaa tggttgctgc 1200 ctcaagcaaa ggtacttaaa gaattgccaa aacggccaga aaggccaaaa cactcgatga 1260 acgttcacca gtcttcagcg cacaacacaa aaagaccgtt caatcacaac cactttggta 1320 acgcacaaaa caattcaccg gccactaaca catacgcaag aaatgcaata gattacaccg 1380 cacagaattg cacagccgca cgagaacgcg gcccacatcc atgcgaagaa acgcatcacc 1440 gtttgcactt cacagccgat gcggaggaga attgtcatca agaacaaaaa atactttacc 1500 aaattgttcc ggtcgtgtta cgcaacgaag ataaaacgct gaaaacattt gcattgttag 1560 attcagggtc atcatttacg ttgatcgagg aagaaacagc taatcagctt cagttagaag 1620 ggccgactga gccgattacg atgacgtgga cgcagaattt gtccatacga gagaaagaaa 1680 gtaggagagt aagttgccta gtgaaaggag agaaggagaa aaaagaacgg gtcttggaaa 1740 aaatgagaac cgtgaagaat ctccagctcc cgaggcaatc aatcaattgc aacgttctca 1800 gagagaaatg tcctcatctg agaggaattg acatcagcga ttacgagagc gcaaggccaa 1860 ctatactaat tggtctcgat catagccatc tactaatacc actagggcgc aaaatgggcc 1920 gaagcgacga gccaatggca atcaagacca agctgggttg gactgtcttc gggatgaacg 1980 tgaaaggaca cgaagcgcct cataatcagc cgatgatcca ctacgatgca gatagtacga 2040 tgatatcaaa agagatcgat cgtggaaagg aaataatcca gaaaccgttc gtggaaacgg 2100 taccgaatcg agcttggtcc gaggaccagc gtatcgcgga accagatgaa gaagatcgcg 2160 gtatatcgat ccccgtactt caccatcatg aagctgtggt tccagattct tctatggcac 2220 ccttgccatt 
gtgtggaaca catccccaca acccctcacc actacacacc ggtgtagatc 2280 atcttgagcc ccctgacatc accacgcaac gatttgtcga gagggaagtg cgacaaccct 2340 gcccaggcga atcggcagcg gttctgatgg agatgaggat caacccacaa ccatatgtac 2400 atccaccgct taccaacgag gaagacgaga tcctgtcgag acgaggaatt ggctccctga 2460 ctacaaagac tcttcgacac agggggaggc cccaccgacg gagggctttg ctgcggcctg 2520 cgattgggct agcgagcaac gcaggcagcc agtgaaataa tcgctcactt ttttattttc 2580 gtaattcacg agaatgaatt acgaggggaa gga 2613 // ID Clu-47_AG repbase; DNA; ANG; 1048 BP. XX AC . XX DT 03-SEP-2010 (Rel. 15.09, Created) DT 03-SEP-2010 (Rel. 15.09, Last updated, Version -1) XX DE Putative non-autonomous DNA transposon: consensus. XX KW DNA transposon; Transposable Element; nonautonomous; Clu-47_AG. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-1048 RA Fernandez Medina R.D., Struchiner C.J. and Ribeiro J.M.C.; RT "Transposable elements from Anopheles gambiae."; RL Repbase Reports 10(9), 1446-1446 (2010). XX DR [1] (Consensus) XX CC 2 bp TSD. >97% identical to consensus. 
XX SQ Sequence 1048 BP; 246 A; 232 C; 286 G; 280 T; 4 other; ccctactagc aatgagaatg aaaaagtgtg cgactttcag gcgaggggca tgtagaagca 60 tggggagacg gttggaaatn cgcggaattg cncggaggcg cgagcgtgtt ttggtttcta 120 cacagttttt gcactcacac aattacacat gtgtgaatgt gtgcgttcac tttcagcctg 180 cgcgctcagt ttggttttct cgcagcggca tgcgggtatc ggtgtctcgc tcgcactagt 240 gcagcgagtg ttcctctgtg cgctcccctt cttgccacca cacgaaatcg tcagttcagc 300 tcaagcgttc gccgtgaaag gctgcgcgcg tgttagccta agtcccgggc gatcgaagtt 360 tcgctaagtg ttgaagtgtt gtgcacattg aaatgaagct gcgcttttct ttcaccattt 420 atcctaaatg tgtgatttgc gctgcttcat ttcaatgttt gtttttgctg tatcgaacat 480 caacaacggg atccacgcag ttaggcggct atttnaaata cggcgatttt ttttatataa 540 agaatgtgtg tgattttgtg gcaagattgt gtgaagctat gtgtgaaaat gcccctttat 600 ttgcaataaa agtgataaaa tataaaccaa gtttgatgtg ttcaagtttt tgttttgatt 660 cgggtgcacc aagtccgaat gaagggaagc gcacagcgcg ctacgaaaga aacgagcggg 720 agggggtgtc agtccagcgc agcgcaattc gctggaatca agtgggaggg ggtaccagtt 780 cagcgctgcg caagccgaaa gaaacgggaa ggaaggggta tgaggaaaat acgatgaaac 840 agcgcgagcg agcgttcttt acagcgagag cgcctcgtgt attttaaagg caagcaagag 900 agcgcattct ctccgtggta gcgctccatc cgccattttt aatttcgagc ccaaaacaac 960 ggacttcgga tccgancttg ggccgggccc ccaccgcgtg tttctaccta cccctcgcct 1020 gaaaatttcg ttctctttgt ctgtcggg 1048 // ID MARINERN5_AG repbase; DNA; ANG; 1172 BP. XX AC . XX DT 12-MAR-2003 (Rel. 8.02, Created) DT 12-MAR-2003 (Rel. 8.02, Last updated, Version 1) XX DE MARINERN5_AG is a nonautonomous DNA transposon - a consensus DE sequence. XX KW Mariner/Tc1; DNA transposon; Transposable Element; Nonautonomous; KW MARINERN5_AG; nonautonomous DNA transposon; KW mariner/Tc1 superfamily. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-1172 RA Kapitonov V.V. 
and Jurka J.; RT "MARINERN5_AG: a family of nonautonomous mariner/Tc1-like DNA RT transposons from African malaria mosquito."; RL Repbase Reports 3(2), 22-22 (2003). XX DR [1] (Consensus) XX CC There are ~100 copies of MARINERN5_AG in the genome. CC They are ~98% identical to the consensus sequence. CC MARINERN5_AG copies are flanked by 2-bp target site duplications. CC This element has imperfect 26-bp terminal inverted repeats. CC Putative classification: a nonautonomous Mariner/Tc1-like CC DNA transposon. XX SQ Sequence 1172 BP; 342 A; 253 C; 274 G; 303 T; 0 other; cccagatagc aaagccgaag gcctttgtat gggagccatc gcgctgatca aggctcactt 60 tttcgcgcgc ataaagcaaa atgtgcgcga aggcgaaaaa aaacccgaaa gcgcttcgtg 120 tggccctctc tcgtgtttct agttttattc tctctcgcat gtcatcactc tctcccagtt 180 tttcggattg aaatgagaat tcactctctt gatttggttg cggcagtcta gctgcttcac 240 actctttatt ctcttctttt tgcagttccc acaagaggat tcacacatgt gttctcacat 300 acttacatac acacacacac ggtggttgga agcctggtgc ggctgtttcc aattctgtct 360 tttcttcttc cctcttcaca ccgcgctgga gcccgtggtg gtgcgcgtgt tcagcctgct 420 tggttcagga aagatgtcat cagatttggc gcgcctcaac cgcggtgttg tatgaaggct 480 gttgatgaaa atacacaatt aatagacaca ctttgttaaa aagattacac ttcatttaaa 540 gatgtgacaa acttaccttt gtgcttgagg atcaattgaa acttgaacgc ctgtgcacac 600 agtgtcacac tttttaagta gtttgaggaa cagctagcgg tagcagcaca catctgaaac 660 ttgtaatgtg caaaacattg catattagaa taatattaga attagaataa atgtcacata 720 gcactagaca cttagaaaag gacacttacc caaacaatta gatgataaaa attcgtcctc 780 cgatgaagca gaaagaacac cgcggatggc gcagtatgga aagtgtttca cgcaggaaaa 840 atagttcacg cagcacacaa acacacactc tcacatccac ggttcagagc acactgaagt 900 cgctctttgg cttggatgga aaagagcaaa caccctgatg agtgcgatgg aagaaaatga 960 cgtgtgttca acagggatgg gtagaatcca acacacgaag tgtgcaggag aattctcttc 1020 gtgtggcaag agagaattga ctaacaccct gatgagcgag aacgcaacaa acgttgtgga 1080 atgtagaaac gggataggaa tagtaagcga aggcagtttg tggacatgag ggggttgaaa 1140 ctgggagaaa aaagcttcgg ttgctacttg gg 1172 // ID AARA8_AG repbase; DNA; ANG; 4188 BP. XX AC . 
XX DT 09-DEC-2004 (Rel. 9.11, Created) DT 29-OCT-2010 (Rel. 15.11, Last updated, Version 2) XX DE Mosquito I clade non-LTR retrotransposon, partial sequence - a DE consensus. XX KW I; Non-LTR Retrotransposon; Transposable Element; KW reverse transcriptase; AARA8_AG. XX NM AARA8_AG. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RA Cook M.J., Martin J., Lewin A., Sinden E.R. and Tristem M.; RT "Systematic screening of Anopheles mosquito genomes yields RT evidence for a major clade of Pao-like retrotransposons."; RL Insect Mol.Biol 9(1), 109-117 (2000). XX RN [2] RP 1-4188 RA Biedler J. and Tu Z.; RT "Non-LTR Retrotransposons in the African Malaria Mosquito, RT Anopheles gambiae: Unprecedented Diversity and Evidence of Recent RT Activity."; RL Molecular Biology and Evolution 20(11), 1811-1825 (2003). XX RN [3] RP 1-4188 RA Gentles A. and Jurka J.; RT "Anopheles putative non-LTR retrotransposon."; RL Direct Submission to Repbase Update (30-NOV-2004). XX DR [3] (Consensus) XX CC This sequence is ~98% identical to Ag-I-3 in [2]. 
XX SQ Sequence 4188 BP; 1371 A; 1046 C; 677 G; 1090 T; 4 other; ttcactgtaa actgtcgcct acctgcacta actgtggtac ccctgcgcat ggagagtgac 60 cgtagatccc gtgtgcataa ctgcaaggac aaccacagca ccacgtctcg tacctgtccc 120 cctacctcca agaagaacag gtcatcaata aagatcgaca acaactgttc ttacggtgag 180 gcacgtaaag ctctgcttaa tgttaccaat ccacaaattc tagtcccatt caaaaccgca 240 tcacatacgc acaacagaca cacacataga tgaaaaagac acccaaatta agaactcaaa 300 aaaagaatga atctctagaa aaaatgataa ttttgcttac acaaacagta actaatctaa 360 caacaaaaat cgaagatcag gataaaacta tgaaaagaat gtgcaagaat tttgaaaaga 420 gtttagaaaa aaagaaaata gaatctcaaa tggaaattga agaggatgaa agtgaagaag 480 aattagacga ttttagcacg gaagaaagca cggaagaaag cactgaagaa gaaactgaag 540 aagataatga agacaaaaaa gtagaaaaag taaaacatgt aaaacaaaat gagtagagaa 600 caaaataaca ccactaacaa acggaaacgt tccaccaaaa gacaccccct cctcggaaga 660 ggaatcccaa ccccgaaaaa tcatccccat ctctctcatc ccaatccccc aaccctccca 720 catccaatct tctctctgtc gcctctaaaa aaactctaga tctaacacga aatctgtaga 780 ttaatttata atttcaatat taacaatatc cttctgatta taagcattaa cttttatcca 840 tttttccttt actacataat tcaattcatc atatcaatct aataactatt catccatatt 900 tgcattatcc tggaatatca gaagcatgaa ccaaaatata gaagaactaa aactactaat 960 caatagacac gatccaacaa taatcaccct tcaagaagta atgcagttcc ctctctcctt 1020 agctctcctc ttaatagata caattggtac actaacatac acccaaccat acctcatcat 1080 agtgtagctg taggcatttt aaaacacata tcttcccgca gtagctgtag aaatttctct 1140 cccgtgcaat gctccatagt atgtttaatc tttctcattc aatcaaccct tttacaataa 1200 tttcgattca ctagtacttt cagcaaatta ataataattt aattgtttta ggggatttta 1260 atgcccaaca taccacctgg ggtgctcgac ctcatgtaac agagggaaca gcattgcttg 1320 tgtttttgaa acaattggtc tggaagtaat ttcaaaccaa tccgctcacg tatttctccc 1380 ataaacggca aaggctccat acttgactac tgcgctgtgt catcttctat ggcacagcag 1440 ttcacagtta cagtttctaa tgacactctg gtagtgatca ttttccactt ttaatccaca 1500 gtagtttaaa tccccagcgg ccacttctcc gccccaggtg gaaatatgag gaggccaatt 1560 gggcagcata tcaaagagaa ataatgttca atcttccact tgatgaaaat ccccctcttc 1620 tcgctttact gcatgcattg aatacgccgc 
ttctcggagt attccccgta caactggaaa 1680 acctggaaaa agatgtgaac catggtggaa cgcaacagtt gctgcagcta ttaaagctcg 1740 cagagctgct ttgcgcaaat tccgccgagc tagcaaaaat ccggataatt ttttcacacc 1800 catttttgct gaggaatatc gtacagccaa tcgacttgcc aaggaagcag ttcgtcttgc 1860 caaaaagaat aactgggata atttcattaa tgaaattaat cctcagttat ctagcaagga 1920 agtgtggagg cgagtcggtt gtttgaatgg taaaaatcaa caaagctcca caatagttct 1980 caaaacccag gacactatca ttgcccctgc tgaagtacct gaagcctttg ccatccactt 2040 ttctgatgtt tctgccacac acaattatcc gagtaatttt caaacccata aacttaacac 2100 tgaatctgtt ccaatttctt tccctgatgc cactgaacat agatacaata gtcttttcac 2160 aataactgaa ctccattggg cactaaggaa atgtaaaggt agatctgctg gtcccgataa 2220 cataggatat cccttactgc aaaacctcct gtagaatcta aatccaccct tcttaacatt 2280 tacaatagta tttggtcttc yggtaatatt cctaatgatt ggaaaagtag tttaactatc 2340 cccataccca aacctaacaa acctaacmac aacgttgata gctaccgtcc aatctccctc 2400 cttagctgca tgggtaaagt tttagaacgc atggttaatc gccgcctctc gcaagaatta 2460 gaagacagaa acctgcttag ctctgaccaa cacgcttttc ggagcgggtt gggcacggaa 2520 acatattttg ccaaattaga tgatactatt caaaaatcaa tagaccagga tcatcacata 2580 gactttgcca tcatagatat ttccaaagct tttgatcgca cttggcgcca ttccattctt 2640 tctcaactag ctttttgggg ctttggtggc cggctcacta atttcataga caattttttt 2700 acagacagat catttagagt cctaatcggt aatagtgttt ctaatccata tcctctagaa 2760 aacggtgtcc ctcagggcgc catcctttct cctacacttt tccttatcag tatagaatct 2820 ctgttttgct ccattccata tgaaatcaaa ccttttgtct atgccgatga tatcattctg 2880 gtctcttcat cgagatctgt tagcacgtct cgtcagcttc tccaaaaagg agttgacaag 2940 ataaggcaat ggtctcggtg gacaggtcat gaaatatccc attccaaatc tcaaatcctc 3000 catatctgca aaatgggctt tcatcgcaaa ctaccaatca aacttcacga ccacatcata 3060 ccaaatgtaa actcagcgaa aatattaggt gttacatttg acacaaaact tactttcatt 3120 ccccactcca accggataag aaaagaagca aaaaccagac tgaacctctt taggatgtta 3180 ggagctggac aacatcgtgc atcacgtcaa acacttctgc agatcctcaa tagctggctg 3240 cttcccaaaa ttctctaygg tattgagatt gtctctcgcc aacgagaaaa ctttgagaag 3300 cgtattgctc ccacatacca tacagccatc cgcctttcaa 
ctggagcctt ctgtaccagt 3360 ccaatccact ctctactttg tgagagcgga cttcttcctt ttgattacat tatcaccaac 3420 agattaacag cagcagcagg acgcatccta gagaaggaca tcaaagccga atcatttatc 3480 aacagagtca atgcacaatt tcataccctt accaacaacc agcttcccca tatcagcaag 3540 cttaccggac gcggaggacg cccttggtat cgtcaaccgc ctacaataga ctggtcttta 3600 aaagataaac tacgtgcagg aaattgcagc catatcgccg gccatcattt taccagtctc 3660 atccaaagca aataccagaa tcaccatcat atctttaccg acggctctgt cctcaatgaa 3720 tccgccggtt tcggtatttt ctcctccaac aacagttgtg ccatcaaact accagaccat 3780 acctctatat tctcagctga agcaatagcc atgatagtcg ccgcgcaaga aggtatttgc 3840 cttaataaac caaatattat tttcaccgac agtgccagtg ttctagccgc cctagaacac 3900 ggtaacatcc gagatccgca catcyaacta ttagatcaac tagataactc cccggttata 3960 acattctgct ggatacctgg tcactcgggt atcagcggaa acgagaaagc tgaccaactt 4020 gcaaaccagg gtcggctccg tcctccagaa aacaacacaa ccatagcccg cagagacttc 4080 aacaaatgga gcaaaacaat agttgacgaa aaatggaacc tcacctggca ccggaaacaa 4140 aattctttct tcgaaccatc aaaccatcac aacagcctgg aaagacac 4188 // ID HATN2_AG repbase; DNA; ANG; 365 BP. XX AC . XX DT 12-MAR-2003 (Rel. 8.02, Created) DT 12-MAR-2003 (Rel. 8.02, Last updated, Version 1) XX DE HATN2_AG is a hAT-like nonautonomous DNA transposon - a consensus DE sequence. XX KW hAT; DNA transposon; Transposable Element; Nonautonomous; KW 8-bp TSD; HATN2_AG; nonautonomous DNA transposon; KW hAT superfamily. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-365 RA Kapitonov V.V. and Jurka J.; RT "HATN2_AG: a family of nonautonomous hAT-like DNA transposons RT from African malaria mosquito."; RL Repbase Reports 3(2), 19-19 (2003). XX DR [1] (Consensus) XX CC HATN2_AG is a family of nonautonomous DNA transposons that CC belongs CC to the hAT superfamily. 
One subfamily of HATN2_AG was formed as CC a result of transpositions of the P4_AG transposon whose copies CC harbor HATN2_AG. CC HATN2_AG elements are flanked by 8-bp target site duplications. CC HATN2_AG has 15-bp terminal inverted repeats. XX SQ Sequence 365 BP; 73 A; 94 C; 120 G; 78 T; 0 other; cagtggcgga ttaagggtat cgggggccct aggcggtaag acgagttgag gccccctgta 60 aatggtaaat ggtgttgggg gggtggtagc ggcactaaac ttgctgttgg gagattgaac 120 agtaaacttg aaatgccgtc gggggccctt cgatcatcag tcacaggctc taaaattttt 180 gctgacgggg gggggggggt gcaaatgatt cgccagcctc ggggccccaa cgtccatcct 240 tccgggggcc cttgtcgcta gtactctgtc catacggcat actatagaat ggcgatctac 300 ggggccccta aatcggcggg gccccaggcg accgcctagt ccgcctaccg ttagatccgc 360 cactg 365 // ID GYPSY11-I_AG repbase; DNA; ANG; 5781 BP. XX AC . XX DT 03-OCT-2003 (Rel. 8.09, Created) DT 21-JUL-2005 (Rel. 10.08, Last updated, Version 2) XX DE GYPSY11-I_AG is an internal portion of retrotransposon GYPSY11_AG DE - a consensus sequence. XX KW Gypsy; LTR Retrotransposon; Transposable Element; 4-bp TSD gag; KW AP protease; GYPSY11-I_AG; GYPSY11-LTR_AG; Gypsy clade; RNase-H; KW integrase GYPSY11_AG; mdg1 lineage; reverse transcriptase. XX NM GYPSY11-I_AG. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-5781 RA Tubio M.J., Costas C.J. and Naveira F.H.; RT "GYPSY11_AG, a member of the mdg1 lineage of the Ty3/gypsy group RT of LTR retrotransposons."; RL Repbase Reports 3(9), 164-164 (2003). XX DR [1] (Consensus) XX CC GYPSY11_AG is a family of Gypsy-like LTR retrotransposons CC that, according to the amino acid sequence of its ORF2, is CC phylogenetically grouped with Drosophila representatives CC of the mdg1 lineage. CC GYPSY9_AG, GYPSY10_AG, GYPSY12_AG, GYPSY13_AG, GYPSY14_AG, CC GYPSY15_AG, GYPSY16_AG, GYPSY17_AG, and GYPSY8_AG are other CC members of this same lineage in Anopheles gambiae. 
CC The GYPSY11-I_AG consensus was reconstructed after multiple CC alignment of 7 copies. CC The consensus encodes the 445-aa GYPSY11_AG1p gag-like protein CC (pos. 772-2106) and the 2122-aa GYPSY11_AG2p (pos. 2067-5732). CC The sequence of the LTRs flanking GYPSY11-I_AG is deposited as CC GYPSY11-LTR_AG. XX FH Key Location/Qualifiers FT CDS 772..2106 FT /product="GYPSY11_AG1p" FT /note="gag-like protein" FT /translation="AATTTNVFKKVPYTCNFYGRSCINMQKLLEKIEILDR FT TYDQVRQLNKCYRLCALTTLRNNTKELYDEIQELLRKHESSIKDEILTTLV FT KKSRHLYYEINKCIKIHFERHPDSLNTTLSENQFDITIETKSDKMADIIEL FT IKITTSLISKYDGNEKDLKGVVSNLNVLKKIVKPENRETIIELVLGRLTGK FT ARIVVGEAPKSIEDIVNKLQDRCSIKVTPEIVVSKMDNTKQTGTIEDFGSI FT IEKLTQQLEEAYIAEEITPEVARKKATKSGISALSYGLKDGETKIIMRSSK FT FETLHEAIEQAVKLELEDRTKKGKNEQTKILYSNATRNNRGYGNNYQGRNN FT YNRFTNNNNNRYQTQNPPRFPPARYGHNNNRNNNNYRNNNFNNTRNQHANR FT QNNSNRNQYVQSNQRNNSNLQNNRAPIHNTVTAEEQNNFLGQPQASENTQY FT " FT CDS 2067..5732 FT /product="GYPSY11_AG2p" FT /translation="FFRATSSIGKYPILTINPDADNFVKVKIEITKEIYST FT LIIDTGATVSVLKASKLKPGCKINTSKKLTLISSSDHESETLGTAMTTIHF FT GDYSIMHEFHIIEDVESIFSDGLLGKDFIKHRCIVDYVNWMIYFSSDNGLI FT SHPIEDNVNGNYILPKRSEVVRKISIPNLTEDSIILSQEIQPGVFCGNTIV FT SKRNQYIKFINTTDKDVSFNIKSYTPEVEPLREYEQLQKKLDTSKERIQKI FT HNKIHIENIPQIAREELENLITKFSDIFCLEDEPVSTNNFYTQEISLKDNI FT PSYIPNYKQIHSQTEEMQSQVEKMLKNNIIEHSVSSYNSPILLVPKKSGEG FT KKKWRLVVDFRQLNKKILPDKFPLPRIDTILDQLGRAKYFSTLDLMSGFHQ FT IKLDKNSRKYTAFSTPTGHYQFTRMPFGLNISPNSFQRMMAIAMAGLTPEL FT AFVYIDDIIVTGCSARHHISNLGKVFDRLRKYNLKLNAEKCCFFKTEVTYL FT GHKITDKGIYPDDAKFDTIKNFPIPTNADEARRFVAFCNYYRKFVQNFAKI FT AKPINNLIKKDVKFAWTSECQAAFDTLKQSLLSPTILQYPDFKKQFIITTD FT ASDMACGAVLSQITDGNDLPVAFASKSFTPGEKNKPIIEKELTAIHWAINY FT FKPYVYGQKFIVRTDHRPLAYLFGMKNPTSKLTRMRLDLEEFDFEIEYLAG FT KANVAADALSRIILNSDDLKASIPKSKTILMVNTRAMVKKNNEKTDINKDE FT PIATTGTDHPAMWKTDRPLEVRKVLKIGTQRIKNNVEFIIYNHSYSKALGK FT FLLRNDVNGSQALEFALLEMCKIAKQYGRNKLAWSEEDHLFEEYSQQTIKE FT IANRAITKFEIILFTPTRWITTEKDRLRIISDYHMTPSGGHIGQYRLYQKI FT REKYKWKNMKDDIKKYVRNCKACIVNKTTRHTKEKTVVTTTPTKPFNIISI 
FT DTVGPLTKTNKNNRYAITIQCDLTKYIVVIPIHNKEANTIAKALVENFILT FT FGTFIELKSDQGLEYNNEILHKISEILKIKQTFSTAYHPQTIGSLERNHRC FT LNEYLRSYTNEHHDDWDDWTKFYEFVYNTTEHSDTNYTPYELVFGRKANLP FT QDIFKTKIEPVYNIDQYYFEMKYKLQKSNEIARENLIKAKIKRQQTLNKDT FT VPLIINLGDQVYLENENRKKLDPVYIGPFTVVSDQGPNCVIQNNTTKKIST FT VHKNRLIKYTGE" XX SQ Sequence 5781 BP; 2420 A; 980 C; 980 G; 1401 T; 0 other; tggcgaccgt gacttttaaa ctgtaatctt cggatgtgca aaaaaaaagt gacgaatgaa 60 acctcaaaca cggacaaaaa gtgcaaagtg gaaacgtttt ataaaaatcg caagtgcttc 120 tgaaatgact tggaaagtga ttaattacca acaatagtga actacgagtg aaaaccaatc 180 attttcaaaa tgggaggcaa ggctgcaaaa cctgaaacaa atataaaagg agaccatgat 240 ctcacaatag ttcaaactca gaatattcat acagaatatc atctgactca ggatttaaaa 300 ctaaacatta ttttagggct gctaatcacc ctgtgcattg ttaaaatagc gaaaacttgt 360 tacaaacacc ttcgtaacca agcgcaaaaa cacgctttaa aagtgcttac gctaccaaag 420 tagcaacgta aacattgaat cgagaacagt gaatgaatga tatacgaaaa aggtgatatg 480 ctatttgacc cacaaaaata ggtaaaggct gtgggaaagt accgtaagtt caccaatgca 540 gctatggacc gcgattatga aaaacaatta tgcgcgtatg agaacctacc tcaacgtgta 600 aaaacactgt gaccagcggt atggtttgga acagccgttc ccaacgcgga agacgagaag 660 aataaggcaa gaatggacgg ttgctcactg tcggacagac aggtgaaaga agagatccct 720 gggccaaggg acagcagttc accgagcacc gggaacaacg tagcaccgtg agcagcaaca 780 acaacaaatg tatttaaaaa ggtaccgtac acatgcaatt tttacggacg aagctgcata 840 aatatgcaaa agctgttaga aaaaatagaa atactagata gaacgtatga tcaggttaga 900 cagctaaaca aatgctatag gctctgcgcg ttaaccacac taagaaataa cactaaggaa 960 ttatatgacg aaatacaaga gcttctacga aagcacgaat catccattaa agacgaaata 1020 ttaacaacct tagttaaaaa aagtagacac ctatattacg aaataaataa gtgcattaaa 1080 atacatttcg aaagacatcc agattcgtta aatacaacat tatcagaaaa ccagttcgac 1140 ataacgatag aaactaaatc tgacaaaatg gctgacatta tagaactaat taaaatcacc 1200 acttctctca tatcaaagta tgatggtaat gagaaagatt taaaaggtgt ggtgtcaaat 1260 ttaaatgtat taaagaaaat agtgaagccg gagaataggg aaacaataat agagctagta 1320 ttaggacgtc tgacaggtaa agcgcgaatt gttgtaggag aagccccaaa atcaatagaa 1380 gatatagtta acaaactaca 
agacagatgc agcataaagg taacaccaga aatagtagta 1440 tccaaaatgg ataatacgaa acagacagga acaatagaag atttcggaag cattatagaa 1500 aaactaacgc agcaacttga agaagcatac atagcggaag aaataacacc agaagtagcc 1560 aggaaaaaag caactaagtc tggaattagc gcattgagtt atggacttaa ggatggcgaa 1620 accaaaataa taatgagatc aagcaaattt gaaaccctgc atgaagcaat agaacaagca 1680 gtaaaattgg aactagaaga cagaacgaaa aagggaaaga atgaacagac aaaaatttta 1740 tattcaaacg ctactaggaa caatagaggg tatggtaaca actaccaggg aaggaacaat 1800 tacaatagat tcacaaataa taataataat aggtatcaga cacaaaaccc acccaggttc 1860 ccacccgcaa gatatggaca taacaataac cgaaacaata ataactacag aaataacaac 1920 tttaacaaca ctagaaatca gcacgcaaat cgacaaaata attccaacag aaatcagtac 1980 gtacaatcca atcagagaaa taatagcaat ttgcaaaata atcgagcgcc tattcataac 2040 acagtaacag ccgaagaaca gaataatttt ttagggcaac ctcaagcatc ggaaaatacc 2100 caatactaac cataaatcct gatgcagata attttgttaa agtaaaaata gaaattacaa 2160 aggaaatcta tagcacactc atcatagata ccggagcaac cgtatccgta cttaaagcta 2220 gtaaattaaa accaggttgt aaaatcaata catcaaaaaa attaaccttg ataagctcta 2280 gtgatcatga atcagagact ttaggaactg ctatgacaac aattcacttt ggcgattatt 2340 ccattatgca cgaatttcat ataatagaag atgtagaatc cattttttcc gacggactat 2400 taggaaaaga ctttataaag cacagatgta ttgttgatta tgttaattgg atgatatact 2460 tctcatctga taacggattg atttcacacc caatagaaga caatgtaaac ggaaattata 2520 ttttaccaaa acgaagtgaa gtagtacgaa aaataagtat accaaacttg acagaagatt 2580 caatcatctt atcacaagaa atccaaccag gggtattttg cggaaacaca atagtctcaa 2640 aacgtaatca gtatatcaaa ttcattaata ccacagataa agatgtttct tttaacataa 2700 aatcttatac accagaagtt gaaccattaa gagagtatga gcaattacag aagaaacttg 2760 acacatctaa ggaacgaatt cagaaaattc ataacaaaat ccatatagaa aatattccac 2820 aaatagcaag agaagagtta gaaaatctta tcacaaaatt ctcggatata ttttgtttag 2880 aagatgaacc ggtctctact aacaattttt atacccagga aatttcatta aaagataata 2940 ttccttctta tataccaaat tataaacaaa tacattcaca aacagaggaa atgcaatcac 3000 aggtagaaaa gatgttgaaa aataacatta tagaacattc tgtttcatca tataattcac 3060 cgatactatt agtaccgaag aagtcaggtg 
aaggaaaaaa gaaatggcgt ttagtagtgg 3120 attttcggca attgaacaag aaaattttac cagacaaatt ccctttacct cgcatagaca 3180 cgatactaga tcagctagga agagccaaat atttcagcac attggatttg atgtcagggt 3240 ttcatcaaat caaacttgat aaaaattcga gaaaatacac agctttttcg acacctacag 3300 gccactatca gtttacaaga atgccatttg gactcaacat tagcccaaac agctttcaaa 3360 gaatgatggc tatcgctatg gctggtttaa caccagagct agcatttgta tatatagatg 3420 atattatagt tactggatgc agtgcacggc atcatatcag taatttaggt aaagtttttg 3480 ataggctaag aaaatataac cttaaactaa atgcagaaaa atgttgtttc ttcaaaacag 3540 aagtaacgta cttaggtcat aaaataacag ataaaggaat ttatccggac gacgcgaagt 3600 ttgacacaat taaaaacttc ccgattccta ctaatgctga tgaagcaaga cgttttgtcg 3660 cattttgtaa ttattaccgt aaatttgtac agaattttgc taagatagct aaacctatta 3720 ataatttgat taagaaagac gttaagtttg catggacttc agaatgtcaa gcagcttttg 3780 atacattgaa acaaagctta ctctcaccca caattttaca atatccagat tttaaaaaac 3840 aattcattat tacgacagat gcatcggata tggcatgtgg tgcagtgtta tcacaaataa 3900 cagatggaaa cgatttgcca gtcgcatttg cgagtaaaag ttttacacca ggggaaaaga 3960 ataagccaat tatcgagaaa gagcttacag ctatacattg ggcaattaat tattttaaac 4020 cttatgtata tggacaaaaa tttatagtta gaacagatca tagaccatta gcatacttat 4080 ttggtatgaa aaatcctact tctaagctta ctagaatgag actagattta gaagaatttg 4140 actttgaaat agaatattta gcaggtaaag ctaatgttgc ggcagacgca ctatcaagaa 4200 taatccttaa ctcagatgac ctaaaagcat caataccaaa atccaaaacg attttaatgg 4260 ttaatacaag agccatggtt aagaaaaata acgagaaaac tgatataaac aaagacgaac 4320 caatcgcaac aacagggact gatcaccccg cgatgtggaa aacagataga cctttagaag 4380 tgagaaaggt actaaaaata ggtacgcaga gaattaagaa caacgttgaa ttcataatat 4440 acaaccattc atacagtaaa gcactaggaa aatttctttt gagaaatgat gtaaatggaa 4500 gtcaagcatt agagtttgct cttctagaaa tgtgcaaaat cgcgaaacaa tatggaagaa 4560 ataagctagc atggtcagaa gaagatcatt tattcgaaga atattcccaa caaactatta 4620 aggaaatagc caacagagcc attaccaagt ttgaaataat cctgtttact ccaactagat 4680 ggataacaac agaaaaagat aggctgagaa taatttcaga ttatcatatg accccttcgg 4740 gaggacatat aggccagtac agactgtacc agaaaataag 
ggaaaaatat aaatggaaaa 4800 atatgaaaga tgatatcaag aaatacgtac gaaattgtaa agcatgcata gttaataaga 4860 cgactagaca tactaaagaa aaaacagttg taactacaac accgacaaaa ccatttaaca 4920 taatttcaat cgacacagta ggacctctaa caaaaactaa caaaaacaac aggtatgcaa 4980 taaccataca atgtgactta acgaaatata tcgtagtaat acctatccat aacaaagaag 5040 caaacactat agccaaggca ttggtagaaa acttcatcct tacattcgga acatttatag 5100 aattaaaatc agatcaagga ctagaatata acaatgaaat attacacaaa atctcagaaa 5160 ttttaaaaat caaacaaact tttagcacag cttaccaccc acagacaata ggatcattag 5220 agagaaatca tagatgtcta aacgaatacc taagaagcta tacaaacgaa catcatgatg 5280 actgggacga ttggacaaaa ttttacgaat ttgtttacaa tacaacagaa cactcagaca 5340 caaactacac accatacgaa ctagtatttg gtagaaaagc gaatttacca caagatatat 5400 ttaaaacaaa aattgaacca gtttataata ttgaccaata ttatttcgaa atgaaatata 5460 aactccaaaa atcaaacgaa attgccagag aaaatttgat aaaagcaaaa attaaaagac 5520 agcaaacctt aaataaagat acagtaccac ttattataaa cttaggagat caggtatatt 5580 tggaaaatga aaataggaaa aaattagatc cagtctacat tggacctttc acagtagtaa 5640 gtgaccaagg gcctaattgc gtaatacaaa ataatacaac aaagaaaatc tctacagtac 5700 acaaaaatag attaattaag tacacaggag aataacttca atcattgtaa ttcattacgt 5760 tattctatta aaggggggag g 5781 // ID GYPSY15-LTR_AG repbase; DNA; ANG; 794 BP. XX AC . XX DT 03-OCT-2003 (Rel. 8.09, Created) DT 03-OCT-2003 (Rel. 8.09, Last updated, Version 1) XX DE GYPSY15-LTR_AG is an LTR of retrotransposon GYPSY15_AG - a DE consensus sequence. XX KW Gypsy; LTR Retrotransposon; Transposable Element; KW 4-bp TSD GYPSY15_AG; GYPSY15-I_AG; GYPSY15-LTR_AG; Gypsy clade; KW mdg1 lineage. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-794 RA Tubio M.J., Costas C.J. and Naveira F.H.; RT "GYPSY15_AG, a member of the mdg1 lineage of the Ty3/gypsy group RT of LTR retrotransposons in Anopheles gambiae."; RL Repbase Reports 3(9), 172-172 (2003). 
XX DR [1] (Consensus) XX CC GYPSY15-LTR_AG is a long terminal repeat of GYPSY15_AG CC (its internal portion is deposited as GYPSY15-I_AG). XX SQ Sequence 794 BP; 287 A; 201 C; 125 G; 181 T; 0 other; tgtagcatat ctgctataca gaacatattg taatacacac tatcagccat agacacatag 60 taacacactt tggaaagcca aatcacacca ttgtaacaca ttcattataa cacactcagt 120 aaatcagatc ataccataca cacccggaag agctcaggcc acaccgttca tataaaaaca 180 attgtgacac acttaacaga accaaccgaa acgtacaaca taagcaggaa cagtttatgc 240 tttataaccc aaagcaggaa cggtatatgc attaaaccca aaacaggaac ttagtgacaa 300 gaaccatcgt tcttgtcaac aaaagacaaa cgtaactagc ttagtgaaaa caataacttt 360 tcaacataca acccggaacc aaatgtgaga aacccttaac ttcttttgag tataaataaa 420 accaactccg atcatggcag gtcagattcg ttcggactgt caggatagga ctatgcccat 480 cctacatcta tcgagttctc atttaaagag tttagatatg tccccctgcc caaggtaagg 540 cctctaaact ccaacgtttc cagtaaggga aacctcctaa aggtttctat gatcgcctgg 600 acgatcaagt gtcgaaagtc cgcttcgaaa cagtatccac ctcagttcta cttaattcgg 660 caaacgatga cgaaggccac cctgaccaac gcgagttcaa agttactaca tggcaaccgt 720 cccaaaatcg aacatactgt gactatctca aaacatacca catggcgacc gtaacagtca 780 tcaaacatat taca 794 // ID Clu-168_AG repbase; DNA; ANG; 1412 BP. XX AC . XX DT 03-SEP-2010 (Rel. 15.09, Created) DT 03-SEP-2010 (Rel. 15.09, Last updated, Version -1) XX DE Putative non-autonomous DNA transposon: consensus. XX KW DNA transposon; Transposable Element; nonautonomous; Clu-168_AG. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-1412 RA Fernandez Medina R.D., Struchiner C.J. and Ribeiro J.M.C.; RT "Transposable elements from Anopheles gambiae."; RL Repbase Reports 10(9), 1453-1453 (2010). XX DR [1] (Consensus) XX CC TA TSD. >92% identical to consensus. 
XX SQ Sequence 1412 BP; 477 A; 256 C; 251 G; 428 T; 0 other; tagggtaagt gtaccaatta tggctataat atgtcatcaa gataccgatt ttacgcctgt 60 aaaaataagt atgggtaaaa tttcaaactt ctttatgttt tcttgtagcc tacaatgtgt 120 agaacacgag ggtatttttt attttgactc agtgtttatc gaaatataga aaatttcaat 180 taaaattgca tctcaaggta ccaattatgg ctatagtgtt cctagcaatt cgccatagct 240 gtaccaattg tggctgtatg ctaaaactcg tggatattta caccaaaaat tactttgtga 300 cattgtttta actgaaatca agtaataata agttcactgc accaatataa gtatgtaacc 360 accattaaat acgtaaaatt aacattttaa aatatccgtt ttctatcact ggttgtattg 420 atatccacgc tcaattaatc gcacaattga tgaggctaat cctaaaaaaa tactttcgta 480 cacctttctg atgacaccgt gcttggacac acacctgaca atccttcata tcacctggaa 540 gctataccga agccagctac cgtttcagcg cgatacgaag aggagcacaa atcagggtgg 600 tctgtttgaa tttgtttaaa tgaagaatcg tagccatagt tcaactatct tctggccaag 660 atacttccgg ccaagactag cactaactat cgatcaggga agaacaagag cgcatttggc 720 aataaggcga cacgagcggc ttcagatttc gctttcttcg tatgaaagtc attcaagact 780 gcattatttt ggtgaaactg attcgatttc attccactga ctattccact gattactgat 840 tcgaacactt tacaaataaa tgatgcttaa aacacatatt aacactttag attgcgaaaa 900 taacaacgag aaacccgagt tgttcacttg taaactacgc gctacgacca aaacgacagt 960 tgtcaacaaa gcatccatag catactgcga tatttggtac accgaaggtt agttgtacca 1020 attatggacg tacagaaaaa aggcatattt tcgatgaaat tttggtagat tttgatgtgt 1080 agtcgttaaa taactaatat tttgcaccaa aatgatgatt ttgagataaa attatggtta 1140 ggtgcttgaa aatcgctaat atcctacagt gctgctctta gcaaaatggt aagagatacc 1200 aaaaatcttg gcaacacatt ttaaaagggt gatatttaac aaacggaagt gaccagcacg 1260 gaactaatga agcacaaatg tgttaaaatg ttcataattg aaatgaaaaa tccttccgtg 1320 gttttagaat gcaaatagac tgttttagag cgaaaatagt gttcgttata gccataattg 1380 gtgctatagc cacaattggt acacttaccc ta 1412 // ID GYPSY36-I_AG repbase; DNA; ANG; 5414 BP. XX AC . XX DT 16-APR-2004 (Rel. 9.03, Created) DT 16-APR-2004 (Rel. 9.03, Last updated, Version 1) XX DE GYPSY36-I_AG is an internal portion of retrotransposon GYPSY36_AG DE - a consensus sequence. 
XX KW Gypsy; LTR Retrotransposon; Transposable Element; 4-bp TSD gag; KW AP protease; GYPSY36-I_AG; GYPSY36-LTR_AG; Gypsy clade; KW MDG3 lineage; RNase-H; reverse transcriptase; KW integrase GYPSY36_AG. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-5414 RA Tubio M.J., Costas C.J. and Naveira F.H.; RT "GYPSY36_AG, a member of the MDG3 lineage of the Ty3/gypsy group RT of LTR retrotransposons in Anopheles gambiae."; RL Repbase Reports 4(3), 64-64 (2004). XX DR [1] (Consensus) XX CC GYPSY36_AG is a family of Gypsy-like LTR retrotransposons CC that, according to the amino acid sequence of its Reverse CC Transcriptase CC RNase and Integrase is CC phylogenetically grouped with representatives of the MDG3 CC lineage of other organisms. CC GYPSY29_AG, GYPSY30_AG, GYPSY31_AG, GYPSY32_AG, GYPSY33_AG, CC GYPSY34_AG, CC GYPSY35_AG, GYPSY37_AG and GYPSY38_AG are other CC members of this same lineage in Anopheles gambiae. CC The GYPSY36-I_AG consensus was reconstructed after multiple CC alignment of 5 copies. CC The consensus encodes the 1423-aa GYPSY36_AGp gag-pol like CC polyprotein CC (pos. 1086-5354). CC The sequence of the LTRs flanking GYPSY36-I_AG is deposited as CC GYPSY36-LTR_AG. 
XX FH Key Location/Qualifiers FT CDS 1086..5354 FT /product="GYPSY36_AGp" FT /translation="MERFLSEYTRAGLVRMCETAGLALSGTKEELAKRLLE FT AGVTSDGQREENIGEINASGYFDVDADSETGSNGATAAGGAMAVTAESRAI FT VTEVVSQKMDVPRSIETHAPFVQPYSFRDVEEGIDPYGNDLTKDIGAWFDD FT FEGVANMAHWTDEQRFIMCRRKMVGVARSFLSTERNIISYAALRSKLMKEF FT GEIVRSSDVHRRLMSRVKRPQETMLEYVYEMQRIARDTEIDVESIIEYIVD FT GVACDTKVRASLYRARSMAELKEELLRMERAEKKSVVKRNNGQQDQRNGGV FT GLRKCYICGELGHEAQKCVKSTKCFECGRSGHRAKDCSIKKRLVADTRAIV FT PSTNERSPVKCFKCGGVGHIARNCNKNSYSVMGGQIQESNKELPTKQVSVC FT GKEYAALIDTGSEVSLMREDVFAGLPPQFKQWKQSNHILRGLGGVPQNGFG FT EATLAVNIDNWVYQVQWVLVPYEAIRTPLIVGMDFLHSVNYNINNETVMIE FT PRRNEHVLGIQRNEEEHERQQHVNNILSEVMNTMCASETMEPVIPPQFRGE FT IMTMIETYNNTSCSERKETCPVKLEIVPDGQIMPFRHPPSRLSFSESEAVD FT AQVDEWLKEGIVRPSVSNFASRIVIVKKKDGSNRVCVDYRKLNSMILKDGF FT PIPVIDDVVQKLQCAQWFTVMDLENGFFHVPVAENSKKYTAFATRRGLYEF FT NRAPFGLCNSPAVFIRYVNHVFRDLITNNLLDLYMDDMVIHGSTERECLWK FT TERVLKVAAEHGLNIKWKKCQFMQREITFLGHRVKTGQISPSVEKINAVKH FT FRIPQNVKAVQAFLGLTGFFRKFVKDYSKIARPLTDLLKKDTSFEITGSAL FT EAFKRLKEELIKEPVLRLFDPEAKTELHTDASKAGFGATLLQWVEGKLHPV FT YFWSKKTSEPESNKHSYVLEAKAVFLAVKKFRQYLLGRPFKLVTDCAAFKC FT TLKKSEVPQEVLPWVMFLQDFVFEVEHRSGKKMQHVDCLSRYPTEVMTVTN FT ELTARIRKNQQQDEMVKAISEILLDRPYGSYKLKGGLLYTVVDGNELLVIP FT RNMRKQIIENAHNDGHYGAQRTMHTIRQKFWIPHLEVLVKQHISNCVKCIL FT HNKKLGRQEGCLNPIDKGDAPLRTLHLDHVGPMDATSKQYRYILTVVDGFS FT KFVWLYPTRTTNAEEVLQKLESWSSIFGYPTRVITDRGAAFTAKVFAEFVQ FT KQNIEHIVSTTGVPRGNGQAERINRTVLSVLGKLSMGDTAKWYKQVNRVQR FT SVNGHLNSSTGRSPFELMFGVRMRQASDDLLHEMLEKEWYEEYERDRQGMR FT QEARKEIELAQKRYKEQFDKKRKPEHGYKIGDLVAIKRTQFVAGRKLANEF FT LGPYEITKVKRNGRYDVKKAASCEGPQVTSTSDDNIKLWSYATIESSDEDE FT LDEDQELKE" XX SQ Sequence 5414 BP; 1577 A; 937 C; 1625 G; 1275 T; 0 other; tttggtgtca gaagtgggat cacgacgaac gcgctttggg acaagtgtga atttcctgga 60 aattcgattg cgatcgacgt tgtgtggagt gagacagtga ggcatcaaag cgctgggtgc 120 gctggacgga gcacaaaaga acagtgtacg agtgtagcgg cgtttcgtac gggtgttcgg 180 tgcgatcagg cgctgggtgc gcaagttttg gcggtgtgaa agaaggcgct gtgtgcgcga 240 gtagttgtgc gtgtttatgc 
gttggtgcgc tgtgtgcgcg tgaagttgcg gcttgcaaca 300 aaaaggcgct gggtgcgcga gaagctgtgt gcgttggtgc gctgtgtgcg cgggaagttg 360 cggcttgcaa cgaaaaggcg ctgggtgcgc gagaagctgt gtgcgttggt gcgctgtgtg 420 cgcgtgaagt tgcggcttgc aacgaaaagg cgctgggtgc gcgagaagct gtgtgcgttt 480 ttttttgcgc tgggtgcgcg agtgtgctag gtgaaaaaag gattagcgct gtgtgcgcgt 540 gaagttgcgg cttgcaacaa aaaggcgctg ggtgcgcgag aagctgtgtg cgttggtgcg 600 ctgtgtgcgc gtgaagttgc ggcttgcaac gaaaaggcgc tgggtgcgcg agaagctgtg 660 tgcgtttttt ttgcgccggg tgcgcgagtg tgctaggtga aaaaaggatt agcgctgtgt 720 gcgcggtgaa gttgcagtta gcatcgaaaa aggcgcgggg tgcgcgagaa gctgtgtgcg 780 tttgtgcgct gtgtgcgcgt gaagttgcgg tttgcaacag aaaggcgctg tgtgcgcgag 840 tggctttcgg cgtttctgca ctgggttcgc aagttaagcg gtaggcaagg agattggtgc 900 ggtgggcccg aaaagctgcg gtttgaaaga ggtgtgctgc ataactaggc gtcgtgcttg 960 taagtgctgg gtgcctgagg agcagacgct gtgcgtgtgt gtgtgtgagg aactaacgct 1020 gtgcgtttgg ggtgcgaaaa agagttgcgt tttctagcat tttggcgcgt tatatacaaa 1080 aaaaaatgga acgttttcta agcgaatata cgcgtgcggg attggttcgg atgtgcgaaa 1140 cggcgggact agcactgtcc ggaaccaagg aagaattagc taagcgttta ttagaagcag 1200 gggtaacaag cgacggccag cgcgaggaaa atatcggcga aataaacgcg tctggatatt 1260 ttgacgtgga tgcggattca gaaactggca gcaatggtgc gactgcggcg ggtggtgcta 1320 tggctgttac cgcggaatcg agagcaattg tgacagaagt ggtgtcacaa aagatggatg 1380 tgcctagaag tatagaaaca catgcaccgt ttgtgcagcc ctattcgttc cgcgacgtgg 1440 aagaaggcat tgacccttac ggcaatgacc ttaccaaaga cattggggca tggtttgacg 1500 atttcgaggg agtggccaac atggcgcact ggacagacga gcaaagattt attatgtgcc 1560 ggagaaaaat ggtgggagtt gcgcggagtt ttctctcgac tgaaagaaac ataatatcgt 1620 acgcagcttt gcgatcaaag ctgatgaagg aatttggtga aatagtgcgg tcgagtgatg 1680 tgcataggcg gcttatgtca cgggtaaaac gtccccagga aacaatgctc gaatacgtgt 1740 atgagatgca gcgtatagct cgagatacgg aaatcgatgt tgagagcatc atcgagtata 1800 tcgtggacgg tgtggcgtgc gacacaaaag tgcgtgcatc cctttatcga gcgcgcagta 1860 tggccgagct gaaagaagag ctgttgcgca tggagagagc ggagaagaaa agtgttgtga 1920 agaggaacaa tggtcaacaa gatcaaagga atggtggggt 
aggtttgcgg aagtgttata 1980 tttgcgggga gttagggcat gaagcacaaa agtgtgttaa gtcaacaaaa tgttttgaat 2040 gcggtagatc ggggcatcgg gcgaaggatt gttctatcaa gaagcgactg gtagctgaca 2100 cgcgcgctat tgtaccgagt acaaatgagc gatctccggt aaagtgtttc aaatgtggag 2160 gcgtggggca tatcgctcga aactgtaata aaaatagcta cagcgtgatg ggaggtcaga 2220 tccaggaatc aaataaagaa ttaccaacaa aacaagtgag cgtgtgcggg aaggaatatg 2280 cggctctgat cgacaccggg agcgaggtat cactcatgcg tgaagatgtt ttcgcgggac 2340 ttccaccgca gttcaaacaa tggaaacaat cgaaccacat tttgagaggg ctaggtggag 2400 tgccccaaaa tggattcggc gaagccacac ttgccgtaaa catagacaat tgggtatatc 2460 aggtgcagtg ggttctggta ccatacgagg cgattagaac accccttatc gtaggtatgg 2520 attttttgca ttctgttaat tacaatatta acaacgaaac ggtcatgatc gagccccgaa 2580 gaaacgagca tgtgttgggg atacagagaa atgaagaaga acacgaacga caacagcatg 2640 tgaacaatat tttaagcgaa gtgatgaaca caatgtgcgc cagcgagaca atggaaccgg 2700 ttatcccgcc tcaatttcgg ggcgagataa tgacaatgat cgagacatac aataatacca 2760 gctgcagtga gagaaaagaa acgtgtccag ttaagttgga aattgttcca gacggacaga 2820 taatgccgtt ccgccaccca ccaagcagat tgtctttttc ggaatctgaa gcagtggatg 2880 cgcaagtgga tgagtggctt aaagagggta ttgttaggcc ttctgtgtcg aattttgcga 2940 gccgcatagt tatagtgaag aaaaaggacg ggagcaaccg ggtgtgtgtc gattatcgga 3000 aattgaattc gatgattttg aaagacggat tcccgatacc agtaatcgat gatgtcgtgc 3060 aaaaactaca atgtgcacaa tggttcacgg ttatggacct ggagaatgga tttttccatg 3120 tgccggtagc agagaatagt aaaaaatata cagcgtttgc aacgagacga gggttgtacg 3180 agtttaatcg cgcgccgttt ggtttgtgta attcgcccgc agtgtttatc cggtatgtta 3240 atcatgtgtt tcgggatctt attacaaaca acctactgga tttgtacatg gatgacatgg 3300 ttatccatgg ctcaaccgaa cgagaatgtt tatggaaaac ggaaagagtg ttaaaagtgg 3360 cagcagagca cggtttgaac attaagtgga aaaagtgcca gttcatgcaa agggaaataa 3420 cgtttttggg tcatcgggtg aaaactggac agatatcgcc aagtgtggaa aaaataaatg 3480 cagtgaaaca ttttcgtatt ccgcagaacg ttaaagcagt gcaagcgttt ctgggattaa 3540 cggggttttt ccggaagttt gtaaaagact attcgaaaat cgccaggccg cttacggatt 3600 tgttgaagaa agatacatct ttcgaaataa cgggaagtgc attggaagcc 
ttcaaacggc 3660 taaaagagga attaataaaa gagccagtgt taaggttatt tgatcccgaa gcaaaaacgg 3720 aacttcatac ggacgcttca aaagcagggt ttggtgcaac attgttacag tgggtagaag 3780 gaaaattgca tccagtgtat ttttggagta aaaagacaag tgaaccggaa tccaacaaac 3840 atagctacgt gctcgaggca aaagcagtgt ttctagcggt taaaaagttt cggcagtatc 3900 ttttgggtcg tccttttaaa ttagttactg actgtgcagc gtttaagtgt acattgaaaa 3960 aatcggaagt gccgcaggaa gtgttaccat gggtgatgtt cctacaagat tttgtgtttg 4020 aagttgaaca ccgatcgggt aaaaaaatgc agcacgtgga ctgtttgagc cgatatccga 4080 cggaagtgat gactgtgaca aatgaactaa cggcacgtat tcgtaagaac caacaacagg 4140 acgagatggt gaaagcaatt agtgaaattc tattggatag gccctacgga tcgtataaat 4200 tgaaaggtgg acttttatac actgtagtgg atggaaacga actattagtg attccacgaa 4260 acatgaggaa gcagatcata gaaaacgcac ataacgacgg gcattatggg gcacagcgta 4320 ctatgcatac cataaggcag aagttttgga ttccacatct agaggtattg gtgaagcaac 4380 atatctcgaa ctgtgtgaag tgcatcttac acaacaaaaa gttaggacga caagagggat 4440 gcttgaaccc aatcgacaaa ggagatgcgc ctctacgcac attgcatttg gaccatgtgg 4500 ggcccatgga cgccacttca aaacaataca ggtatatttt aaccgtggtg gatgggtttt 4560 ctaaatttgt ttggctttat cctactagaa caacgaatgc ggaggaagtg ttgcaaaaac 4620 tggagagttg gtcgtcgata ttcggatatc cgacccgagt aataacggac agaggggcag 4680 cgttcacggc taaggtattt gccgaattcg ttcagaagca aaacatcgag catatcgtct 4740 ccactacggg agtaccacga ggaaacggcc aggcggaacg tataaatcgg acggttttat 4800 cagtgctggg aaaactatcg atgggagata ccgctaagtg gtataaacaa gtcaacagag 4860 ttcaacggtc ggtaaacggg caccttaatt cgtcaacagg gcgatctcca tttgaattaa 4920 tgtttggggt cagaatgaga caggcatctg atgatttatt acacgaaatg ctggaaaaag 4980 aatggtatga agaatacgaa cgggatagac agggtatgag acaagaggcc agaaaagaaa 5040 tagagttggc gcaaaagaga tataaggagc aatttgataa gaaaagaaag ccggaacacg 5100 ggtataagat tggtgatttg gtagcaataa agagaactca atttgtggca ggaagaaagt 5160 tagctaatga atttttgggt ccgtacgaaa taacgaaggt aaaaagaaat gggcggtatg 5220 acgttaaaaa ggcagctagt tgcgaagggc ctcaagttac gagtacgagt gacgacaata 5280 tcaaactatg gtcatatgct acgattgaat cgtctgacga agacgaatta gatgaagatc 
5340 aggaattgaa ggaataagga taagtaaact aggtgtagga catcgaggat cgaagtccat 5400 caggaagggc cgag 5414 // ID RTAg4 repbase; DNA; ANG; 7072 BP. XX AC AB090813; XX DT 14-SEP-2005 (Rel. 10.09, Created) DT 24-SEP-2010 (Rel. 15.1, Last updated, Version 2) XX DE Anopheles gambiae retrotransposon RTAg4 DNA, complete sequence. XX KW R1; Non-LTR Retrotransposon; Transposable Element; RTAg4. XX NM RTAg4. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-7072 RA Kojima K.K. and Fujiwara H.; RT "Evolution of target specificity in R1 clade non-LTR RT retrotransposons."; RL Mol Biol Evol 20(3), 351-361 (2003). XX DR EMBL/GenBank/DDBJ; AB090813; Positions 1 7072. XX FH Key Location/Qualifiers FT CDS 1147..3318 FT /product="RTAg4_1p" FT /translation="MSDEAQPQQVSAPYQLWPRKGSVVVMQPQPIERPATP FT MMELCYSSDDDELNSTIIAMPEPASECEAAEAAMDLEPPAAAQPTPTASPV FT PGNMVVAGPIDAGSCALLMAQLQNIGAQLTTALEELRLCREENAALRRENE FT LLLTGTRSVLELQTAANATLQQSSGQGGNRETARKRQQRLRRRERERQQQQ FT QQQQQQQQQQQQQQQQQRQQQQQCQQQRQQQPQQQQLQQPQQQLWTTVVRG FT RPSQRHRQPQQQQQQQQQQGERYVPPQLRQQRQQQQRPRQQQQQQQQQQQQ FT QGERYVPPQLRQQRQQQQHQQQQQQQQQQRQQQQRQQQRQQQQRQQQQQQQ FT QQQRQQQQRQQQQQQQQQHQQQQQQWQQQQQQQQQPRQSLPHRKQTQLQLS FT PRLQQQQQQQQQSQQQQQQQPQQLLWTTVVRSCPSQRQRQLQQQQQQQQQQ FT QQGERYVPPQLRQQRQQQQPQQQQQQRPQQQRPQQQRPQQQRSQQRKPAKP FT ELIEVSPNEGQDWESLLLLVQTAVRTDERYKPLKDHVVLGRRTSKALLRLT FT LSRKANAQYMLQQVPAIVGSAGVCRHVTEMASLVIHDVDPLAREDDLTSLI FT DSKFESGAGIVSTTMTKMADGTQRAYVRLPAMFVSELDGTKIKLGFCVSKV FT RAAPPTPRERVRCYRCLELGHWAHDCRSPDDRQNMCIRCGVVGHMAKVCTS FT QPKCLKCGGPHTIGHPDCARSALQ" FT CDS 3237..6614 FT /product="RTAg4_2p" FT /translation="MHFSAKVPQVRWSTHNWTPRLRPVGLAMTPQLRVMQV FT NLGRGERAQDIALQTAQEKRVDVLLLLELYRPPANNGRWAFDCSKKVAIVA FT TGSLPLQRIWCSNTPGLVAAEIGGTTFLSCYAPPRQPTDEFERFIEAVQLE FT TLTHSQVVIAGDFNAWHVEWGSERNSEKGEELLSAIQQLDLVVLNQGTTST FT FDGNGAATASIVDVAFATPTIAQPGTWNVCGDYSYSDHRYITYTVGTIVPV FT 
VNEPSSPRMRHQGRIRHADRRYKATQFSQRAFRARFSERAVSHERMVEIML FT ATCDKTMQRVTTSHSDPHRDLFWWTPLLRLLRENCDRARDRMRQTSDLQER FT SIAAAEHRTARAELGKAIKASKRNSFQELIDIAEENVFGAGYLVVLSHLRG FT GRTPPETERDRLEHIVSDLFPQHPPLVWPEAADIEGEEQPGAVADVSDDEL FT KLIARRMANKKAPGLDGIPNAAVKAAILEHTGVFTALYQDCLVNGTFPAAW FT KRQRLVLIPKPGKPSGVSCSYRPLCMLDALGKVLERLILNRLHEFLEDPES FT PRLSDRQYGFRRGCSTIGLIQRVVEAGQRAMSFGRANRRDKRFLLVAALDV FT RNAFNTASWQAIATALRTKRVPAGLQRIIHSYFQDRELVYETSEGPVVRSV FT TAGVPQGSILGPTLWNTMYDGVLDIALPPDAEILGYADDLVLLVPGTTPDN FT VKAAAEEAIISVMEWMARHHLELAPAKTEMVVISSTKAPTRITVRVGDVDV FT TSSRSIRYLGVTLQDKLSWLPHVKEVTERAGKIADATSRLLRNHSEPRASK FT AKLLASVSESVMRYAAPVWSKELQKREPGRLLERVQRKMALRVARAFRTVR FT YETATLLAGLTPICLLLDEDARVYQRLSAVNRTDTRANIRKQERQATIEQW FT QQQWDAEADTSRHTRWAHRVLPNIGSWQSRKHGDVSFHLCQVLSGHGFFRD FT YLCRNGFTSSPDCQRCSGVPETAEHAMFECPRFAEVRQQLLGEGITDPVRP FT ENLQQHLLRDAESWSRICEAAKRITASLQQAWDDERAALAAHGNEQHFEEV FT ADLEARRAEIRRARNDRRNASRRAARARQRELQRAGRPPSPPPSPRTAARR FT ADLRLRQARFRARRRQAI" XX SQ Sequence 7072 BP; 1751 A; 2015 C; 2113 G; 1193 T; 0 other; aatagtgtta ttcaatcgaa attcggtgaa tttttgcgaa attgaaaaat cgtgttttgt 60 gtatacattt gcccccccgg tagcaccaaa tttctgggtg ttaaaaaagt gcgcaaaaat 120 ttgtgtttca ggcgccggaa acacttgatt ttggataatg cttctaaagc gattaaaagt 180 gttttcgaga cgttttgtga agttttaagt gcataaaagt gggtgtttat atccccatac 240 aatttgtatg ggggactttg aacgtttata tcggagtgaa aaacgcacgg atctgtgccg 300 gatgtgtctt ggaaggtgcg cgcaatgacg cttaaggtta agtgacaaaa atcagaaccg 360 aaatcggtga agaaaccagt gaaaaaagtg aatttaaagt gtggaacaag ccgggtgaca 420 ggttacattt gctacccaag ggacaaaccc gcggtgacag ccgttaccca aggagcaatt 480 tttaaatttt cggccgagtg acggttggcc gaaaaactgt tagcggcgga aaattatgaa 540 attcttaagg gtggtagagg acgtgtccct gatcacggga aaaattattt cccccccctc 600 tccccctccc accgctcccc cctggggcta gaatagagta gggcgcccga taagcgtcag 660 ataaggtcga gatcgcccga aaattctggc tttcggcagg gcggtaaagt tggacccccg 720 cagtccggga aaattatttt ccccactccc accctccccc ctcccggtga aaaatttttg 780 aagtggcaag tcaaaaagag gtccaaaaag gttggaattc ggggttcgag ctgtttggac 840 
gtataaacgc accggggaaa ttcgggactg gatcggaata attccccacg cgtgagacgc 900 accatcccga attttacccc accccctccc aggggggctt aaattggacc ttcttcacat 960 taacggcaga gttgaagcgg aaatcgacag aaacgggtag cagaacacca gctaggcggc 1020 gaagcatccc aggggtcatc ccgacaccct ggatcattcg agccccgggg tcatttcgac 1080 ccccagggtt atttcgaccc ccaatgcaac ttgccaaagc aggtgtgagc aataacaagg 1140 caggggatga gtgacgaggc ccaaccccag caagtgagtg cgccgtacca gctgtggccc 1200 aggaaagggt cagtggttgt gatgcagcca caacccatcg agcgcccggc cacaccgatg 1260 atggagctgt gctactcaag cgacgacgat gagctgaaca gcaccatcat cgcgatgcca 1320 gaacccgcgt cggagtgtga ggcagcggag gcggccatgg acttggagcc accagcagca 1380 gcacagccaa caccaacggc atcccctgtc ccgggcaaca tggtagttgc tgggcccata 1440 gacgcgggaa gttgcgccct gctgatggca cagcttcaaa acatcggcgc ccagctaacg 1500 acggcgctgg aggagctgcg actttgccgc gaggaaaacg cggcacttcg tcgcgagaat 1560 gagttgctgc tcacgggcac tcgttcggtg ctcgagctgc agactgcagc gaacgcaacc 1620 ctgcagcagt cgtcgggaca aggtggaaac cgggagacgg cccggaagcg ccagcaacgg 1680 ctgaggcggc gagagcggga acggcagcag caacagcagc agcagcagca gcagcagcag 1740 cagcagcagc agcagcagca gcagcaacgc cagcagcagc agcagtgtca gcagcagcgt 1800 cagcagcagc cgcagcaaca acagctgcag cagccgcagc agcagctttg gacaacggtg 1860 gtaagaggcc gcccgtccca gcggcatcgt caaccgcagc agcagcagca gcagcaacag 1920 cagcaaggtg aacgctatgt tccaccacag ctccggcagc agcgacagca gcagcagcgc 1980 ccgaggcagc agcagcagca gcagcaacag cagcagcagc agcaaggtga gcgctatgtc 2040 ccaccacagc tccggcagca acgacagcag cagcagcatc agcagcagca gcagcagcag 2100 cagcagcagc gtcagcagca gcagcgtcag cagcagcgtc agcagcagca gcgtcagcag 2160 cagcagcagc agcagcagca gcagcgtcag cagcagcagc gtcagcagca gcagcagcag 2220 cagcagcagc accaacagca gcagcagcaa tggcagcagc agcagcagca acagcagcag 2280 ccgcggcaaa gtttgcctca tcgcaaacag acgcagctgc agctttctcc acgactgcag 2340 cagcaacagc agcaacagca gcagtcgcag caacaacagc agcagcagcc gcagcagctg 2400 ctctggacaa cggtggtaag aagctgcccg tcccagcggc aacgccaact gcagcagcaa 2460 cagcagcaac agcagcagca gcagcaaggt gagcgctatg tcccaccaca gctccggcag 2520 caacgacagc 
agcagcagcc gcagcagcag cagcaacagc gtccgcagca acagcgaccg 2580 cagcaacagc gacctcagca gcagcgatca cagcagcgaa agccggccaa gcccgagctt 2640 atcgaggtat cacccaatga aggtcaggat tgggagagcc ttctgctgct tgtgcaaacg 2700 gcagttagga ctgacgagcg ttacaagccg cttaaggacc acgtcgtcct gggccgccgc 2760 accagtaagg cgttgctgcg actcacgctc agccgcaagg cgaatgcgca gtatatgctg 2820 cagcaggtcc ctgccatcgt gggcagtgct ggagtgtgtc ggcacgtcac ggaaatggcg 2880 tcactggtca ttcatgacgt cgacccgcta gcccgagagg acgatctcac ttcgctgatt 2940 gacagcaagt tcgagtcggg agcgggaatt gtgtcgacca caatgacaaa gatggccgat 3000 ggtacacagc gtgcgtacgt gcgattgcct gcaatgtttg tgagtgaact cgacggcacc 3060 aagataaagt tgggattttg cgtcagtaaa gtcagagccg cgccaccgac ccctcgagag 3120 cgtgtgcgct gctatcgctg tctcgagctg ggtcattggg cccatgactg ccgttcaccc 3180 gacgaccggc agaacatgtg catacgctgc ggcgttgtgg ggcacatggc aaaggtatgc 3240 acttctcagc caaagtgcct caagtgcggt ggtccacaca caattggaca ccccgactgc 3300 gcccggtcgg ccttgcaatg accccacaac tgcgagtaat gcaggtgaac ttgggcagag 3360 gagagagggc ccaggacatc gccctccaaa ctgcccaaga gaagagagtg gacgtgctgc 3420 tgctgttgga gctgtatcga ccgcctgcca acaacggtag atgggccttc gactgttcga 3480 agaaggttgc catcgtcgca actgggtctc ttcctctcca gaggatttgg tgcagcaaca 3540 caccgggact cgtcgctgcc gagataggcg gcaccacttt cctcagctgt tacgctccac 3600 ctcgtcagcc caccgacgag ttcgagcgct tcattgaagc agtacagctc gagacgctta 3660 cccattcaca agtcgtcatt gccggcgact tcaacgcctg gcatgtggaa tggggaagcg 3720 agcgtaacag cgagaaggga gaagagctgc tcagtgccat ccagcagcta gacctggttg 3780 tgctaaatca gggcacgacg agcaccttcg acggcaacgg agcggcaaca gcgagtatcg 3840 ttgacgtggc gtttgcgaca ccaaccatcg cgcagccggg aacgtggaac gtgtgcggtg 3900 attactcgta ctccgaccac cggtacatca cgtacactgt tggcaccata gttcccgtcg 3960 taaacgagcc ctcatcacca cggatgagac atcaggggcg cattcgacac gcggatcggc 4020 ggtataaggc gacgcagttc tcgcagcgag ccttccgagc gcggttctca gaacgggcgg 4080 tcagtcacga acgcatggtt gagatcatgc tcgccacgtg tgacaaaacg atgcagcggg 4140 ttacaacgtc gcatagtgac ccccatcgtg acctgttctg gtggacgccg ctgctcaggc 4200 tgctgcgaga gaactgcgat 
cgcgcccgcg atcggatgcg gcagaccagc gatcttcaag 4260 agcggagcat tgccgcagcg gaacatcgca cagcgagggc ggagctgggg aaggcgataa 4320 aggccagcaa gaggaactcg ttccaggagc tgatcgatat cgccgaagaa aatgtgtttg 4380 gagccggata tctcgtcgtt ctgtcccacc tccgtggtgg acggacgcca cccgagacgg 4440 agcgggacag gctcgaacac atcgtgtccg atctcttccc ccagcacccg cccctcgtct 4500 ggccagaagc ggcagacatc gagggagagg agcagccagg agcagtagca gatgtttcgg 4560 acgatgagct caaactcatc gcacgtcgca tggccaataa aaaggccccg ggactcgatg 4620 gtatcccgaa tgcggcggtg aaagcagcca tcctcgagca cacgggggtt ttcacagcgt 4680 tgtaccagga ctgcctcgtt aacggcacgt ttcctgcagc gtggaagagg cagcgccttg 4740 tactcatccc gaagccagga aaaccctccg gagtgagctg ctcgtaccgg cccctgtgta 4800 tgttagatgc actgggcaag gtgcttgaac gcttgatcct gaacaggctg cacgagttcc 4860 tagaagatcc ggaatcaccg cgactgtcgg accggcagta tggtttccgc agagggtgct 4920 cgaccatcgg tctcattcag agggttgttg aggccggcca gcgtgcgatg tcgttcggtc 4980 gagcgaaccg acgcgacaaa cggttccttc tagttgctgc gctagatgtg aggaacgcgt 5040 ttaacacggc cagctggcag gccatcgcca ctgcgctgcg gacgaaacgt gttcccgccg 5100 gcctccaacg tatcatacac agctatttcc aggaccggga gctggtgtat gaaacctccg 5160 aaggcccggt agtgcggtcc gtcacggcag gggttccaca ggggtctatc ttgggcccca 5220 ccctgtggaa cacgatgtac gacggtgtgt tggacatcgc cctgccaccc gatgcggaga 5280 tcctggggta tgccgacgac ctggtgctgc tggtcccagg cacaacccca gacaacgtga 5340 aagctgctgc ggaagaggcg ataatatcag tgatggagtg gatggctcga caccacctcg 5400 agctggcgcc ggcgaaaacg gagatggtcg tgatctccag caccaaagcc ccaacgcgga 5460 tcaccgtccg agtaggtgac gtggacgtca cctcgtcccg ctcgatccgc tatctcggtg 5520 tgaccctcca ggacaagttg tcatggctgc cgcacgtcaa ggaggtcacc gagagggctg 5580 ggaagatcgc cgacgccaca tccagactgc tgcgaaacca tagcgaacca agggcatcga 5640 aagcgaagct gctagcttcg gtgtccgagt ccgttatgcg ttatgcagca ccggtatgga 5700 gcaaggagct gcaaaaacgt gagcctggtc gcctgctgga gcgtgttcag cgaaagatgg 5760 cactgagggt ggcacgagca ttccgtaccg tgaggtatga gactgccacc ctcctagctg 5820 gtctgacccc catctgcctg ctgttggatg aggacgcccg ggtctatcag cgactaagtg 5880 ccgtcaaccg caccgacacg agggcgaaca 
tccggaagca ggagcgacag gccacgatcg 5940 aacaatggca gcaacagtgg gacgcggaag ccgacaccag ccggcacacg cgttgggcgc 6000 accgtgtgct acccaacatc ggcagctggc agtcaaggaa acacggagat gtgtcgttcc 6060 atctgtgcca ggtactctcg ggacatggct tcttccggga ctacctgtgt cgcaatggct 6120 tcacatcgtc ccctgactgt cagcggtgca gcggcgtccc tgagaccgcg gagcacgcga 6180 tgttcgagtg cccgaggttt gctgaagttc gtcagcagct actcggcgag ggaattacgg 6240 acccggtccg tccggaaaac ctccagcagc acctgttgcg cgatgccgaa agctggagcc 6300 gtatctgtga agctgctaag cggataacgg cttcacttca gcaagcctgg gacgacgaga 6360 gagcagccct agcagcccat ggcaacgagc agcacttcga agaagttgcc gatctggagg 6420 cacggcgagc agaaatccgt cgagcacgga acgaccggcg aaatgcgagc cgccgagcag 6480 ccagggcacg gcaacgagag ttgcagcgag caggacgtcc cccatctcca ccaccatcgc 6540 ccagaactgc ggcacgtcgt gcagatcttc ggctgcggca agcgcggttt agagcgagaa 6600 ggcgtcaagc gatataggac gcgagacgcc tgtatgggga gcacagtcct gcatcaacat 6660 cgtcatcaag cagcagcgac gacgattcag acggccgagg aagcgcggat atcgcagccg 6720 gaccgtctgg aatgcgcaac cgcgcacatg aacgcgaaaa cgaagccacg gacggtggcc 6780 tgagtgctgc agaagaagcc gcagcggtcg aggcggaagt tgcctcccgc tagacgttct 6840 gccttcttta gaaaagaagc gtctactagc aagaacgagt gctctactaa gggagaatag 6900 aagttgaact gaattgaaca ataaaaaaaa cggaaggtgc atcttgcacg gaataggttg 6960 aggcaattaa gtctagcatc cccctgcagg gtacgccctc gcgggtaata atgtaggggt 7020 gagggagggt ctgaattcca ctgaataaag aaacccgctt tgaaaaaaaa aa 7072 // ID BEL6-I_AG repbase; DNA; ANG; 5286 BP. XX AC . XX DT 08-MAY-2003 (Rel. 8.04, Created) DT 08-MAY-2003 (Rel. 8.04, Last updated, Version 1) XX DE BEL6-I_AG is an internal portion of the BEL6_AG LTR DE retrotransposon - a consensus sequence. XX KW BEL; LTR Retrotransposon; Transposable Element; 5-bp TSD; KW BEL6-I_AG; BEL6-LTR_AG; BEL6_AG; Bel clade; RNase H; integrase; KW peptidase; reverse transcriptase. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. 
XX RN [1] RP 1-5286 RA Kapitonov V.V., Pavlicek A. and Jurka J.; RT "BEL6_AG, a family of Bel/Pao-like LTR retrotransposons from RT African malaria mosquito."; RL Repbase Reports 3(4), 71-71 (2003). XX DR [1] (Consensus) XX CC BEL6_AG is a young family of Bel/Pao-like LTR retrotransposons. CC BEL6-I_AG, an internal portion of BEL6_AG, is flanked by CC BEL6-LTR_AG LTRs. CC The BEL6-I_AG consensus sequence was reconstructed based on CC multiple alignment of 13 copies; they are ~0.3% divergent from CC the consensus sequence. Some copies may be active. CC The consensus sequence encodes one 1733-aa BEL6_AGp Bel-like CC protein (pos. 65-5263). CC BEL6_AGp is composed of the peptidase (pos. 150-275), reverse CC transcriptase, RNase H (pos. 1090-1180) and integrase (pos. CC 1450-1600) domains. XX FH Key Location/Qualifiers FT CDS 65..5263 FT /product="BEL6_AGp" FT /translation="MTRTPRTSRTPTVKPAPQPGWVDVIVAVRPHVHMRDL FT AYGELIRIRDNIVQAKEEGKALTAVQCKVFGKKADSALEEHNKHYGEIIKH FT DESEVHDAKFKETVKLHEEVMMEIESASEALAAQSISPKALIPAPSVSSVI FT VNAPLPRPIPSFSGKYEEWARFKTIFKDIVDKSSEDSRIKLYHLEKALSGE FT AAKVIDEKTINDGNYERAWQLLSERYDNKRRMVDLHISGLLNLKKVNEESY FT VGLRGLVESVESHVENLKYLGENFTGLSCAMVIHLIANALDIETKKLWEAS FT VPRNELPCYEKTLCFLKERVSVLERCQGNVAAGQRGRFVSKGPSSSGVTPM FT RSNAATTVCSVVVCELCKGAHETFKCSELMRLHGKDRESFIRSKRLCLICL FT KPGHWRNRCKSHLNCNRCHGPHNTVLHWDNGPEPVSREPEQVSASQHVNNT FT DAAGSIVLLQTVLLYAELPNKEVLLCRAMLDSGSQISCVTEALATKLGVNL FT VNVNVPVKGIGNIESSVKKKCTFTVKSRCSDFAMDVTCFVYREITGRIPSV FT YFDTSKWNLPDKSMLADPYFNNPSCVDILLGMDCLSEIMESGSVKLAKTLP FT MMTDTHFGWVIGGRVAEIHKEREVYTNVVTKENLETIVQRFWEVEDVTSER FT SVKVEDEICEEHFVKTHYRESSGRYVVQLPLRESISQLCCSRSVALRRFYL FT LETKLLKNPSLREQYQAFMSEYEELGHCRVVDESEDDGSVKRWYLPHHAVL FT NPAKNTTKCRVVFDASAKVNGLSLNDVMMTGPKVQHDLLSIKLRFRMPRYV FT VSADISKMFRQIKVDPCDSPLQRVFWRASPNEQLRVLELTTVTYGTAAAPF FT LATRTLLQLARDEREGFPLASRIIEENFYVDDGLFGANDIETVHAAQVQLI FT EVFRKAGMTLHKWSANDERLLESIPFEDRDALTKIGDCEANEIIKTLGLMW FT NPMNDEFIFLTKVPSKGRTPTKREVLSAIAKIFDPLGLISPVVVLAKILMQ FT 
KLWLAKLDWDDQISEPLIEEWDNFLEALPTENQVRIPRHVVSTNAVSLEIH FT GFADASLKAYGACVYIRSIERNGEAQLRLVISKSRVAPLSNVTIPRMELLA FT ASLLCRLVKKVLEALKDFNFETINLWSDSQIVLAWLKKPIECLNVFVRNRV FT AEINENRAFIWRYVRTHQNPADVISRGQSASLLVSNEMWWNGPEFLRTCEI FT PNVCIDELNDDEIPELRNEVICNVIVPLKVLPILEKYESFRKTQRILAYIV FT RFKQNTKRRKGERINDPNPTIPELRESMRWIIRAIQHQDLPEVVSAVKNSK FT PLQRYQDLNLFLDGELLRVGGRIRHANLAFGNKHQLLLPNRNVITHRLIAT FT IHRENLHVGPSGVIAILRQQFWVVNARSTVRMVLHKCITCFRSKPTTLEQQ FT MGDLPSYRVTAAPTFQRVGLDFAGPIMLKSGIRRVAAIKGYICVFVCMVTK FT AIHLEAVEDLSTGAFLSALRRFVSRRGIPEEIFSDNATNFVGAKNELRELY FT EMFRKEATGQGIFQFCQEKEIVWKMIPPGAPHFGGIWEAGVKSVKSVLKKI FT YKSASLTITDFSTLLCQIEAILNSRPLYAHSNDPNDLECLTPAHFTIDRPL FT IGVAEPSYLDMPEGRLNKWQRIQQLRQQFWNRWQREYLCELQTRYKWTKIR FT DNVKEGALVLIKEDNTPPQLWKLGRIARVFPGEDGLIRVVDVKTRSGEFKR FT PVHKLALLPVIDP" XX SQ Sequence 5286 BP; 1547 A; 946 C; 1376 G; 1417 T; 0 other; ttggtccttc gaacttcgga tattgaccct ttgtgctttg gatatcggta aaaatcgcgt 60 catcatgact cgaacgcctc ggacgtctcg cactccaacc gtaaagccgg cgccacaacc 120 cggttgggtg gatgtaattg ttgctgtaag gccgcatgtg cacatgcgtg accttgcgta 180 tggcgaattg attcgcataa gggacaacat tgtgcaggcg aaggaagagg gaaaggcgct 240 gaccgcggta cagtgcaaag tgttcgggaa gaaggcagat agtgctttag aagaacacaa 300 caaacattat ggtgagatca taaaacatga cgaaagtgaa gtgcatgacg ccaagtttaa 360 ggagacagtg aaattgcatg aggaggtgat gatggaaata gaaagtgcat ctgaagccct 420 tgcggcacag tccatttcac caaaggcatt gatacctgct cctagcgttt cgtcagtgat 480 tgtgaatgct ccgcttcctc ggccaatacc ttcgttcagt ggcaagtatg aggagtgggc 540 gcggtttaaa acaatattca aggatattgt tgacaaaagt agtgaagatt cgcgtatcaa 600 gttgtatcat cttgaaaagg ccttgagtgg tgaggctgcc aaggtaatag atgagaaaac 660 aataaacgac ggtaattatg aacgtgcgtg gcaactgtta tccgaacgtt atgataacaa 720 acggcgtatg gtagacttac atattagtgg acttttgaat ttgaaaaaag tgaatgagga 780 aagttatgta ggtttgcgcg gtttggttga atcggttgaa agtcacgtgg aaaatctaaa 840 gtatttaggt gaaaacttta ccggtctgtc gtgtgcgatg gtgatacatt tgatagcgaa 900 tgcgttagat attgaaacaa agaaactatg ggaagcgagt gtccctagaa atgaacttcc 960 gtgttatgaa aaaaccttgt 
gcttcctaaa agaaagagtg tcggttctag aaaggtgcca 1020 aggaaatgtt gcagcggggc aaaggggacg ttttgtttcg aaagggccca gctcaagtgg 1080 tgttacgcct atgagatcga atgcagctac aaccgtgtgc agtgtagttg tgtgtgaatt 1140 gtgtaagggt gcccatgaaa cgtttaagtg ctctgagctg atgcgactgc atggaaagga 1200 tcgggaaagt tttattagat cgaagcgtct ttgcttaatt tgcctaaaac cgggacattg 1260 gcgaaatcgt tgcaaatcgc atttgaattg taacaggtgc cacggaccac acaacacggt 1320 attgcattgg gacaacggtc cagaaccggt ttcaagggaa cccgaacaag tgagcgcgag 1380 ccagcatgtg aacaataccg atgcggcggg aagtattgtt ctcttgcaaa ccgtcttgct 1440 ttacgcggaa ttaccaaaca aagaggtgtt gttgtgtcgt gcgatgttgg acagtggatc 1500 gcaaatcagt tgtgtgactg aagcacttgc aacgaaatta ggcgtaaatt tggtaaacgt 1560 gaacgtaccg gttaagggga ttggcaacat tgaatcttcg gtgaagaaaa agtgtacttt 1620 tacagtgaaa tctcggtgta gtgattttgc tatggacgtg acatgttttg tgtatcgtga 1680 aataacaggt agaataccct ctgtgtattt cgatacgtcg aagtggaatt tgccggataa 1740 atcgatgttg gccgatccat atttcaacaa tccaagttgt gtagacattt tattgggaat 1800 ggattgtctt tcggaaataa tggaatctgg ttcagtgaaa ttggcaaaaa cacttcccat 1860 gatgacggac actcatttcg gatgggtgat aggtggacgt gttgcggaaa tacacaaaga 1920 gcgtgaggtg tacacaaatg ttgtgacaaa agaaaatttg gaaacaatag tgcaaaggtt 1980 ttgggaagtt gaagatgtga caagtgagcg ttccgtgaag gtggaagacg agatatgtga 2040 agaacatttt gtgaagacac attatcgtga atccagtggc aggtatgttg tgcagttgcc 2100 tttaagggaa tcgataagcc aattgtgttg ttcgcgaagt gtagcgctgc gtaggtttta 2160 tttgttggaa acaaaactgt tgaaaaatcc atcgttacgt gaacaatatc aagcgtttat 2220 gagtgaatac gaggagttag ggcattgtag agtggttgat gaaagtgagg acgatggctc 2280 cgtgaaaagg tggtatctgc cgcatcatgc tgtattgaac ccggccaaga acactaccaa 2340 gtgccgtgtg gtgttcgatg cttcggcgaa ggtaaacggc ttgtctttga acgacgttat 2400 gatgaccggc cctaaggtac aacacgactt actttccatc aagctgcgtt ttagaatgcc 2460 tcgatatgtg gtaagtgccg atatatctaa aatgtttcga cagataaagg tggatccctg 2520 cgacagcccg ctacaacgag tattctggag agcttcgccg aatgaacaac tgcgtgtgct 2580 tgagcttacc acggtgacct acggaacggc agcagcacct tttctagcaa cgcgaacgtt 2640 gttgcagcta gcgagagatg aacgagaagg 
ttttccctta gccagtcgca ttattgaaga 2700 gaatttttat gtcgacgatg gattgtttgg ggcaaatgat atcgaaactg tccatgctgc 2760 acaagtgcaa ctcattgagg tgtttagaaa ggcgggtatg acacttcata aatggtcagc 2820 gaatgatgaa aggcttttgg aatcgatacc atttgaagat cgggatgctc taacaaaaat 2880 cggcgattgt gaagctaacg aaatcatcaa aacgttaggt ttgatgtgga atccgatgaa 2940 tgatgaattt atatttttga caaaggtacc gtctaagggc agaacaccta ccaaacgaga 3000 ggttctgtct gccattgcaa aaatatttga cccattgggg cttatttctc cagtggttgt 3060 gttggcaaag attttgatgc aaaagctatg gttagcaaag ttggactggg atgatcaaat 3120 ttctgagccg ttgatagaag aatgggataa ttttttggaa gcattgccaa cagaaaatca 3180 agttcgaatt cctcgacatg tagtgagtac aaatgcagtt tcattagaga tccatggatt 3240 tgcggatgct tctttgaagg catatggagc ctgcgtctat ataagatcaa tagagaggaa 3300 tggtgaagcg caattgagat tagtgatcag caaatctcgg gttgcccctt tatcgaacgt 3360 gacaattcct agaatggaac tgcttgcagc ttcattgctt tgtcgtctag tgaaaaaagt 3420 actggaagca ttaaaggatt ttaatttcga aacgattaac ttgtggtcgg atagccaaat 3480 agttctggca tggttgaaaa agccgataga gtgcctgaat gtttttgtac gaaatcgtgt 3540 agccgagatc aacgagaacc gagcattcat ttggcgttat gttcggacac atcaaaaccc 3600 tgcggatgtt atatcaaggg gtcaatcggc gtcgctactg gtttcaaatg agatgtggtg 3660 gaatggtccg gagtttttac gtacttgtga gataccaaat gtttgcatcg atgaactaaa 3720 tgatgatgaa attccggaat tgcgcaacga ggttatctgc aatgttatcg taccgctgaa 3780 agttctccca attttggaga aatatgaatc ctttagaaag acacaaagaa tacttgctta 3840 catcgtgcgg ttcaaacaaa acacaaagcg tcgcaagggc gaacgcatta atgatccaaa 3900 ccctactatt cccgaattgc gagaatcgat gcgttggatc attagagcaa ttcagcatca 3960 ggatctaccg gaagtagttt cggccgtaaa aaatagcaaa ccgttgcaac gttatcaaga 4020 ccttaacctt tttttggatg gtgagttact gcgggtagga ggtcgtataa gacatgcaaa 4080 tctggctttt ggaaacaagc atcaacttct tcttccaaat cgaaatgtga ttacacatcg 4140 tttaattgca acaattcatc gtgaaaattt gcacgtcggg ccatctggag tgattgcgat 4200 tcttcgtcag caattttggg ttgtcaatgc gcgatcaacg gttcgaatgg ttctgcataa 4260 atgcattacc tgtttccgaa gcaagccaac tactttggaa caacagatgg gtgatcttcc 4320 tagctaccgt gtaacagctg ctcctacctt tcaacgagtt 
gggcttgact ttgctgggcc 4380 aattatgctg aaatccggta tacgtcgtgt tgcggcaata aagggataca tatgtgtatt 4440 cgtgtgtatg gtgactaagg cgatccattt ggaagcagtg gaggacttat caactggtgc 4500 ctttctctca gctttgagac gatttgtatc aagaagagga atacctgaag aaatcttcag 4560 tgataatgcg acaaattttg tcggagcaaa aaatgaacta cgagagctgt atgagatgtt 4620 taggaaggag gcgacaggcc aaggcatttt ccagttttgc caagaaaagg agatcgtatg 4680 gaaaatgata ccccctggtg cccctcactt tggaggaatc tgggaggcag gggtcaagag 4740 tgttaagagt gtactgaaga agatttataa atctgcatct ctaacaataa ctgattttag 4800 cacgcttcta tgccaaattg aagccatctt aaactcaagg ccattgtatg ctcactcaaa 4860 tgatcctaac gatttggaat gccttactcc agctcatttt acaattgatc gtcccttgat 4920 tggagttgca gaaccatcat atttagatat gcctgaaggt cgcttgaata aatggcagag 4980 aatccaacaa ctgcggcagc aattctggaa tagatggcaa agggagtatt tgtgcgaact 5040 gcaaactaga tacaagtgga ccaaaataag ggacaacgta aaggagggag cattggttct 5100 aataaaggag gataatacgc ccccgcaatt gtggaagctg ggacgcattg caagggtgtt 5160 cccgggcgag gacggattga tccgggtagt ggatgtgaag acaaggagtg gtgagtttaa 5220 acgtcctgtt cataaattag ctcttcttcc agttatagat ccgtagcatc aacctggccg 5280 ggagga 5286 // ID Clu-87B_AG repbase; DNA; ANG; 851 BP. XX AC . XX DT 03-SEP-2010 (Rel. 15.09, Created) DT 03-SEP-2010 (Rel. 15.09, Last updated, Version -1) XX DE Putative non-autonomous DNA transposon: consensus. XX KW DNA transposon; Transposable Element; Clu-87B_AG. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-851 RA Fernandez Medina R.D., Struchiner C.J. and Ribeiro J.M.C.; RT "Transposable elements from Anopheles gambiae."; RL Repbase Reports 10(9), 1448-1448 (2010). XX DR [1] (Consensus) XX CC 8bp TSD. ~94% identical to consensus. 
XX SQ Sequence 851 BP; 298 A; 126 C; 114 G; 313 T; 0 other; ggtaaaagta aggctcagcc ctctacgtta cgctagcgtt acgcgtaacg cctgttaatc 60 acaaacccaa gcagcatttt tgaacgtttg ttttaaccgc cgtggatgag caaattaata 120 gataatcatg tcttattatt gtttactttt atatgataat tagaaaagct ttgttgataa 180 ttagaaaagt tgtcacaact tcaagaatta taattttaac aatcttataa ataaaagatg 240 taaataataa aaaaatgatt atttatttat ttacccagtt attgataaag tagtataatg 300 aaatacttac aagtatttat taacttcatt aacctcgttt taaaaagagc cctcgcatac 360 ctacaagtaa tttttattaa cttcattaac ctcattttaa aaagagccct cgcatctaaa 420 gtaattttta ttcaaaatgg ccggaaacat agattgaaaa ggtaatagtt atatttcata 480 tatataattt aattaaaaat attttaatat tttatatttt ctaatcatag gaaatgttac 540 tccaattgaa agcttttggg gtgttcatga gcttccttct ttcagacttt ttttcattaa 600 tcataatgca acaatgaaaa atattaaacc tgttatttaa ttgttattta aaaattaaat 660 ctccttgaaa tcggttcatc ttacagcaat taactttttt ttcatctcaa tttaaaatat 720 tttttattca aataatgtac actggtcggt cgcctggatg acaaaactgt agtaaaaatg 780 ctgcttggga atattttaag ggatgtaacg cttctccaat gtaacgtaga gggctgagcc 840 ttacttttac c 851 // ID COPIA1-LTR_AG repbase; DNA; ANG; 153 BP. XX AC . XX DT 03-APR-2003 (Rel. 8.03, Created) DT 03-APR-2003 (Rel. 8.03, Last updated, Version 1) XX DE COPIA1-LTR_AG is a long terminal repeat of the COPIA1_AG LTR DE retrotransposon - a consensus sequence. XX KW Copia; LTR Retrotransposon; Transposable Element; 5-bp TSD; KW COPIA1-I_AG; COPIA1-LTR_AG; COPIA1_AG; Copia clade; KW reverse transcriptase. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-153 RA Kapitonov V.V., Pavlicek A. and Jurka J.; RT "COPIA1_AG, a family of copia-like LTR retrotransposons from RT African malaria mosquito."; RL Repbase Reports 3(3), 50-50 (2003). XX DR [1] (Consensus) XX CC COPIA1-LTR_AG is a long terminal repeat of the COPIA1_AG LTR CC retrotransposon. 
There are ~20 copies of COPIA1-LTR_AG in CC the genome. XX SQ Sequence 153 BP; 48 A; 26 C; 22 G; 57 T; 0 other; tgttgagaaa gcaacatgtg tgcaatgaac acatcatagt agtctttaga aaatgtttaa 60 tattgtcagt ctagtttaaa atacattaca tacaagttca ttgtttagtt tctgctcctc 120 ttttattcca ctgtgttata actacgctca aca 153 // ID COPIA3-LTR_AG repbase; DNA; ANG; 110 BP. XX AC . XX DT 03-APR-2003 (Rel. 8.03, Created) DT 03-APR-2003 (Rel. 8.03, Last updated, Version 1) XX DE COPIA3-LTR_AG is a long terminal repeat of the COPIA3_AG LTR DE retrotransposon - a consensus sequence. XX KW Copia; LTR Retrotransposon; Transposable Element; 5-bp TSD; KW COPIA3-I_AG; COPIA3-LTR_AG; COPIA3_AG; Copia clade; KW reverse transcriptase. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-110 RA Pavlicek A., Kapitonov V.V. and Jurka J.; RT "COPIA3_AG, a family of copia-like LTR retrotransposons from RT African malaria mosquito."; RL Repbase Reports 3(3), 52-52 (2003). XX DR [1] (Consensus) XX CC COPIA3-LTR_AG is a long terminal repeat of the COPIA3_AG LTR CC retrotransposon. There are ~20 copies of COPIA3-LTR_AG in CC the genome. XX SQ Sequence 110 BP; 36 A; 19 C; 20 G; 35 T; 0 other; tgttgtgatc atagcaacct ctctagttta aacgaagtgt agtaggcttt attgaaatag 60 agaataaatc agtctgcatt ttcctttcgt acgaacaaga agttccaaca 110 // ID GYPSY9-LTR_AG repbase; DNA; ANG; 751 BP. XX AC . XX DT 03-OCT-2003 (Rel. 8.09, Created) DT 03-OCT-2003 (Rel. 8.09, Last updated, Version 1) XX DE GYPSY9-LTR_AG is an LTR of retrotransposon GYPSY9_AG - a DE consensus sequence. XX KW Gypsy; LTR Retrotransposon; Transposable Element; KW 4-bp TSD; GYPSY9_AG; GYPSY9-I_AG; GYPSY9-LTR_AG; Gypsy clade; KW mdg1 lineage. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. 
XX RN [1] RP 1-751 RA Tubio M.J., Costas C.J. and Naveira F.H.; RT "GYPSY9_AG, a member of the mdg1 lineage of the Ty3/gypsy group RT of LTR retrotransposons in Anopheles gambiae."; RL Repbase Reports 3(9), 178-178 (2003). XX DR [1] (Consensus) XX CC GYPSY9-LTR is a long terminal repeat of GYPSY9_AG CC (its internal portion is deposited as GYPSY9-I_AG). XX SQ Sequence 751 BP; 269 A; 189 C; 118 G; 175 T; 0 other; tgtagcatat ctgctataca caacttatta taatacacac tatcagctat agacacattg 60 taacacactt tggaaagcca aaccacacca ttgtaacaca ttcattataa cacattcagt 120 aaatcagatc ataccataca cacccggaag agctcaggcc acaccgttca tataaaaaca 180 attgtgacac acttaacaga accaaccgaa acgtacaaca taagcaggaa cagtttatgc 240 attacaaccc aaagcaggaa cagtataagc attaaaccca aaacaggaac ttagtgacaa 300 gaaccatcgt tcttgtcaac aaaagacaaa cgtaactagc tgagtgaaaa caataacttt 360 ccaacataca acccggaacc aaatgtgaga aacccttaac ttcttttgag tataaataaa 420 accaactccg atcatggcaa gtcagattcg ttcggactgt caggatagga ctatgcccat 480 cctacatcta tcgacttctc atttaaagag tttagttatg tccccctacc caaggtaagg 540 cctctaaact ccaacgtttc cagtaaggga aacctcctaa aggtttctat gatcgcctgg 600 acgatcaagt gtcgaaaagt ccgctcgaac gaagttttaa tccacctcgg ctcatgtcag 660 taaaaactga cctaccgcga cggtactcga agtacagttg tacgatcatc gcattacatg 720 gcgaccgtag ctatcatccg aatacattac a 751 // ID Waldo1_AG repbase; DNA; ANG; 5580 BP. XX AC AB090814; XX DT 09-JUL-2004 (Rel. 9.06, Created) DT 24-SEP-2010 (Rel. 15.1, Last updated, Version 2) XX DE Anopheles gambiae retrotransposon Waldo1_AG DNA, complete DE sequence. XX KW R1; Non-LTR Retrotransposon; Transposable Element; gag; KW reverse-transcriptase; Waldo1_AG. XX NM Waldo1_AG. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RA Kojima K.K. and Fujiwara H.; RT "Evolution of target specificity in R1 clade non-LTR RT retrotransposons."; RL Mol. Biol. Evol 20(3), 351-361 (2003). 
XX DR Genbank; AB090814; Positions 1 5580. XX FH Key Location/Qualifiers FT CDS 605..2101 FT /product="Waldo1_AG_1p" FT /translation="MRIHVPCAVSVLGCAYTQRTKCAACLDSPDGMNGNES FT LHPRPLGSALKDIGAFFGRSSKTPRSPPNDNVQGSVSPAVDVVEVMPEEQT FT SASMECQETSHPIKEQGFEVSASKLQEALMVARELHTYTKDRNNVHAPIKK FT MSVSILSALSCIERELLTMKLRAERAEKALREVQSEPPETPMTGKRSRKAR FT TPEEAEDAKRAKNDAPSCNRPDAEYSEGVKNSENGELWSTVVSKKAQRKKK FT MGTMAEGKQTRAGEHNGPVKPVPRRPKTEAILVETTEVSTHKDILRKLKAD FT PELQSFGKQVVRIRSTKNGGLLFELKKSDQTECESFSGKIQQAIGEAGNVK FT SLGQMETVEIRFIDEETEAADVERDLRNQITGLEGYKVEVTMKTSFSGMQT FT ALVKLPVKLVSVVTGAGKVQIGWSVCPVRINIPSRRCYRCWQTDHISQDCC FT GPDRRDCCLRGGEKGHFAATCRLPPRCVLCPDGSNAHHSSGAFCPAAKKTA FT PWK" FT CDS 2095..5241 FT /product="Waldo1_AG_2p" FT /note="endonuclease and reverse transcriptase." FT /translation="MEVAQINLNHCEEAQALLSQVMVEEIGDIAIVSEPYS FT APTGSSSWVADKTGNAAIWVTGTIQRVVSNTFEGFCIAEVNGVFFCSCYAP FT PSWELERFHVMLDNLVAELDGHRPLVIAGDFNAWAVEWGSKRTNSRGDAVL FT ESFARLGVTLGNAGTTPTFNRNKRTSIVDITFCSTTLSERLNWRVSDALTL FT SDHNVVRYAIIQGHRLTSSSAHGSRVGGRGWKTESFNEDFFKELLAFGDFG FT EAVSASQIVIALTKASDGAMPRRKPPNGATSRRQPVYWWNASIKIQRAECV FT AARRKMQRERCPEVKQQLRIVYIAARSELQRAIKASKRQHFLKLCDEIARK FT PWGLAFNTLMNKVKSSEPVEQCPVKLKSIIETLFPTHPTINTPETDPVPIT FT RQEIINLANRLKCGNAPGIDGIPNMAIKAAMLAYPDVFKKRAPPNLSPTIN FT TPETDPVPITRQEIINLANRLKCGKAPGIDGIPNMAIKAAMLAYPDVFKNV FT LMNTLTTGQFPSMWKIQKLVLIPKPGKPPGHPSAFRPLGLVDNLAKVQEMV FT ILDRLTKYTEGPHGLSDRQFGFRKKRSTVDAILAVLEKGIAAFQRKRCGAR FT YCALITIDVKNAFNSASWEEIAAAVERMKIPPHLCRLSRNYLDGRVLQYDT FT AEGVKTNAIPAGVPQGSVLGLTLWNVIYDGVLTLALPHGVEITGFADDIAI FT TVSAVSIEEVEMLATDAVGRIDRSMRDAKLVIAHAKTEFIVISSHKVHQKA FT SIMAGTVQVESTRSLKYLGVVIDDRLKFKSHLEEACKKVMKAINALAAFTP FT NIGGPCSSIRRLHANCAISVLRYGAPVWAHILKEKQHQNTVNKVHRKLAMR FT VTSAYRTISYEAVCVIASMMPLCITLEEDSKNFRKSRAGESFTETAKKASR FT QASMRQWQNEWSNSLNGRWTYLLIPDVGAWLDRKHGDVDYFVTQVLSGHGC FT FRSYLHRFNRASSSRCPACKDEDETVDHVMFHCPRFAEERLQLNESCRVEV FT GCSNLVQVMLQHTDSWEAAATTMRLILTKLHQKWKQDQQLGDI" XX SQ Sequence 5580 BP; 1512 A; 1273 C; 1549 G; 1246 T; 0 other; gttgagtgat 
gagttggagt tgagtcatga aggacgtgct tttgctgtcg cttctgatcg 60 aagctgtttt tatagtgttt tagtgcgatt cgctggtgct tttcctggac aaggaatttg 120 gttctgcctg tggtttccgt gcgtgtagcg ggttctggcc gcgggccttt ttctgctgag 180 gggctgaagc gcggaacgtg ctaatcttgt gcaacgcggg tgaagccgcc actatagaac 240 aaggtcgtgt tagaaggaag tgcggttgtt tgtagtggtg ttgccgtgga aagtatctcg 300 tgcatttagg atacacactg taccacttaa atattgattg tgcgcgtgaa gtatagttcg 360 gtattacggg tgcccctacg cacacttgag tttagcgagt agtgtaattg tgctggtgaa 420 gctagcagca cccagtgcgt gagtgactcg aacgtaagtt ctgtgagttt cagtgactta 480 tcgttgcgtt cgttgcgttg tgtgtggtat ccaaacacca ccggtatact gtgtgcgtag 540 tgagtgaatt acgatcatcg agtgcgtgag tgcctttgag tttacgtgcg gaagaagcat 600 taagatgcgc atacacgtac cgtgcgcggt tagcgtctta gggtgcgcat acactcaaag 660 aactaaatgc gctgcgtgtc tggattcgcc tgacggaatg aacggaaacg aatcgcttca 720 tcctcgtccg ctagggtctg cccttaagga cataggcgcc ttttttggtc gtagcagcaa 780 gacgccaaga tcgcccccga atgataatgt gcagggttct gtttcaccgg cggtggatgt 840 cgttgaagtt atgccggaag agcagacttc agccagcatg gagtgccagg aaacttccca 900 tcctatcaag gaacagggtt tcgaggttag tgccagcaaa ctgcaagaag ctctgatggt 960 agcaagggag ctgcatacgt atacgaagga tcggaacaat gtacatgctc cgataaaaaa 1020 gatgtcagtg agcatcctct cggcgttatc gtgtatcgaa cgggagctgc tgactatgaa 1080 gttgcgagcg gagagggctg aaaaggcgct tcgcgaggtc caatcggaac ctccagaaac 1140 acctatgact gggaagagaa gtaggaaagc gagaacgcca gaggaagcag aggacgctaa 1200 acgagcaaaa aacgatgctc cctcttgtaa ccgtccagat gcagagtaca gcgaaggggt 1260 taaaaactca gagaacggtg agctatggag cacagttgtc agcaaaaagg cccaacgcaa 1320 gaagaagatg ggaaccatgg cggagggtaa gcaaacaaga gcgggtgaac acaacggccc 1380 agttaaaccc gtaccgcgac gaccaaaaac ggaggcaatc ctagttgaga caactgaagt 1440 atcaacgcac aaagacatcc tccgcaagct taaagctgac cctgagctac agtcgttcgg 1500 caaacaagtt gttcgaataa gaagcacaaa aaatggagga ttgctatttg agcttaagaa 1560 aagtgatcaa acggaatgcg aaagcttttc cggaaagatt caacaagcca ttggtgaagc 1620 tggcaacgta aagtctttgg gacaaatgga gacagtagaa attcgtttca tcgacgaaga 1680 aacggaagca gctgacgtag agagggatct 
gagaaaccag ataactggcc tcgaaggcta 1740 taaggtggag gtaacgatga agacttcctt ctctggtatg cagaccgctc tagtgaaact 1800 cccggtaaag ctggtgtcgg tggttacagg agcaggaaaa gtgcaaatcg gttggtcagt 1860 ctgcccggtg cgtataaata taccgagcag aagatgttat cgctgctggc aaaccgacca 1920 catttctcag gactgttgtg gaccagacag aagggactgc tgcctgcgtg gcggggaaaa 1980 agggcacttc gctgctacgt gtcgtctgcc accacgatgt gtgctttgtc cagatggatc 2040 caacgcgcat cactctagcg gagcattctg tccggcggct aagaaaacag caccatggaa 2100 gtagcccaga taaacctaaa ccattgtgaa gaggcacagg cactactgag ccaggtgatg 2160 gtggaggaga taggagatat tgccatagtc tctgagccat acagcgctcc aacaggctct 2220 agtagctggg tggcagataa gactgggaac gctgcaatat gggtgacagg cacaatacaa 2280 cgggtagtat ctaacacctt cgagggtttc tgcatagccg aagtaaatgg agtgtttttc 2340 tgcagctgtt atgctccccc aagttgggag ctagagaggt tccatgttat gttagacaac 2400 ctcgtagcag agctggatgg gcatagacca cttgtaattg ccggtgattt caatgcctgg 2460 gctgttgaat gggggagcaa gcgaactaat agcagaggtg atgctgttct cgaaagcttc 2520 gcccggctgg gagtaacgct gggaaatgcc ggcacaaccc ccacgttcaa cagaaacaag 2580 aggacatcaa tcgttgatat tacgttctgc agtaccaccc tctcggagag attgaactgg 2640 cgagttagtg atgcactcac tctcagcgac cataatgttg tcagatacgc catcatccaa 2700 gggcataggc taacatcatc atcagcgcat gggtcccgag ttggtggtag aggctggaaa 2760 accgaatcct tcaatgagga tttctttaag gagcttttag ctttcggaga tttcggcgaa 2820 gccgtgagcg cctctcaaat cgtgatagcc cttaccaaag cctctgatgg cgctatgcca 2880 agaagaaaac ctcctaacgg ggcgacaagc cgtcggcagc ctgtgtactg gtggaacgcg 2940 tcgattaaaa tacaacgggc tgaatgcgta gcagcacggc gcaaaatgca gcgagaaaga 3000 tgccctgaag taaaacaaca actcaggatc gtatacattg cggctaggtc agaacttcaa 3060 agagcgatta aagccagtaa aaggcaacac ttcctcaagc tatgcgacga aatcgctcgg 3120 aagccctggg gacttgcctt caatacactc atgaataagg taaagtcttc ggaacctgta 3180 gaacagtgtc ctgtaaagct gaagagcatt attgaaactc tgtttccaac acatcctacg 3240 atcaacacgc cagagactga ccctgtgcct ataacgagac aggagattat taacctggcc 3300 aatcgtttaa aatgtggcaa tgctcctggg atagatggca tccccaatat ggcgatcaag 3360 gctgctatgt tggcataccc tgatgtgttc aaaaaacgtg 
caccaccgaa cctcagtcct 3420 acgatcaaca cgccagagac tgaccctgtg cctataacga gacaggagat tattaacctg 3480 gccaatcgtt taaaatgtgg caaagctcct ggaatagatg gcatccccaa tatggcgatc 3540 aaggctgcta tgttggcata ccctgatgtg ttcaaaaacg tgctaatgaa caccctgact 3600 accggccagt ttccgtcaat gtggaaaata cagaaattag tcttgatacc gaagccaggt 3660 aagccgccag gccacccgtc agctttcagg cccttgggac ttgtggataa cctggccaaa 3720 gtgcaagaga tggtgatttt ggaccggctt acaaaataca ctgagggacc acacggtctg 3780 tctgatcgcc aattcggctt ccggaaaaaa cgatctacag tggacgcgat actggcggtg 3840 ctggaaaaag gtattgcggc gtttcaacgg aagcgatgtg gagctcgata ctgtgcgctc 3900 atcaccattg acgtgaagaa tgccttcaac agtgctagtt gggaggaaat tgcggcagcg 3960 gtagagcgca tgaagattcc cccgcacttg tgcaggctgt cgaggaatta tcttgacggt 4020 cgcgtgctgc agtatgatac ggcggaagga gtgaaaacca atgccatccc cgcaggcgta 4080 ccccagggat cagtgcttgg tctcaccctg tggaacgtca tatatgatgg agtactgacc 4140 ctcgccctcc ctcatggtgt cgagatcact ggttttgcag acgacatcgc tatcactgtc 4200 tcagctgtgt ctattgagga ggtagagatg ctcgccaccg acgcagtcgg tcggattgac 4260 cggtcgatgc gggacgcaaa gctagtgatt gcgcatgcga agactgagtt catcgtcatc 4320 agcagtcaca aggtgcacca aaaagcatcg ataatggcag gaactgtaca agtcgagtct 4380 accagatcgc tgaagtatct tggggtagtc attgatgacc gactgaaatt taagagccac 4440 ctcgaggaag cctgcaagaa ggtcatgaaa gcaataaatg cactagcggc attcactcca 4500 aacattggcg gaccgtgtag cagcattagg cgccttcatg ctaactgcgc catctcggtg 4560 ttgaggtacg gagcgccggt ttgggcgcat atactaaagg agaagcagca ccagaacacc 4620 gtgaataagg tgcacaggaa gttggccatg cgtgttacta gcgcgtaccg taccatttcg 4680 tacgaagcgg tatgcgttat tgcgagcatg atgcctcttt gcatcaccct cgaggaggac 4740 tcgaagaact tccggaaatc gcgtgcgggt gaatctttca ctgagactgc caagaaagcc 4800 tcaaggcaag catcgatgcg gcaatggcag aacgagtgga gtaactcgtt gaacggaaga 4860 tggacctact tgctgatccc cgacgttgga gcatggctag acaggaaaca tggtgatgtg 4920 gactactttg tcacccaggt tctttccggc catggctgtt ttagaagcta tctgcacagg 4980 ttcaatcgcg cctcttcatc tcggtgccct gcgtgcaagg atgaagacga gacggtggac 5040 cacgtcatgt tccactgccc tcgattcgcc gaggaacgcc tgcagttgaa 
cgagagttgc 5100 agagtagaag tgggctgttc taacctggta caagtcatgc tgcagcacac cgactcatgg 5160 gaggcagcgg caacaacaat gcgtctgatc ctgaccaagc tgcaccaaaa atggaagcaa 5220 gaccagcagc tcggcgatat ttaaattcgt cgtgagtgta tgtgttagtg aacgcgtgag 5280 tgcgcggcgg aaaaagtgtc gtttagtgtc tagtgtttcg cgtcgtcgtc atgaccgtct 5340 tgtgttcgcg tgataactgt caaatctgtc tcgcgaactt tggctcatcg tgcagtagcg 5400 aggatgaaat gctaatgcat aagcccttcc ccaagaagca taccgaaagg tgaacccatg 5460 gggaagggta tatggcccaa ggagggggtt tactgggtaa gaatcccatg tcaacacccg 5520 tgcgacaacg ggagtctttc gaagattccc cctccttgta gaacaaaaaa aaaaaaaaaa 5580 // ID Mariner-N14_AG repbase; DNA; ANG; 244 BP. XX AC . XX DT 31-MAY-2005 (Rel. 10.05, Created) DT 31-MAY-2005 (Rel. 10.05, Last updated, Version 1) XX DE Mariner-N14_AG is a nonautonomous DNA transposon - a consensus DE sequence. XX KW Mariner/Tc1; DNA transposon; Transposable Element; Nonautonomous; KW mariner; Interspersed repeat; Mariner-N14_AG; KW nonautonomous DNA transposon; mariner/Tc1 superfamily. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-244 RA Kapitonov V.V. and Jurka J.; RT "Mariner-N14_AG: a family of nonautonomous mariner/Tc1-like DNA RT transposons from African malaria mosquito."; RL Repbase Reports 5(5), 122-122 (2005). XX DR [1] (Consensus) XX CC Mariner-N14_AG copies are ~98% identical to the consensus CC sequence. They are flanked by the TA target site duplications. CC This element forms a palindrome. Classification: a nonautonomous CC Mariner/Tc1-like DNA transposon. 
XX SQ Sequence 244 BP; 75 A; 53 C; 51 G; 65 T; 0 other; cagggttttc caggagttct catagctgtg ggacacttta ttgactcttt cttacgtgaa 60 atgaacttaa tgtaatggga attggactct atagcaccct tgttggacaa atccaatagg 120 aatttccaag gagcctgtcc aaaaagggtg ccatagagtc caatttccat catatgaagt 180 tcacttccca taagaaagag tcaaggaagt gtcccacaac tatgagaacc cctggaaaac 240 cctg 244 // ID Ag-CR1-22 repbase; DNA; ANG; 4007 BP. XX AC . XX DT 29-OCT-2010 (Rel. 15.1, Created) DT 29-OCT-2010 (Rel. 15.1, Last updated, Version 2) XX DE A CR1 clade non-LTR retrotransposon family from Anopheles DE gambiae. XX KW CR1; Non-LTR Retrotransposon; Transposable Element; Ag-CR1-22. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-4007 RA Biedler J. and Tu Z.; RT "Non-LTR Retrotransposons in the African Malaria Mosquito, RT Anopheles gambiae: Unprecedented Diversity and Evidence of Recent RT Activity."; RL Molecular Biology and Evolution 20(11), 1811-1825 (2003). XX RN [2] RP 1-4007 RA Kojima K.K. and Jurka J.; RT "CR1 clade non-LTR retrotransposons from Anopheles gambiae."; RL Direct Submission to Repbase Update (24-SEP-2010). XX DR [2] (Consensus) XX CC [2] Consensus update. This consensus is generated from 5 CC sequences with >98% identity. XX FH Key Location/Qualifiers FT CDS 3..833 FT /product="Ag-CR1-22_1p" FT /translation="KEIRTGLSQLNESQKRTPLSSHPPAKLRRVLPAKRLF FT SELVSATNTSPAISLVAKRTTNINNNNINNNIAVPSLAEHNNINITNNITS FT RQPIAIYGTDNDSPNLPTVQNRFVPKMWLHLANIAPGTSCEQVVESVKRRL FT ATTDVIAFSLMGKNFDTTLARPMSFKVRIPAHLRETALSSSVWPSNLRVRE FT FILRDNRPASSHPLACLTPMNAHETPLXSINAVPIDNIPKPAEQTINEKPN FT SSSPSNVQPIVHNMEHSPIPADRTAQNNDVHHEHSS" FT CDS 837..3896 FT /product="Ag-CR1-22_2p" FT /note="apurinic-like endonuclease and reverse FT transcriptase."
FT /translation="SLRTQEASASRYSHTNTSCNDHARTNTATHLKMFSVR FT DIKNISHAKPTDTALKPCDIHFHYQNVRGLRTKLTEFGLNTLDAKYQIIIL FT TETWLDNSIPSSLLFDPGYSVYRCDRSLSNSMHTRGGGALIACSSLLNTRE FT LTLPRNSLEQVWVTVRLSSCTIFIGTVYIPPNRANDTDTLTSVIESVSAIV FT NHAKTNDLIFLFGDFNFSAVSWSLAQQTFGHSSSFAHYVANSSSSVRSRFL FT DEINASDLYQISCIKNSLNRQLDLVFANFIAATYCIDLHPCPTPLVSPDLY FT HPAIDLTVRIPSTMPNPTLTQASRPRMNFYKTNFTRLDELITNFNSQFRIS FT QFNSVDEALSIFIRFLLSSFDSCVPIITPKSGPAWSDRHLKKLKRAKEAAL FT KKYTKYRSPVNKSILNETTRLYRSYNKIRYGGYLSRLEKNFIKRPKALWSF FT RKKHNSTNVTPKSIYHNDRAGSTASDMCDIFACRFEEAYTISTTDDNIITN FT ALINTPNDVIDLNLTNISQESVLNVLKAVKPKASPGPDGIPAIILSRCCES FT IAPIFTKIFNFSLQQGEFPSIWKSSWIVPIFKKGDKEDAANYRGITSLSAG FT AKVFEKVIQTSLLTASYHYISPQQHGFVPKRSTITNLVDFTSQCLRNMDSR FT IQTDAVYMDLKAAFDSIDHNILLAKMNKLGLGRNLICWLESYLVNRSYAVS FT FSRCHSRSFIASSGVPQGSNLGPLLFVLFINDLSYVVPAANHLMYADDIKI FT YMTIKNDTDHSELQQYLLDFHQWCNRNRLSLCIEKCRVISFTRAREIKNTS FT YTVNGLPIERTNTMKDLGVILDSKLSFDQQTEEVITRGNQLLGMLFRITRD FT FKDPVCIKALYCGIIRPVLEYACVVWKPSTQRLSERMESIQRRLSRYATRL FT LPWPPGSQLPPYESRLLLLGLQPLHIRRRTAQQLFVAGLLQNNIDCPSLLQ FT QLNFYAPAVQLRSRPLLALNRSRTGYGSRDPLNSMIRVFNEVRHLFDFGTS FT VQSFKNCLRSELLR" XX SQ Sequence 4007 BP; 1212 A; 974 C; 714 G; 1104 T; 3 other; caaaagaaat tcgtactggt ctatcgcagc ttaatgaatc gcagaagaga acaccactga 60 gcagtcaccc tcctgccaaa ttgcgcagag ttttaccagc gaagagatta ttctccgagt 120 tggtcagtgc tacaaacact tcacctgcta tctcgcttgt cgccaaaaga acaacaaaca 180 tcaataacaa caatatcaac aataacatag ccgtaccgtc actagctgag cacaacaata 240 ttaacatcac aaataacatc acttctcgcc agccaattgc tatttatggc acggataacg 300 attctccaaa ccttccaacc gttcaaaacc gctttgtacc caaaatgtgg cttcatttag 360 ctaacattgc tcctggcact tcctgtgaac aagtcgtgga atctgtgaaa cgacggctgg 420 ccacaaccga cgtcattgca tttagcttga tgggcaaaaa tttcgacact actttggcta 480 gaccgatgtc gtttaaggtt cgcatcccgg ctcaccttcg tgaaactgcg ttatcgtcct 540 cggtttggcc ttctaatcta cgtgtacgtg aatttatact acgtgataac cgcccagcmt 600 catcccatcc tttggcttgt ctcacaccga tgaatgcaca tgaaacccca cttmaaagca 660 tcaatgcagt gcctatcgat aatattccga aaccggcaga 
gcaaaccatc aatgaaaagc 720 caaatagttc atcacctagt aatgtacaac caatagtgca caatatggag cattcaccta 780 tcccagctga tcgtacggct caaaacaatg acgttcatca cgaacacagc tcatgaagct 840 tacgcacaca agaagcatca gccagtcgct attcgcacac taacacttca tgtaatgacc 900 acgcacgcac taacacagca actcacttga aaatgttttc agttcgcgat attaaaaaca 960 tttcacatgc caaaccgact gacactgctt tgaaaccatg tgatatacat tttcactacc 1020 aaaatgttag aggactacgt accaaactca cagaatttgg attgaacaca ttggacgcta 1080 aatatcagat aataatactc accgaaacct ggcttgacaa ttcaattcca tcatcgctac 1140 tgtttgaccc gggctattcg gtgtatagat gtgatcgtag cttgtctaat agtatgcata 1200 ctcgtggtgg tggcgcactg atcgcatgct catccttatt gaacacccgt gaactcactc 1260 tacctcgtaa ctcacttgag caagtgtggg ttaccgttcg cctatcatct tgcacgatat 1320 tcatcgggac tgtttacatt cctcctaatc gagctaacga cacagatact cttacatctg 1380 tgattgaaag tgtaagtgca atagtaaacc atgcaaaaac taatgatcta atatttttgt 1440 ttggagattt taatttttcg gcagtatcgt ggtcactagc acaacaaaca tttggtcact 1500 catcatcgtt cgcgcactat gtggcaaatt catcatcttc ggtcagatcc mggtttttgg 1560 atgaaattaa cgctagcgat ctttatcaaa taagctgtat caaaaactcg cttaaccggc 1620 agctcgactt agtattcgca aattttattg ccgcaactta ctgcatcgat ttgcacccct 1680 gcccaacacc actagtctct cctgacctgt accatccagc catcgattta acggttcgaa 1740 ttccaagcac aatgcctaat cctacactga cgcaagcaag tagacctagg atgaattttt 1800 ataaaaccaa ttttactcgg cttgatgagc tcataactaa ttttaatagc cagttcagaa 1860 ttagccagtt taattctgtc gatgaagctt tatcgatatt tatacgcttt cttttatcct 1920 cgtttgattc gtgtgtacca attatcacac caaagagcgg ccctgcttgg tctgaccgcc 1980 acctaaaaaa acttaaaaga gctaaggaag ctgcattaaa aaaatatacg aagtaccgat 2040 cacctgttaa caaaagcata ttaaatgaaa caactcgctt gtaccgtagt tacaataaaa 2100 tacgctacgg tggctacttg tctcggttgg agaagaattt tatcaaaagg cccaaagctc 2160 tctggagttt tcgtaagaaa cacaatagta caaatgtgac acctaaatcg atttaccaca 2220 atgaccgagc tggttcaact gcttcagata tgtgtgatat atttgcgtgt agattcgagg 2280 aggcatacac aatctcaact actgacgaca atattatcac caacgcttta attaatactc 2340 ctaatgatgt catcgacctt aatttaacta atatatcgca agaatctgtt 
ctaaatgtgc 2400 tcaaagcagt caagcccaaa gctagtcccg gcccagatgg tattcctgcc atcattctca 2460 gccgatgctg tgagtccatt gcgccgatct tcaccaaaat ttttaatttt tcgctgcaac 2520 aaggagaatt cccctctatt tggaagagtt catggatagt ccccatattt aaaaaaggtg 2580 acaaggaaga tgcagccaac tatagaggca taacatcact cagtgctgga gccaaagttt 2640 ttgagaaagt gatccaaaca agtttactca cggcttcata tcactatatt agccctcaac 2700 aacacggttt cgttccaaag cggtcgacga ttactaacct tgttgacttt acatcacaat 2760 gtctccgtaa catggatagc aggattcaaa cagatgcagt ttatatggat ttaaaggctg 2820 cttttgatag catcgatcac aacatccttt tagccaagat gaacaagttg ggacttgggc 2880 gaaatttaat atgctggtta gaatcgtacc ttgtcaatcg atcgtatgct gtatctttct 2940 cacggtgcca ttcaaggtct ttcattgcat catctggtgt gccacaaggt agcaacctgg 3000 gtccactact gtttgtactg ttcatcaacg acctatcgta tgtagtacca gcagcgaatc 3060 atttaatgta tgcagatgac attaaaatct acatgacgat aaaaaatgac acagaccact 3120 ccgagctaca acaatacctt ctcgattttc accaatggtg caatcgaaac cgcctgagcc 3180 tctgcattga aaaatgccga gtcatttcat tcacacgcgc acgagaaatt aagaatacca 3240 gctacacagt taacggttta ccaatagaac gcactaatac tatgaaggat ctcggagtca 3300 tacttgattc aaagctgtca tttgaccagc aaacagagga ggtgattacc cgaggtaatc 3360 aattgcttgg catgttgttt cggataacaa gagatttcaa agacccagtc tgcatcaaag 3420 cgttatactg tggtattatc cgcccggtac tcgaatacgc ctgtgtcgtc tggaagccct 3480 caacccaaag actttctgag agaatggaat cgatacagcg tcgcttatcc cgatacgcaa 3540 cacgcttgtt accttggcca cctggcagtc agttaccacc ttatgaatcg cgacttctgc 3600 tcctcggact gcaaccgctc cacatccgac gccgtaccgc acaacaacta tttgtggctg 3660 gtttactgca aaataacatc gactgtcctt ccctactcca gcagttaaac ttttacgcgc 3720 ctgctgttca gcttcgctct cgtccgttac ttgccctgaa ccgtagtcga actggatatg 3780 gatctagaga ccccctcaac tccatgatta gggtgtttaa cgaagttcgt catttgtttg 3840 attttggaac ctcagtgcaa tcctttaaaa actgccttag aagcgaatta ttaagatgat 3900 cttactactt attttaatct aatattagac ttaagaacat tcacacggat ccattgttat 3960 ccattgaacg agtaataaac aaattaaaca aattaaacaa attaaac 4007 // ID RTE-3_AG repbase; DNA; ANG; 1597 BP. XX AC . XX DT 28-FEB-2009 (Rel. 
14.02, Created) DT 01-MAR-2009 (Rel. 14.02, Last updated, Version 1) XX DE RTE-like non-LTR retrotransposon - a consensus sequence. XX KW RTE; Non-LTR Retrotransposon; Transposable Element; RTE-3_AG. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-1597 RA Jurka J.; RT "RTE-like non-LTR retrotransposons from African malaria RT mosquito."; RL Repbase Reports 9(2), 648-648 (2009). XX DR [1] (Consensus) XX CC It is likely to be 5'-truncated. XX FH Key Location/Qualifiers FT CDS 122..1564 FT /product="RTE-3_AG_1p" FT /translation="MGPERLTVEMHQLIVKVWEQEELPEEWKLGVIHPVYK FT KGDRLDCSNFRAITVLNAAYKILSQILFCRLAPLATNFVGSYQAGFVGGKS FT TTDQIFTLRQILQKCRERQIPTHHLFIDFKAAYDTIDRKELWSIMQRYHFP FT GKLIRLLEATMNGVQCKVRVSNLTSESFESHRGLRQGDGLSCLLFNIALEG FT VIRGAGLDNDIRGTILYRSLQFLGFADDIDIIGRTTAKVCEAYTRLKREAA FT RIGLRINATKTKYLLAGDSDHLGSSVLVDGDSLEVVKEFCYLGTVVTSDND FT ISSEIRRRIVQGNRAYYGLHRLLRSRRLRARTKCEIYRTLIRPVVLYGHES FT WTIRAEDANALGVFERRILRTIFGGVFEHGAWRRRMNHELAELYGEPSILT FT VAKAGRIRWLGHVMRMPDSCPTKKVFDSDPQFGIRRRGAQRTRWLDQVKRD FT LSEIGCLHGWEAAARDRASWRXIVDRAMSHRRALS*" XX SQ Sequence 1597 BP; 388 A; 410 C; 487 G; 311 T; 1 other; cagcatcggc gcagacgacg aggtgccccc gccatctctg gatgagattg ccagcgccat 60 caagcagctt aagagcaata agtctgccgg cagcgatgga ctggcggccg agctcttcaa 120 gatggggccg gagaggctta ccgtcgaaat gcatcagctg atcgtgaaag tctgggagca 180 ggaggaacta ccggaggagt ggaagctggg tgttattcac ccagtctaca aaaagggcga 240 caggctggat tgctcgaatt ttcgagccat cacagtcctt aatgccgcct acaagatcct 300 gtcccagatc ctgttctgca gacttgcgcc ccttgctaca aattttgtcg gcagctacca 360 agctgggttt gttggaggca aatccaccac cgaccaaatt ttcactctac ggcagatcct 420 ccagaagtgc cgagagcgcc agatcccaac gcaccacctg ttcatcgact tcaaggcggc 480 ctacgacacc atagaccgga aggagctatg gagcatcatg cagcggtacc acttccctgg 540 gaagttgatc cggctgttag aggccaccat gaacggggtg cagtgcaagg tgagagtatc 600 gaacttgacg tcggaatcgt tcgaatctca 
caggggtctg aggcaaggtg acggactctc 660 ctgtctgctc ttcaacatcg ccctggaagg tgtcattcga ggcgcggggc tagacaacga 720 catccgtggc acgatcctct accggtctct ccaatttctt ggcttcgcgg atgacatcga 780 catcatcggc aggacaacag cgaaggtgtg tgaggcgtac acccgactca aacgcgaagc 840 agcaagaatt ggattgagaa tcaatgcgac gaagacgaag tacctgcttg ccggtgactc 900 agaccatctg ggaagcagtg tattagttga cggcgacagt ctcgaggtag taaaggagtt 960 ttgctatctc gggacggtcg ttacttcgga caacgacatc agcagcgaaa tccggagacg 1020 cattgtgcag gggaatcgtg catactatgg gcttcaccga ctgctgagat ccagaagact 1080 tcgagcccgc acgaaatgtg agatatatcg cacattgatt cgcccggtgg tcctctatgg 1140 acacgagtcc tggaccatcc gagcggagga tgcaaacgct ctgggcgtgt ttgagcgacg 1200 catcctccgg accatctttg gcggtgtgtt cgagcatgga gcgtggagga gaaggatgaa 1260 ccacgagctt gctgagctgt acggcgaacc gagcatcctg acggtggcga aggctggcag 1320 gatacgatgg ctggggcatg tcatgaggat gccggactca tgccccacca agaaggtgtt 1380 cgacagcgat ccccagttcg gcataaggcg caggggagca cagcgaactc gatggctgga 1440 ccaggtgaag cgagacctgt cggagatcgg gtgtctgcat ggatgggagg ctgcagccag 1500 ggaccgagca tcctggagaa tkattgttga ccgggccatg tcacatcgac gtgctctatc 1560 gtgagcaggc caacaagaga gagagagaga gagagag 1597 // ID GYPSY26-LTR_AG repbase; DNA; ANG; 195 BP. XX AC . XX DT 05-FEB-2004 (Rel. 9.01, Created) DT 05-FEB-2004 (Rel. 9.01, Last updated, Version 1) XX DE GYPSY26-LTR_AG is an LTR of retrotransposon GYPSY26_AG - a DE consensus sequence. XX KW Gypsy; LTR Retrotransposon; Transposable Element; 5-bp TSD; KW GYPSY26-I_AG; GYPSY26-LTR_AG; GYPSY26_AG; Gypsy clade; KW mag lineage. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-195 RA Tubio M.J., Costas C.J. and Naveira F.H.; RT "GYPSY26_AG, a member of the mag lineage of the Ty3/gypsy group RT of LTR retrotransposons in Anopheles gambiae."; RL Repbase Reports 4(1), 18-18 (2004). 
XX DR [1] (Consensus) XX CC GYPSY26-LTR_AG is a long terminal repeat of GYPSY26_AG (its CC internal CC portion is deposited as GYPSY26-I_AG). XX SQ Sequence 195 BP; 63 A; 43 C; 48 G; 41 T; 0 other; tgttatatac acctcgtaca caccgtgcgt atccaaccac gatgtgtaaa ccgctgtgac 60 agttgacgcg cggcggcaga tatggaagtg agggaagaag aaaatacaaa gaggaatttg 120 gcggcacgct ctctcgcgta cgaacaacca aacagagtgg tcgtgtttca ttcaatatat 180 ccgaaagata taaca 195 // ID MARINERN12_AG repbase; DNA; ANG; 801 BP. XX AC . XX DT 03-APR-2003 (Rel. 8.03, Created) DT 03-APR-2003 (Rel. 8.03, Last updated, Version 1) XX DE MARINERN12_AG is a nonautonomous DNA transposon - a consensus DE sequence. XX KW Mariner/Tc1; DNA transposon; Transposable Element; Nonautonomous; KW MARINERN12_AG; nonautonomous DNA transposon; KW mariner/Tc1 superfamily. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-801 RA Kapitonov V.V. and Jurka J.; RT "MARINERN12_AG: a family of nonautonomous mariner/Tc1-like DNA RT transposons from African malaria mosquito."; RL Repbase Reports 3(3), 63-63 (2003). XX DR [1] (Consensus) XX CC There are ~20 copies of MARINERN12_AG in the genome, CC they are ~11% identical to the consensus sequence. CC MARINERN12_AG copies are flanked by the TA target site CC duplications. CC The consensus sequence forms an imperfect palindrome. CC Putative classification: a nonautonomous Mariner/Tc1-like DNA CC transposon. 
XX SQ Sequence 801 BP; 143 A; 235 C; 239 G; 174 T; 10 other; gtgatgggcg tttcggagcg caccaacggc tccggagccg gctccggact aacggctccg 60 gagccggctc cgactttttc ccggagccgg ttccgctccm acggvtcgac ggctccggag 120 ccggctccgc tccgacggct ccggagccgg ctccgcacca acggcccgga gccggctccg 180 caccaacggc tccgtagccg gctccggact aacggctctg cactcacggc tccgttccaa 240 cggctgcgga gccggcttcg ggcccaccgt tccggagccg gcgacgaatc aacggctccg 300 gaccaacggc tccgaactaa cggttcaata gagacatcat tgtatgatag cgtaataaaa 360 aaacgaatat atcttctcgc acgtgtggaa gctaacacac gatgtgattg cagtattgac 420 atgcatgacg ttgtgtagca ttcgttggtc tcgcctttct taacaatatt caaaagtmat 480 cgatgttttt tttttgataa attcttttaa acatgaccta ctccgctatt tacggatttc 540 ctcgattgaa acttcattga aaaagttmtt ttggtccttt gtgttaaaac cggctccgga 600 gccgttggag cggagccgtt ggtgcggagc cggctccgga gccgttgggt gcggagccgg 660 ctccgragtc gtgggwgcgg agccggctct ggagccgtgg gtgcggagcc ggctccggag 720 ccgtbggtgc ggagccggct ccggarcytg tstggagccg gtcggagccg gctccggctt 780 cggagccgtt ttgcccatca c 801 // ID MTANGA_LTR repbase; DNA; ANG; 119 BP. XX AC AF387862; XX DT 09-MAR-2006 (Rel. 11.02, Created) DT 09-MAR-2006 (Rel. 11.02, Last updated, Version 1) XX DE Anopheles gambiae mtanga retrotransposon (LTR portion). XX KW Copia; LTR Retrotransposon; Transposable Element; MTANGA_LTR. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-119 RA Rohr C.J., Ranson H., Wang X. and Besansky N.J.; RT "Structure and evolution of mtanga, a retrotransposon actively RT expressed on the Y chromosome of the African malaria vector RT Anopheles gambiae."; RL Mol Biol Evol 19(2), 149-162 (2002). XX DR EMBL/GenBank/DDBJ; AF387862; Positions 1 119. 
XX SQ Sequence 119 BP; 39 A; 20 C; 21 G; 39 T; 0 other; tgaaggaatc caagcatgca tgattgattc ttcatcatag tagtcattta aatgttcaat 60 aaaggttagt tcgctttcat cactaggaga gaatagttca catttactgc gttacaaca 119 // ID RETRO20_AG_LTR repbase; DNA; ANG; 223 BP. XX AC . XX DT 06-FEB-2003 (Rel. 8.01, Created) DT 06-FEB-2003 (Rel. 8.01, Last updated, Version 1) XX DE Anopheles gambiae long terminal repeat from RETRO20_AG DE retrotransposon - a consensus. XX KW BEL; LTR Retrotransposon; Transposable Element; KW Long terminal repeat; NINJA; RETRO20_AG_I; RETRO20_AG_LTR; KW retrotransposon. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-223 RA Jurka J. and Drazkiewicz A.; RT "RETRO20_AG_LTR: LTR retrotransposon from Anopheles gambiae."; RL Repbase Reports 2(12), 6-6 (2002). XX DR [1] (Consensus) XX CC Related to NINJA of Drosophila simulans. 5 bp target site CC duplication. XX SQ Sequence 223 BP; 69 A; 45 C; 55 G; 54 T; 0 other; tgttaggata gctcgaacga actcggcttt caattgtaaa cgcgtgaagt ggcaggtcgt 60 tagcggaaga taagaacagg aaacacgatt atccgggcgg acttgatcgg tctcttgatc 120 ggtgaacgcc agactaatcg gacgtgtttt taaaccctac gctacaaaat aaagttaaag 180 gttctaaaaa atcctaaagt cttgtttacc gggagccgca aca 223 // ID TC1N-2_AG repbase; DNA; ANG; 766 BP. XX AC . XX DT 16-JUN-2003 (Rel. 8.05, Created) DT 16-JUN-2003 (Rel. 8.05, Last updated, Version 1) XX DE TC1N-2_AG is a nonautonomous DNA transposon - a consensus DE sequence. XX KW Mariner/Tc1; DNA transposon; Transposable Element; Nonautonomous; KW HATN6_AG; TC1N-2_AG; mariner/Tc1 superfamily; KW nonautonomous DNA transposon. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-766 RA Kapitonov V.V. 
and Jurka J.; RT "TC1N-2_AG, a family of nonautonomous mariner/Tc1-like DNA RT transposons from African malaria mosquito."; RL Repbase Reports 3(5), 98-98 (2003). XX DR [1] (Consensus) XX CC There are ~100 copies of TC1N-2_AG in the genome. CC They are ~93% identical to the consensus sequence. CC TC1N-2_AG copies are flanked by the TA target site duplications. CC This element has 337-bp terminal inverted repeats. CC Classification: a nonautonomous Tc1-like DNA transposon. CC One subfamily of TC1N-2_AG is composed of ~10 copies that CC harbor an insertion of HATN6_AG element at the same position. CC Presumably, the last element was multiplied as a portion of CC a new composite TC1N-2_AG-like element. XX SQ Sequence 766 BP; 215 A; 166 C; 165 G; 211 T; 9 other; cacctgtttc catctacgtt cgaatctcgt gccgtttgca tcgaatcgca aaccgcctgc 60 ttcgaatcgc atgccactac gatcgaatgc gtgcacccat ggttgaatcg cgtgccagta 120 cggttgaatc gcgttccact atggtcgaat catatgccac tacgatcgaa tcgtgtgccg 180 ctaccattgg atttctgaaa gcaagagcat agtaaactgt ttgtttacgt ttgaaagctg 240 tttcyttctt cgccgacggc atttcatttg taaaatagta ctaataaaag ggatttattc 300 tctgattata ggtaaacccc ttgttatcga cgaatattgt gnaaccntgt tgatnntaan 360 atctttacga tcaacaaaca ccgaatrtta gttgaagatt ttaccatcaa caagggacca 420 wggkttcaac cgatattcgt cgataacaag gattttacgt gtaatcagtg aataactgcc 480 ttttattagt attttacaaa cgaaatgccg tcggcgaaga aagaaacagc tttcaaacgt 540 aaacaaacag tttactatgc tcttgctttc agaaatccaa tggtagcggc acacgattcg 600 atcgtagtgg catatgattc gaccatagtg gaacgcgatt caaccgtact ggcacgcgat 660 tcaaccatgg gtgcacgcat tcgatcgtag tggcatgcga ttcgaagcag gcggtttgcg 720 attcgatgca aacggcacga gattcgaacg tagatggaaa caggtg 766 // ID DNA-4_AG repbase; DNA; ANG; 325 BP. XX AC . XX DT 04-SEP-2010 (Rel. 15.09, Created) DT 04-SEP-2010 (Rel. 15.09, Last updated, Version 1) XX DE Putative non-autonomous DNA transposon: consensus. XX KW DNA transposon; Transposable Element; DNA-4_AG. 
XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-325 RA Jurka J.; RT "Non-autonomous DNA transposons from mosquito."; RL Repbase Reports 10(9), 1429-1429 (2010). XX DR [1] (Consensus) XX CC TA tsd. >98% identical to consensus. Probably still active. CC Likely Mariner/Tc1. XX SQ Sequence 325 BP; 116 A; 60 C; 49 G; 100 T; 0 other; tacagtcttt ccccgagtta cgcgacactc gagttacgcg aattcgagtt acgcgaattt 60 ttatttttga cagttcagat gtcaaatcag tacaattttc tccatcaatt gtcaaatgaa 120 aaataattac cgataaataa ccaaattttc tacaccaatt gcatcaaata agtattaaat 180 tacataaatg aactaaattc aatcaaaaat tatgataaat aaagtatatt tttgctgtaa 240 catgtgatat tcgacttacg cgaaaatccg agttacgcga atgtctccgg aacgcattat 300 tcgcgtaact cggggaaaga ctgta 325 // ID Clu-111_AG repbase; DNA; ANG; 824 BP. XX AC . XX DT 03-SEP-2010 (Rel. 15.09, Created) DT 03-SEP-2010 (Rel. 15.09, Last updated, Version -1) XX DE Putative non-autonomous DNA transposon: consensus. XX KW DNA transposon; Transposable Element; nonautonomous; Clu-111_AG. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-824 RA Fernandez Medina R.D., Struchiner C.J. and Ribeiro J.M.C.; RT "Transposable elements from Anopheles gambiae."; RL Repbase Reports 10(9), 1449-1449 (2010). XX DR [1] (Consensus) XX CC TA TSD. 
XX SQ Sequence 824 BP; 227 A; 176 C; 176 G; 242 T; 3 other; tacacctgtt ctcactttcc tctgaatcgt tatccggttg ctatggatcg caatccatga 60 gaatggatcg caatcaacta cgtctgaatc gttttcagct acgtctggat cgttttcagc 120 tacgtttgga tcgctctcag ctacctttgg attttgcggt agctcgcgca aatttcgttg 180 cagcgcaacc gtttattcgg tgaccgcggg acacatttgc accccaagga aagaactaaa 240 caattaatgc attcaattgt aaaatctwat cgtatttttt atatgaagct tacatcacca 300 ttcatacgtt tttgattatt wgaccaagtt ttccaaacaa accwgcacaa tcttcttctt 360 tttggcgtaa cgacctacgt ggtcattcag tcctgttcag gctttcgagg tttagccgga 420 tattccttct ttggtatgga gggacggtct ggttgggaat ctattctgat gcgtagaagc 480 cgacaaatct gtcaaatgtt gttcattgct taaaaatggg ttttcaacac aagttctcaa 540 ttgggttcaa acatagaaca atacgaaatt gttgcttttc aaatcgtctg ttatgaggag 600 aatcgtgagg tacgccgaac ggtcactaaa taaactttgc gccgcaacga gatttgcgcg 660 agctaccgta aaatccaaag gtagctgaga gcgatccaaa cgtagctgaa aacgatccaa 720 acgtagctgc aaacgattca gacgtagttg attgcgatcc aatctcatgg attgcgatcc 780 atagcaaccg gataaagatt cagaggaaag tgagaacagg tgta 824 // ID BEL10-I_AG repbase; DNA; ANG; 5733 BP. XX AC . XX DT 03-APR-2003 (Rel. 8.03, Created) DT 03-APR-2003 (Rel. 8.03, Last updated, Version 1) XX DE BEL10-I_AG is an internal portion of the BEL10_AG LTR DE retrotransposon - a consensus sequence. XX KW BEL; LTR Retrotransposon; Transposable Element; 5-bp TSD; KW BEL10-I_AG; BEL10-LTR_AG; BEL10_AG; Bel clade; PHD domain; KW integrase; reverse transcriptase. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-5733 RA Kapitonov V.V., Pavlicek A. and Jurka J.; RT "BEL10_AG, a family of Bel/Pao-like LTR retrotransposons from RT African malaria mosquito."; RL Repbase Reports 3(3), 27-27 (2003). XX DR [1] (Consensus) XX CC BEL10_AG is a young family of Bel/Pao-like LTR retrotransposons. CC BEL10-I_AG, an internal portion of BEL1_AG is flanked by CC BEL10-LTR_AG CC LTRs. 
The BEL10-I_AG consensus sequence was reconstructed based CC on CC multiple alignment of 7 copies; they are less than 1% divergent from CC the consensus sequence. CC The consensus sequence encodes a 1746-aa BEL10_AGp Bel-like CC protein CC (pos. 451-5688). CC BEL10_AGp is composed of the PHD (pos. 32-77), protease, reverse CC transcriptase and integrase (pos. 1400-1550) domains. XX FH Key Location/Qualifiers FT CDS 451..5688 FT /product="BEL10_AGp" FT /translation="MLFCHTTVCRRIVVKMATKKSNYFVENKNGACRLCTE FT PNGDSPFVRCEECDRYFHLACAKLSAVPTAEEEWLCIKCQDIKIQYQQEEK FT KSTPDQAILELAMVLQNNSLEASRHIKKTTLVNLPEFDGKPQDWPHFKKTF FT EDTTIEAGFSKLENLNRLQRFVKGEAEKAVRALLLDPQNVPAIMGRLEEQF FT GRADQVYKELLKDVVKIKVEGQMKIMELSDALDNLVTNIKILGKRSYLNDP FT RLIDELLAKLSIDKQLNWAQHKASLEAANSEITLDQFNDWMRQISKALRSL FT PKRTDRPQNKVNVHKYHNHKPPRNSQQMNIAQNLHTVCRLCKGQHTLPECE FT IFLQKNISDRIAFTTQHHLCFTCLLSSEHVQRNCPLSKVCSVQGCIRRHHP FT LLHEVPQPIYYHYNDVRVYYQIVPITVRNGSVAMKTFAFLDAGSSLTLMEE FT DLANKLGLQGRQDPLTMTWTQNLTVEQDTSRRVQLTIVNDQGKEFLLKEVR FT TVKNLQLPQQTIDTNRLLALYPHLKGIKIKSFTNAYPTILIGLSHSHLIMP FT LDRRMGRPEEPMAIKTKLGWIIFGNEYTQPATRSDHFMVHKNEEIMNRMIQ FT QYFSTEDFGVKVTKPLVPKEIEQANKILEKTLEKKDGYYQVGLLWKPEVAH FT FPNSYPNALKRLVSLEKQLNKNRELQVWAKNTFADYIQKGYLRKLTPSQVA FT IATPKTFYLPHFVVVNRNKPIPKPRLVFDAAAEVNHISLNSQLLSGPDEMA FT SLFGVLLRFREGNICVTGDIQEMFHRVRIREEDQDSQRILWRDCENRCPDV FT YVMQVMTFGATCSPACAQVVKNTNAAAHAETHPLAVDPIVRQHYVDDYLDS FT FFTMEDAIETVRQVIEVHKAADFHIRNFTSNRGELLKTIPQDRVQQKTTSV FT QIEEKGQDYEKILGVHWNPTIDYFGFQVKMNKVTKDRPTMREVLSFVMSVY FT DPLGLISHVTIAGRILMRELHVITKDWDSTIPDELQEKWEECLRVVKSAEK FT IRIPRQLVITLNDPLELHTFVDASENAFAACVYARTVTKEGTVFINLVAGK FT ARVAPINPLSIPRLELQAAVLGVRLTESIRKELRLFVKHITYWSDSEVVLS FT WLKNRRKYAQFVAHRVGEILESSRADQWRWVPSRENPADIATKPYPNDWIW FT VNGPSFLRNNESDWPHKEIVETNEECRVVAVHQHAETAIPVENYSSWPRLL FT KHLTILKKFADFIRNRSAFTRSIQPNDVQLVRNGMYRSAQWEGFPEEMATL FT TKGQPVPKQSPLNKLSPFLDSDGIMRSRGRLENISTLPQSTRTPIILPQKP FT RLVKLLVRNFHERYLHQADNVVIGAIRQEYWIVNLRAVLKNIKKCCQKCIL FT KTAAPVAPFMAPLPEFRAHPYTPPFYNTGVDYFGPIEVQVKRSLEKRWGAI FT 
FSCMNTRAVHIELAEKLDTDSFMVCLKNFQNRRGKICNMYSDNGKNFVGAE FT RELRELVTEIDKRMGHESALKYEIKWHFNPPSAPHFGGVWERQIQNIKKGL FT RHMFSEWSHRHPTPETLRATLIEIEAMLNSRPLTHIPLENEEDEILTPFHF FT LIGRNGSHIPPVVNETSAANRQQYKLTQHYSKIFWDRWKKEYLPTLIRRNK FT WTNHVEPIKVGDIVVLFDDNAPPGKWIKGRIVKANMAPDGQVRSVEVKVGE FT NILKRPAVRVAVLNVEQKEHLLLHTTSQQHQGTNRIRPLQPNNDNNETKPP FT RKKTKIAPCHWAKQLLENNPPSTSTN" XX SQ Sequence 5733 BP; 1903 A; 1264 C; 1255 G; 1311 T; 0 other; ttggtggctc cagagaggag agcctcaccg tagatagtgt catcgcgtcg ttgttagacc 60 ggagtggacg agccgtgtag ctaccgtgct ccgtgttacc aagtgcgccg tggtcccgag 120 tgtccagccg caccggaagg gtcgagccca gagtgtgctt ctggaaccaa agcgtccagc 180 ccatccgtga cgtcgtcaca tcatcatcat cagaccggat tggaagaaag ctaagtccgt 240 gcttctggcc agagtgtcca gccatccgcg tcattgaagt gcgccgtggt cccgagtgtc 300 cagccgcatc aatttatgtt atttgtcata cgtccttctg tcattggacc ggattggaaa 360 aagagctaag tccgtgcttc tggccagagt gtccagccat ccgcgtcgtt aaagtgcgcc 420 gtggtcccga gtgtccagcc gcatcaattt atgctgtttt gtcatacgac cgtgtgtcgt 480 agaatcgtcg tgaaaatggc taccaagaag agcaattatt ttgttgagaa caaaaacggt 540 gcatgtcgcc tctgtaccga accaaacggt gatagtccgt ttgttagatg tgaggaatgc 600 gatcgatatt ttcacctggc ctgcgccaaa ctatcggcgg taccaaccgc agaagaagaa 660 tggttgtgca taaaatgcca agatatcaag atacaatatc aacaagaaga aaagaaaagt 720 acaccagatc aagccatatt agaactcgct atggtattac agaacaatag tttggaagct 780 agccgccata tcaagaaaac cacattagta aaccttccag aatttgatgg gaaaccacaa 840 gactggccgc atttcaaaaa gacatttgaa gacacaacaa tagaagcagg ttttagcaag 900 ctagaaaact tgaatcgatt gcaacgattt gttaagggag aagccgaaaa agcagtacgt 960 gccttactct tggatccgca gaacgttccg gccatcatgg gtcgtttaga agaacaattt 1020 ggaagagcgg accaagtgta caaagagctg ttgaaagatg tggtgaaaat aaaggtcgaa 1080 ggacaaatga aaataatgga gctttctgac gctcttgata accttgttac aaatattaag 1140 attttaggaa aaagatctta cttgaatgat cctcgactca tcgatgaatt gttggcaaaa 1200 ctttcgatcg ataaacaact gaattgggca cagcacaaag ccagtctaga ggcagcaaat 1260 tctgagataa cgcttgacca gttcaacgat tggatgcgcc aaatatcaaa ggcattgcga 1320 agtctgccaa aaagaacaga 
tcgaccacag aacaaagtaa atgttcacaa atatcacaat 1380 cacaaaccac cacgaaattc ccagcaaatg aacattgctc aaaacttgca cacagtttgt 1440 cgcttgtgca aaggacaaca tacgttgcca gagtgtgaaa tctttttaca gaaaaacatt 1500 agtgaccgaa ttgcattcac cacacaacac catctgtgtt tcacatgtct cctctccagc 1560 gagcatgtcc aaagaaactg cccattgtcc aaagtatgta gtgtgcaggg atgtattcgg 1620 cggcatcatc ctctgttaca tgaagtgccc cagccaatat attatcacta taacgacgta 1680 cgtgtctact accagatcgt tccgataacc gtacgcaatg ggtcagtggc tatgaaaacg 1740 tttgctttct tggacgctgg atcatcacta acactcatgg aggaagattt ggcaaacaag 1800 ttaggtttac aaggccggca agatccatta acaatgacgt ggacccaaaa tctaacggta 1860 gaacaagaca ccagtcgtcg agtgcaactt acgattgtaa atgatcaagg aaaagaattc 1920 ctattgaaag aggtgcgaac cgttaaaaat ctacaactac cacaacaaac aatagacaca 1980 aatagattgc ttgctcttta cccacacctg aaaggcatta aaataaaatc gtttacaaac 2040 gcttacccta ccatcctcat aggcttaagt cacagccacc ttatcatgcc attagaccgc 2100 agaatgggcc gtccagaaga accgatggcc atcaaaacaa aactcggatg gataatattc 2160 ggaaatgagt acacacaacc ggcaacaaga tcggatcatt tcatggtaca caaaaacgag 2220 gaaataatga atagaatgat ccaacaatac ttcagtacgg aagattttgg agtgaaagtg 2280 acaaaacccc ttgtgcccaa agagatagag caagcgaata aaattctaga aaagacgcta 2340 gagaaaaagg atggatatta tcaagtggga ctgttgtgga aaccagaagt ggcacacttc 2400 ccaaatagct atccaaacgc actgaaacgc ctagtaagtt tggaaaagca gttgaataag 2460 aatagagagc tgcaagtctg ggcaaagaac acatttgcag attacataca gaagggttat 2520 ctgagaaagt tgacgccaag ccaggttgct attgcaacac caaaaacttt ctatctaccg 2580 catttcgttg tagtcaaccg aaacaaaccg ataccgaaac cacgattggt gtttgacgcg 2640 gctgctgaag taaatcacat ttcgctgaac tcccaattac tatctggccc tgacgaaatg 2700 gcttcacttt tcggtgtctt gctacgtttt cgagaaggca acatctgcgt cacaggggac 2760 atccaagaga tgttccaccg agtaagaata cgggaagaag atcaggactc tcagcgtatt 2820 ctgtggcgag attgtgaaaa ccggtgtccc gatgtgtacg tcatgcaagt catgacattt 2880 ggcgcgactt gttcgcctgc atgcgcacaa gtagtaaaga ataccaacgc agctgcccac 2940 gccgaaacac accctctggc agtagaccca atagttcgac aacattatgt tgacgactat 3000 ctcgatagtt ttttcacaat ggaagatgca 
attgaaacag taagacaagt aattgaggta 3060 cacaaagctg ctgattttca cataagaaat tttacctcaa accgggggga gttactgaaa 3120 acaattcctc aagatcgtgt acaacaaaaa acaacttcag tacaaatcga agaaaagggc 3180 caagactacg agaagatact tggtgttcat tggaacccaa cgatcgatta ttttgggttt 3240 caagtcaaaa tgaacaaagt aacaaaagac aggccaacaa tgagagaagt tcttagtttt 3300 gtcatgagtg tttacgatcc gctgggctta ataagccacg taacaatagc tggcagaatt 3360 ttaatgagag aacttcatgt tataacgaag gattgggact ccaccattcc agatgaatta 3420 caagagaaat gggaagaatg tctacgcgta gtaaagagtg ccgaaaaaat acgaattccg 3480 aggcaattgg taataacctt gaatgatccg ctagaattgc atactttcgt agacgcttct 3540 gagaacgcat ttgctgcttg cgtttacgcc cgcactgtaa ctaaggaagg aacagttttt 3600 attaatcttg tggccggaaa ggcacgcgtg gcgccaatta atccattgtc cattcctagg 3660 ctggagttac aggctgcggt tctgggtgta cgtctaaccg aaagtataag aaaagaactt 3720 cgcctattcg taaaacacat aacatactgg agtgattcgg aagtagtatt gagttggctc 3780 aaaaatcgtc gtaaatacgc acaatttgtt gcacatcgtg tgggcgaaat tttagaatca 3840 tctcgtgccg accaatggag atgggttcca tctcgagaaa atccagcaga cattgcaaca 3900 aaaccatatc caaatgattg gatttgggta aatggacctt cgttcttaag aaacaacgaa 3960 tcagattggc cacacaaaga gatcgtagaa acgaatgaag aatgccgcgt agtagctgta 4020 caccaacacg cagaaacagc tataccagtc gagaattatt cttcgtggcc tagattgtta 4080 aaacatctaa caattctgaa aaagtttgcc gattttatta gaaatcgttc agcatttaca 4140 cgttccattc agcctaatga cgtccaatta gtccgaaatg gaatgtatcg cagtgctcaa 4200 tgggaaggat ttcccgaaga aatggcgact ctcaccaaag ggcaaccagt accaaagcaa 4260 agccctttaa acaagctatc accattttta gactccgatg gcatcatgcg ttcccgtgga 4320 agattggaaa acatcagcac gcttccccaa agtacacgca cgccaatcat actgccacag 4380 aaaccaagat tggtaaaatt gttagtgaga aatttccatg aacgttattt gcatcaggcc 4440 gacaacgtag ttatcggcgc aattcgacaa gaatattgga tagtgaattt acgagcagtg 4500 ctaaagaaca taaagaagtg ttgccaaaag tgcattctta aaaccgctgc accagtagcc 4560 ccctttatgg cacctctccc agaattcaga gctcacccat acacaccacc gttttataat 4620 acaggggtcg actattttgg acctatcgaa gtacaagtaa aaaggtcgtt ggaaaaacga 4680 tggggtgcta tcttctcgtg catgaatacc agagcagttc 
atatagaact agccgagaag 4740 ctagacacag acagttttat ggtatgcctg aagaatttcc aaaatcgccg aggaaaaata 4800 tgcaacatgt atagcgataa cggaaaaaac tttgttggtg cggagcgcga attaagagaa 4860 ttagtaacag aaatcgacaa gcgcatggga catgaatcgg cgctcaagta cgaaataaaa 4920 tggcacttca atcccccttc cgccccccat tttggtggag tatgggagcg gcaaatacaa 4980 aacatcaaaa aaggattgcg gcacatgttt tccgaatgga gtcaccgaca cccgactcca 5040 gagacacttc gtgccacttt aattgaaatc gaggcaatgc ttaattcccg gcccttaaca 5100 cacataccgc tggaaaatga agaagacgaa atactcacac catttcattt tcttattgga 5160 cgaaatggca gccatatacc accagtagtg aacgaaacct cggcagcaaa caggcaacag 5220 tacaaactca cacagcatta ctcaaagata ttctgggacc gttggaaaaa agaatatctt 5280 ccaacactaa tacgccgaaa taagtggact aaccacgtag aacccataaa ggtgggagac 5340 atagtcgtgc tgttcgatga caacgcacct ccaggaaagt ggatcaaagg aagaattgtg 5400 aaagcgaata tggccccgga tgggcaagta cgatcagtgg aagtaaaagt gggagaaaac 5460 atattaaaac ggccagctgt ccgagttgcc gtattaaatg tagagcaaaa ggaacatttg 5520 cttctccaca caacatctca gcaacaccag ggcacaaatc gaataagacc gcttcagcct 5580 aacaacgaca acaatgaaac gaagcctcct cgtaagaaaa caaagatcgc cccatgtcat 5640 tgggcaaaac agttattgga aaataaccca ccaagcactt cgaccaacta agcaagaagc 5700 gtaatccccg cctgtgaatt acgtggggga gaa 5733 // ID GYPSY42-LTR_AG repbase; DNA; ANG; 251 BP. XX AC . XX DT 16-APR-2004 (Rel. 9.03, Created) DT 16-APR-2004 (Rel. 9.03, Last updated, Version 1) XX DE GYPSY42-LTR_AG is an LTR of retrotransposon GYPSY42_AG - a DE consensus sequence. XX KW Gypsy; LTR Retrotransposon; Transposable Element; KW 4-bp TSD GYPSY42_AG; GYPSY lineage; GYPSY42-I_AG; GYPSY42-LTR_AG; KW Gypsy clade. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-251 RA Tubio M.J., Costas C.J. and Naveira F.H.; RT "GYPSY42_AG, a member of the Gypsy lineage of the Ty3/gypsy group RT of LTR retrotransposons in Anopheles gambiae."; RL Repbase Reports 4(3), 77-77 (2004). 
XX DR [1] (Consensus) XX CC GYPSY42-LTR is a long terminal repeat of GYPSY42_AG (its internal CC portion is deposited as GYPSY42-I_AG). XX SQ Sequence 251 BP; 66 A; 33 C; 55 G; 97 T; 0 other; aattatgtgg tataaccatt attaaccttg gtactggtat aacggtgata cgcgtggctg 60 tagtggcagc tttgtgacgg tgacattcga tcgagagacg atctgataac tctctttttc 120 cgttatctcc cgagcagtac agacgtctag gacgagatag gttttttttc ttattttttt 180 atagttagga gttaggattt aggttatgtt agaataagtt agtttatttt tttgtaaata 240 aaaacttaat t 251 // ID GYPSY9-I_AG repbase; DNA; ANG; 5973 BP. XX AC . XX DT 03-OCT-2003 (Rel. 8.09, Created) DT 21-JUL-2005 (Rel. 10.08, Last updated, Version 2) XX DE GYPSY9-I_AG is an internal portion of retrotransposon GYPSY9_AG - DE a consensus sequence. XX KW Gypsy; LTR Retrotransposon; Transposable Element; 4-bp TSD gag; KW AP protease; GYPSY9-I_AG; GYPSY9-LTR_AG; Gypsy clade; RNase-H; KW integrase GYPSY9_AG; mdg1 lineage; reverse transcriptase. XX NM GYPSY9-I_AG. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-5973 RA Tubio M.J., Costas C.J. and Naveira F.H.; RT "GYPSY9_AG, a member of the mdg1 lineage of the Ty3/gypsy group RT of LTR retrotransposons."; RL Repbase Reports 3(9), 177-177 (2003). XX DR [1] (Consensus) XX CC GYPSY9_AG is a family of Gypsy-like LTR retrotransposons CC that, according to the amino acid sequence of its ORF2, is CC phylogenetically grouped with Drosophila representatives CC of the mdg1 lineage. CC GYPSY8_AG, GYPSY10_AG, GYPSY11_AG, GYPSY12_AG, GYPSY13_AG, CC GYPSY14_AG, CC GYPSY15_AG, GYPSY16_AG, and GYPSY17_AG are other CC members of this same lineage in Anopheles gambiae. CC The GYPSY9-I_AG consensus was reconstructed after multiple CC alignment of 12 copies. CC The consensus encodes the 458-aa GYPSY9_AG1P gag-like protein CC (pos. 890-2263) and the 1175-aa GYPSY9_AG2P (pos. 2215-5739). 
CC The sequence of the LTRs flanking GYPSY9-I_AG is deposited as CC GYPSY9-LTR_AG. XX FH Key Location/Qualifiers FT CDS 890..2263 FT /product="GYPSY9_AG1p" FT /note="gag-like protein" FT /translation="KQYEKGMITISDKIEILKYIFKKIQDPNQRTCTRTQL FT RIKAEETFKEIQNDIEKNRYKYTFNKLLEFSKISNALIQNIIAMSTSKTND FT DSRNKSSDFSTNSSEDKNINLTLLTTSRLSFKLLAQIIFVLLKLHKKIKMA FT NFDIKTATGLVPTYDGSPDTFNAFEDASTLLFELNPNHEEMLVKFIRTRLT FT GKARIGLPSNITTFNELISDIKRRCEEKTTPDKVIAKLKSIKTRDAQSICN FT EVELLSEKLKVIYLKQGIPEKIANDMAIKTGIDTLKEKVANSETKILLKAG FT TFATITDATQKVMENESEETNRTTNVLNINTQRYNRNYPRNQGNQNRNNQN FT NYGPNTNQPYRNYYNNRNNNNFNNRNNNNSSYNNNLRNNIRYNNRFIQGYN FT NERYNSNFNGNQNRPNKNLRAITNGETGQQQRNIYHTTAQIQEVEENNFLD FT QQESTQTLDQFTH" FT CDS 2215..5739 FT /product="GYPSY9_AG2p" FT /translation="FFRPTGVNSNPRSIYSLELNGVDYIKIRLSIANNETS FT ILLVDTGASISLFKASKLKKNHGPIRSDSISLTGISNTPIYSKGSTTCTIY FT FNNLELEHDFVLVPDEFNIGADGILGRDFYKLYRCSINYHLEMLTFTCQGE FT EIQHNIEEDDGKGFILPIRSEVVRKVYLPNITEDTIVFAQEIQPGVFCGST FT IISKDNQLIKFINTRQRNIYITNAEFKPITEPLANYEVKQVNNKVGEINND FT RLQKLLQKIKVDKIPTTEIYNLKKIVTEYNDIFCVEDDPITTNNFYPQKIE FT LKDNIPTYIPNYKQIYSQADEIQNQVDKMLKNDIIEHSVSPYNSPILLVPK FT KSTDDNKKWRLVVDFRQLNKKVIPDKFPLPRIDTILDQLGRAKYFSTLDLM FT SGFHQIPLENDSRKFTAFSTGSGHYQFKRMPFGLNISPNSFQRMMAIAMAG FT LTPELAFVYIDDIIVTGCSARQHISNIVKVFDRLRSYNLKLNPEKCSFFKT FT EVTYLGHKITDKGIYPDDSKFETIRNFPLPKNADEVRRFVAFCNYYRKFVQ FT NFAKIAKPLNNLIKKNVKFIWTDECQKAFNSLKESLLSPAVLKYPDFTKEF FT ILTTDASDVACGAVISQITDGKDHPIAYASKSFTQGEKNKPIIEKELTAIH FT WAINYFKPYLYGRKFTVKTDHRPLVYLFGMKNPTSKLTRMRLDLEEFDFKI FT EFLAGKTNVVADALSRIITDSDELKASIPKNKSDISNPILMVNTRAMIKKN FT NVVENKENKKKKEDIIEEYNPTNMYETDRPSDTTKMLKMKSNINEKDEFIE FT LVIYNHNYYKALGKFKIPITSAKQSQTLEFVLHESCLIARKYHKNLAIASN FT DRLFEFYSMSTIKEIINKMITDVHIVVFTPPRLVEDTEEQIKIMSNFHTTP FT SGGHIGQFKLYSKIKDKFKWKNMKADIIKYVKSCKACATNKILKHTKEKTV FT VTTTPSKPFNIITIDTVGPLPKTANNNRYAVTIQCALSKYIVIIPIQNKEA FT NTIAKALVENFILTFGNFLEMRSDQGLEYNNEILAQISKILEIKQTFAAAY FT HPQTIGSLERNHRSLNEYLRSYTNEHHDDWDQWTKFYEFVYNTSVHSTTNF FT 
TPFELIFGRQANLPQELYKTKVELLYNIEQYYNEMKFKLQKSHEIAKQKLI FT QAKLQRQAKLNENINK" XX SQ Sequence 5973 BP; 2513 A; 996 C; 950 G; 1514 T; 0 other; tggcgaccgt gaccttaatc tgcaatcgac aggcagaagt gtaagcaaat tatttcaagt 60 aaaatgtgat acgtacaacg gtaacgtaaa gcttaagtgt cgtgtgacga accaaaagta 120 ttgtgaccat aaccggcatt cggcggaaag tgaaaaaaaa agtgtcaaat gtgcaatttt 180 tttttacgga acattcatca acgcagctgg cgcaagtaac ccatatgtaa acaaaagcaa 240 taacgatgaa agaacagcta caagcaagtt tcgtacagtg cagaatgaaa aataaaaaca 300 tttagtgcag tgtaagtgcg aaatgaatgt ttaattgatg aacaaacatt aaccgtacac 360 agtgttgtga gaacctacct aaatgtgtaa aaacgtatta acgctcatgc gatcggttta 420 cctagtagag taaccttgaa atgggtaaac tatagaaaaa agcacacttt gttaaaaagg 480 gttgttgtta cccaagttca gcacagtaca gtatagcaat ctatacgatt cttggagcaa 540 gtgatatgga atgataaagt gcagtgcaaa acaagctgtg gtcagtgtga ttgaatggaa 600 caaccgttcc caccgtggaa gacgaaaaga aaaaccgagc ccactgccta atctggattg 660 gcagacgaga atgaacaagt gtcccaaccc cgacaagaca tcgagcgaca actgatgaca 720 tcgtggaaat tcacaccaag caccgattaa actgcaagca ccggtataaa tacagcgtag 780 aatcagcagc aacaaaatgc acgtcgcata ttccattaag gtactgtaac tttaaaatat 840 ggtcaaatgg gttttcaaag ctaggaaata ggaaatagga agtaattgaa aacaatatga 900 aaaaggaatg attactataa gcgataaaat cgaaattttg aaatatatat ttaaaaaaat 960 acaggatcct aatcaaagaa cttgtacaag aacacaacta agaattaaag ctgaagaaac 1020 atttaaagaa attcaaaatg acatcgaaaa aaatagatac aaatatactt ttaacaaact 1080 attagaattt agcaaaattt ctaatgcatt aatacaaaat atcattgcta tgagcacatc 1140 aaaaaccaac gacgattcac gtaacaagtc atctgatttt tctacaaata gctcagagga 1200 taaaaatatc aatttaacac tgttaacaac tagtagatta tcatttaaac tcttagctca 1260 gatcatattt gtcttactaa agctgcataa aaaaattaaa atggctaatt ttgatattaa 1320 aacagccaca ggtcttgtac ccacatatga cgggtcaccc gataccttta atgcatttga 1380 ggacgcttca acgctattgt tcgagttaaa tccaaaccat gaagaaatgc tcgtaaaatt 1440 cataagaacc agactgactg gtaaggctag aatagggtta cctagcaata taacgacttt 1500 taatgaatta atcagtgata taaagagaag atgtgaagaa aaaacaacac ctgataaagt 1560 aatagcaaaa ctaaaatcca taaaaacaag 
ggatgcacaa tcaatttgca atgaggtcga 1620 actactttca gagaaactga aggtaattta tcttaaacaa ggaattccag aaaaaatagc 1680 taacgatatg gcaataaaaa caggtatcga cacacttaaa gaaaaggtag caaactctga 1740 aactaaaata ctgttgaagg caggtacttt tgcgacaata accgatgcaa cacagaaagt 1800 aatggagaat gaaagtgaag aaacaaacag aactactaat gtccttaaca taaatacaca 1860 acgctacaat agaaactatc ctagaaacca aggtaaccag aataggaaca atcaaaacaa 1920 ctatggtcca aatacaaacc aaccataccg aaactactat aataacagaa ataataacaa 1980 cttcaacaat cgaaataata acaatagcag ctacaacaac aatctgcgaa acaatattcg 2040 ctataacaac agatttattc agggatacaa taatgaaaga tacaactcca acttcaatgg 2100 taatcagaac agaccaaaca aaaacctaag ggcaataaca aatggagaaa cagggcaaca 2160 acaacgaaac atataccata ctacagctca gatacaggaa gtagaggaaa ataatttttt 2220 agaccaacag gagtcaactc aaaccctaga tcaatttact cattagagct aaatggagtt 2280 gattatataa agattagatt gagcattgca aataatgaaa catcaatatt actagttgat 2340 acaggagcat caatttcttt attcaaagca agcaagttaa agaaaaatca tggcccaata 2400 cgttcagact caatatcatt aacagggata tctaacacac caatttattc aaaaggaagt 2460 acaacatgca ctatatattt caacaatcta gaattggaac acgattttgt attagttcca 2520 gatgaattca acataggagc agatggtata ttaggcagag acttttacaa actatacaga 2580 tgttcaatta attatcattt ggaaatgctt acattcacgt gtcagggaga agaaattcag 2640 cataacattg aagaagatga tggaaaagga tttattttac caatcagaag tgaagtggta 2700 cgtaaagtat atcttccaaa cattacagag gataccatag tcttcgcaca agaaattcaa 2760 ccaggagtat tctgtggcag cactattatt tcaaaggata atcagttaat taaattcatc 2820 aatacaaggc agagaaatat ttacataaca aacgcagaat ttaaacccat tacagaacca 2880 ttagcaaatt atgaagtgaa acaagtgaat aacaaagtag gagaaattaa taatgataga 2940 ttacaaaaac tattacaaaa aattaaagta gataaaatcc ccactacgga aatttataat 3000 ttgaagaaaa tcgtaactga atataacgat atattttgcg tagaggatga tcccattact 3060 acaaataact tttaccctca gaaaattgaa ttaaaggata atattcctac gtatataccg 3120 aattataaac aaatatactc acaggctgat gaaatacaaa atcaggtaga caaaatgctt 3180 aagaatgaca taattgaaca ctcagtctcc ccatataact cacctatatt attggttcca 3240 aagaaatcaa cggatgataa taaaaaatgg agacttgtag 
tagatttcag gcagctaaac 3300 aaaaaggtta taccagataa atttccatta ccaagaatag atacaatatt agatcagcta 3360 gggagagcaa aatattttag cacccttgat ctcatgtcag gattccatca aataccttta 3420 gaaaatgatt caagaaaatt tacagctttt tcaaccggat cagggcatta ccaatttaaa 3480 cgtatgccgt tcggtttaaa cattagcccc aacagctttc aacgcatgat ggctatcgct 3540 atggcaggat taactcctga gctagcattt gtatatatag acgatataat cgttactggc 3600 tgcagtgcac ggcagcacat cagtaacata gttaaggttt ttgataggtt aaggtcttac 3660 aatttaaaat taaatccaga aaaatgttca tttttcaaaa cagaagttac ttatttaggt 3720 cataagataa cagataaggg tatataccca gacgattcta agtttgaaac aattagaaac 3780 ttccctttac ctaaaaatgc agatgaagta cgaagatttg ttgcattttg taattattat 3840 cgtaaatttg tacagaattt tgcaaaaatt gctaaacctt taaataacct aattaagaaa 3900 aatgtcaagt ttatttggac agatgaatgc caaaaagcat ttaatagttt aaaagaaagc 3960 cttttgtctc cagcagtctt gaagtatccg gattttacta aagaattcat actaacaact 4020 gatgcttcag atgtagcatg tggagcagtt atttctcaaa tcacagatgg aaaagatcac 4080 ccaatcgcgt atgcaagcaa aagcttcacg caaggagaaa agaataagcc tatcatagaa 4140 aaggaactta cagcaattca ctgggcaatc aattatttca aaccatatct atacggcaga 4200 aaattcaccg taaaaacaga tcatagacca ttagtatacc tgtttggtat gaaaaaccca 4260 acatctaaac taacaagaat gagactagat ttagaagaat ttgattttaa aatagaattt 4320 ttagcaggta aaaccaacgt agtagcagat gctttatcca gaattataac tgattctgat 4380 gagcttaaag catcaattcc aaaaaataaa tcagatatat caaatcctat tttaatggta 4440 aatactagag ctatgataaa gaaaaacaac gttgtagaaa acaaagaaaa caaaaagaaa 4500 aaggaagata taatagaaga atataatcca acaaacatgt atgaaacaga caggccatct 4560 gatacaacta aaatgttgaa aatgaaatca aacattaacg aaaaagatga atttattgag 4620 ctcgtgatat acaatcacaa ttattataaa gcgctgggaa aatttaaaat acccataact 4680 tctgcgaagc aaagtcaaac actagagttt gtactgcatg aatcatgcct aatcgctaga 4740 aaataccaca agaatttagc aattgcatcg aatgacagat tattcgaatt ttactcaatg 4800 tcgaccataa aagaaataat taacaaaatg ataacagacg ttcacatcgt cgtatttaca 4860 ccacctagac tggtagaaga tacagaagaa cagatcaaaa taatgtccaa tttccacaca 4920 acaccatcag gaggtcatat aggacaattc aagctgtata gtaagataaa 
ggataaattc 4980 aaatggaaaa atatgaaagc ggatatcatc aagtatgtaa aaagttgtaa agcatgcgca 5040 actaataaga tcttaaaaca tactaaagag aaaactgttg tgacgacaac accctctaaa 5100 ccttttaaca tcataacgat tgataccgta ggtcctttac caaaaacagc aaataataat 5160 cgatacgcag ttacgatcca atgcgcatta tcgaaatata tcgtaatcat cccgattcaa 5220 aacaaagaag caaatacaat agcaaaagca ttagtagaaa attttattct tacatttgga 5280 aactttttag aaatgagatc agatcaagga cttgaatata acaatgaaat tttggcacaa 5340 atatcaaaaa tattagaaat caaacaaaca tttgcagcag catatcatcc acaaacaata 5400 ggatctttag agcgaaacca tagaagttta aatgaatatt tacgaagtta caccaatgag 5460 catcatgatg actgggatca atggactaaa ttttatgaat tcgtttataa cacatcagta 5520 catagcacaa ctaatttcac acctttcgaa ttaatatttg gcagacaagc taatttacca 5580 caagaattat ataaaacaaa agtagagcta ttatacaata tagaacaata ttacaatgaa 5640 atgaagttta aattacaaaa atcacatgaa attgcaaaac aaaaacttat tcaagcaaaa 5700 ttacaacgtc aagcaaaatt gaatgaaaat ataaacaaat gaaacatacg aataggagat 5760 tatgtctatt taacaaacga aaatagaagg aaattagatc cagcttatat aggaccattt 5820 acagtcttag aaattacaaa tacaaactgc gttataaaac acaatcaaac aggaaaaacc 5880 acaacagtac ataaaaacag attaaaacat ttttagtgaa taatgcactc tttcgtataa 5940 actcaatcgt acattattca aaaagggtgg agg 5973 // ID L2B-1_AG repbase; DNA; ANG; 5227 BP. XX AC . XX DT 29-OCT-2010 (Rel. 15.1, Created) DT 29-OCT-2010 (Rel. 15.1, Last updated, Version 2) XX DE An L2B clade non-LTR retrotransposon family from Anopheles DE gambiae. XX KW L2B; Non-LTR Retrotransposon; Transposable Element; Ag-L2-2; KW L2B-1_AG. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-5227 RA Biedler J. and Tu Z.; RT "Non-LTR Retrotransposons in the African Malaria Mosquito, RT Anopheles gambiae: Unprecedented Diversity and Evidence of Recent RT Activity."; RL Molecular Biology and Evolution 20(11), 1811-1825 (2003). XX RN [2] RP 1-5227 RA Kojima K.K. 
and Jurka J.; RT "CR1 clade non-LTR retrotransposons from Anopheles gambiae."; RL Direct Submission to Repbase Update (24-SEP-2010). XX DR [2] (Consensus) XX CC [1] Named as Ag-L2-2 CC [2] Consensus update and characterization as L2B. This consensus CC is generated from 7 sequences with >91% identity. XX FH Key Location/Qualifiers FT CDS 364..1710 FT /product="L2B-1_AG_1p" FT /translation="MSCKKCIRPVLISDDPVTCMGICHGQFHRACTRLTKP FT AAKMINDIVNVKYWCDDCLEXEGSGASKSDEMLEIKAILENIKLSIVAIXE FT KIKTSVCEVVSIGITNMQKSSIDAIETKLASSVSDAVNTGISSIEKKLNAK FT VACNMPQKNLVEKSDATSLILQDTEHTDSIGTTEWTTLTRKRKRTNSGRME FT TPRNTKLPNVSKIDNQVKNNTGTLIIIPKIDQTCDKTRSDMRANLDPRIHK FT ITNFRNGKPGQIIVEWASQEGIDHIRKEIQTSLGEKYLTSLPTRKFRIIGL FT SENYKEDEMVDLIKTQNNGFSTAYIKILGKFEKADYKYKKYNVIIEVDHDT FT AIYLDREEKINIGXDRVYIHESFHIMRCFKCGQFGHKSTTCDNEETCSKCS FT GKHKTSDCKSSSESCINCSTQNMQRGLKLNTAHSAFSKECPTFKWLKSRKQ FT QLSK" FT CDS 1747..4515 FT /product="L2B-1_AG_2p" FT /note="apurinic-like endonuclease and reverse FT transcriptase." FT /translation="MLYMNVAGLSSSHIMLNHIVQTIQPWLVFITETHIVE FT AEAFEQFGIPGFTAISCLSNSRHTGGVTIFIKESITYKILINEARDGNWFL FT SISATISKKSFIFSLVYHSPSSSNAHFVNLLEWWFDKFLDLGEDNILIGDF FT NINWLNSAESKPLRQLMESLNMNQIITEETRITQRSKSLIDHVYCNHNSFI FT AKTDPTLKIADHETIVLTLQKCNVSLNNKKTFKSWKRYSKDKICSSVNVNL FT PITNLCLHDKAHRLEQVIKNSIAQLIDVAVVNNNNYQKWYTGDLQCKKQRL FT DKRYFTFLKTNSELDWLSYKTSRNCYLRELRMTKRNYFERKINESKNDSKQ FT LWKLLKSWLQTGSNETGRVQFNDKIIEDESEIARLFNEHFVNSVMDIHKSI FT PQCNEPEQIKNMEQTRKRFHFQPIDIQTLQRICFKLKGASGLDDINGKVIR FT DCLSVTGDTLLNIINDSLITGQFPNTWKESIVIPIPKINGTIIAEEFRPIN FT MLNTLEKILELVVKEQLVQYLKENKLLVPEQSGYRESHSCETALNLVLEKW FT KTNLNKKHITLAVFLDLKRAFETISRPLLINALKNFGIVGQELDWFKNYLE FT GRTQRTRFKTTTSESLENSLGVPQGSVLGPILFIMYINDMKNVLKFCDINL FT FADDTVVFLSCSSVKEATTKLNEDLASLDQWLKYKKLKLNAKKTKMLVLSR FT TKTLFNLQINIDDEAIERVKEIKYLGVTIDEELKFKSHIDNIIKKMAKKCG FT IICRLKKDLSVFAKIQLYKSLMAPHLDFCPSILFMANAGQTTRMQKLQNRA FT MRAILGCGWYTSSLIMLDTLKWMSVKQRVMYQTMIFVYKLLNSHLPQYLCD FT RVVRGLNVHDYNTRRAADPRVPKVETVSAMNSLFFKGIQTFNMMPRVVCNA FT 
ETIPQFKRLCATHIKATIV" XX SQ Sequence 5227 BP; 1904 A; 814 C; 1005 G; 1498 T; 6 other; ttctcgctgg tgtgcgatga tgagtgaggt gtagtttgtg twgctctccc tgttgaagtg 60 tttaaagtgg tgaaaaggtg taaatagtgt ggaaattttt gttacacatg acgactgaat 120 aacaggactt tgatatgtag tgaagttgta tcgagagacg attgaaatac tcttgatttt 180 aattaattta ttttgtgttg tttcagtagc tgtaaacagt ttcaatgtgt tagtgaactg 240 tacgaacgaa ggaggatctc ctgaaatagc gtataacgtg tattagtatt agtagtaagc 300 aaaacgggca tcgatcatta ctaatacast cggtgtgctg atgagtgttt gtaaacactc 360 gtcatgagtt gcaagaaatg tataagaccg gtgttgatta gtgacgaccc ggttacttgt 420 atgggtatat gccatggtca atttcatcga gcttgcacaa ggcttactaa gccggcggct 480 aagatgatta acgacatagt gaacgtgaag tattggtgcg atgattgctt ggaacwagaa 540 ggctcaggtg catcaaaatc cgacgaaatg ctagaaatca aagcgatttt agaaaatatt 600 aaattgtcaa ttgttgctat cgakgagaaa attaaaacga gtgtgtgtga agttgttagc 660 attggtatta ccaatatgca gaaatcttcg atagatgcta ttgaaacgaa attggccagt 720 agtgttagtg atgcggttaa caccggcatc tctagcattg agaaaaaatt aaatgcaaaa 780 gttgcatgca atatgccaca aaaaaacttg gtagaaaaat ctgatgcaac ttcccttatt 840 ttgcaggata ctgaacacac tgattcaata ggtacaacag aatggacaac tctaactcgt 900 aaaagaaaaa ggacaaatag tggaaggatg gaaactccta gaaataccaa acttcccaat 960 gtctcaaaaa tagataatca agtaaaaaac aatactggga ctttaataat tataccaaag 1020 atagatcaaa cttgtgacaa aactagaagc gacatgagag cgaatttgga tccaagaata 1080 cataaaataa caaatttccg taatggtaag cctggtcaaa ttattgttga atgggcttca 1140 caagaaggga ttgaccatat aagaaaggaa atacagacat ctttgggtga gaagtatcta 1200 acttccttgc caactagaaa gttccgtatt atcggtttgt cggaaaatta caaagaagac 1260 gaaatggtag atctcatcaa aacscaaaat aatggctttt caacagcgta catcaaaatt 1320 ttgggaaaat ttgaaaaagc tgactataaa tacaaaaagt ataatgtcat aattgaggtg 1380 gatcacgaca cagcaattta cttagacaga gaagaaaaga ttaatattgg attkgataga 1440 gtctacatac acgaatcttt tcatatcatg agatgtttta agtgcggaca atttggtcat 1500 aaaagcacaa catgtgataa tgaagaaact tgttccaaat gtagtggtaa acataaaaca 1560 tcagattgta agtcgtcatc tgaaagctgc attaactgtt ccacccaaaa tatgcaaaga 1620 ggattaaaac 
tgaacactgc tcattcagcc ttcagtaaag aatgtcccac gtttaaatgg 1680 ttaaaatcaa gaaaacagca attgagtaag tagcaattgc agaataaaca aagtaagaac 1740 atacagatgt tatacatgaa tgtggctgga ttgtcgtcta gtcacataat gttgaatcat 1800 attgtacaaa caattcaacc ttggttagtg tttataaccg aaacacacat tgtggaagct 1860 gaagctttcg aacagtttgg aattcctgga ttcacggcaa tttcatgctt atccaattct 1920 cgacatactg gaggtgtgac tatatttata aaagagtcga taacttacaa aattttaatt 1980 aatgaagctc gcgatggaaa ctggttttta tcaatttcag cgacaataag taagaaatca 2040 ttcatattta gtctagtgta ccattcacca agttctagca atgcccattt tgttaattta 2100 ctcgaatggt ggtttgataa atttcttgat cttggagaag acaacattct aataggagat 2160 tttaacatca attggttgaa ctccgcagaa tctaaaccac tgagacagct aatggaatct 2220 ttgaacatga atcagattat tactgaggaa actcgaataa ctcagcgaag taaatctcta 2280 atagatcatg tatactgcaa tcataactca ttcatagcga agactgatcc tacattgaaa 2340 atagcagacc atgaaacaat tgttctaact ttacaaaaat gcaacgtttc attaaacaat 2400 aaaaaaacat ttaaaagttg gaaaagatac agtaaagaca aaatatgttc atctgtgaat 2460 gttaatctac caatcacaaa tttatgtttg catgataaag cacaccggct agaacaagta 2520 ataaaaaata gtatagctca actcattgat gttgctgtag taaataataa caattatcaa 2580 aagtggtaca ccggggattt acaatgtaaa aagcagcgct tggacaaacg ctacttcaca 2640 tttttaaaaa ccaattcaga gttagactgg ttatcttata aaaccagtag aaattgctac 2700 ctacgtgagc tcagaatgac gaaaagaaat tattttgagc gaaaaataaa tgaatcaaaa 2760 aatgatagca aacaattgtg gaagttgtta aaaagttggt tgcaaacagg ttccaatgaa 2820 acaggtaggg tgcaatttaa tgacaagata atcgaagatg aatcagaaat tgcccggttg 2880 tttaatgaac actttgtcaa tagtgttatg gacatacata aaagcattcc tcaatgcaat 2940 gaaccagagc aaattaaaaa tatggagcaa acaagaaaac gatttcactt tcagccgatt 3000 gatattcaaa cactccaaag aatctgtttc aaattgaaag gagcttctgg acttgatgat 3060 ataaatggta aagtgatacg agattgtctc agtgtcacag gggacacgtt acttaacatt 3120 attaacgatt ctttgattac gggacaattc ccaaatactt ggaaagaatc aatagtgatt 3180 cccatcccta agattaatgg tacaatcatt gcagaagagt tcagaccgat aaatatgcta 3240 aacaccctag aaaaaatttt agaactcgtg gtgaaggagc aattggttca atacttaaag 3300 gaaaataaat tattagttcc 
agaacaatca ggatatagag aatctcattc ctgtgaaaca 3360 gctctgaacc ttgtacttga gaagtggaaa accaatttaa acaaaaaaca tataacatta 3420 gctgtctttt tagatttgaa aagagccttt gaaacaatat cgagaccttt gttaatcaat 3480 gccttaaaaa acttcggtat tgtgggacaa gaacttgatt ggtttaaaaa ttatttagaa 3540 gggagaactc agagaactcg ttttaaaacc acaacatcgg aatctcttga aaactcattg 3600 ggagtccccc aaggaagtgt acttggtcct attttgttta ttatgtatat aaatgacatg 3660 aaaaatgtat taaaattttg tgatattaat ctctttgctg acgacacggt ggtctttctc 3720 tcttgtagct ctgtaaaaga agcaacaacg aagctgaatg aggatctggc ctcgctagat 3780 caatggttga agtacaaaaa attgaaatta aatgcaaaaa aaactaaaat gttagtgctg 3840 tcccgtacca agactctctt taacttacag atcaatattg atgacgaggc catagagaga 3900 gttaaagaaa taaaatatct tggagtcaca atagatgaag aactgaaatt taaaagccat 3960 attgataata tcattaagaa aatggcgaaa aaatgcggga ttatttgtcg actaaaaaaa 4020 gacttaagtg tttttgcaaa gatacagctt tacaagtctc ttatggctcc acatctggat 4080 ttttgtccgt caattctttt tatggctaat gcaggacaaa cgacaagaat gcaaaaactg 4140 caaaacagag caatgcgagc tattctagga tgcggatggt atacgtcgtc attgattatg 4200 ctggataccc taaaatggat gtcggtgaaa cagcgggtaa tgtaccaaac aatgattttt 4260 gtttataaac tcctgaacag tcacttgcct caatatcttt gtgatcgggt tgttcgagga 4320 ttaaatgtgc acgactacaa cacaagaaga gcagcagatc cgagagttcc taaggttgag 4380 acagtatcag caatgaactc tttgtttttc aaaggcatac aaacttttaa tatgatgccc 4440 agagtagttt gtaatgcaga gactataccc caatttaaga gactgtgtgc aactcacatc 4500 aaggccacta ttgtatgaag tatgactgtt gtgacaatag taaggaagta gattaattta 4560 gaaaagacga aatccgcgaa tgcgtgaatg aaatataaat tgataaataa atgttttagt 4620 cacttgttgt gtaaacctca cttgtcatca ccagctttga tgatgatgat gattttagct 4680 ttttttttgt tttttgcttt tatctttttc tatatatctt tttaaaatta aatttaaaaa 4740 agttaaaata aataaacaag ttaacacaaa aacaaaaata aaggaaaaga atgagtcgat 4800 atagcattga gacacgcgcg cgaaaaatag tatatataat ttgataactt tcctgggggt 4860 agtggtccgg tccaatgcaa cccgaaacgt gaatcgtcat aaattgtaca gtaacggaat 4920 aaggaaaaag tacgctgtga gatagttttg aaagtggttc tctatcgcca ttagttggtt 4980 gtaggttgaa ccggctatac aagagaaacg 
aaagcaggta ccatgtattg agaccgacca 5040 tgactgcttc ataaagagga gaagaaagaa gttttgtagc caccacgttc ttgaaatact 5100 tacttgcctt atgatttgat tatttgcata attactaatt tacaacaaca caattatcga 5160 taagatacaa tcgttcctct caatactata tcggggtaag aggtgggact catcatcatc 5220 atcatca 5227 // ID GYPSY39-I_AG repbase; DNA; ANG; 6337 BP. XX AC . XX DT 16-APR-2004 (Rel. 9.03, Created) DT 16-APR-2004 (Rel. 9.03, Last updated, Version 1) XX DE GYPSY39-I_AG is an internal portion of retrotransposon GYPSY39_AG DE - a consensus sequence. XX KW Gypsy; LTR Retrotransposon; Transposable Element; 4-bp TSD gag; KW AP protease; GYPSY lineage; GYPSY39-I_AG; GYPSY39-LTR_AG; KW Gypsy clade; RNase-H; reverse transcriptase; KW integrase GYPSY39_AG. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-6337 RA Tubio M.J., Costas C.J. and Naveira F.H.; RT "GYPSY39_AG, a member of the Gypsy lineage of the Ty3/gypsy group RT of LTR retrotransposons in Anopheles gambiae."; RL Repbase Reports 4(3), 70-70 (2004). XX DR [1] (Consensus) XX CC GYPSY39_AG is a family of Gypsy-like LTR retrotransposons CC that, according to the amino acid sequence of its Reverse CC Transcriptase, CC RNase and Integrase is CC phylogenetically grouped with representatives of the GYPSY CC lineage of other organisms. CC GYPSY40_AG, GYPSY41_AG, GYPSY42_AG, GYPSY43_AG, GYPSY44_AG, CC GYPSY45_AG, CC GYPSY46_AG and GYPSY47_AG are other CC members of this same lineage in Anopheles gambiae. CC The GYPSY39-I_AG consensus was reconstructed after multiple CC alignment of 4 copies. CC The consensus encodes the 359-aa GYPSY39_AG1p gag-like CC polyprotein (pos. 574-1650) and the 1168-aa GYPSY39_AG2p CC pol-like polyprotein (pos. 1605-5108). CC The sequence of the LTRs flanking GYPSY39-I_AG is deposited as CC GYPSY39-LTR_AG. 
XX FH Key Location/Qualifiers FT CDS 574..1650 FT /product="GYPSY39_AG1p" FT /translation="MPKYTPFPKVRSLNFTKDKKIPVKMEQNIHALIESMR FT SLEERVNTLQAENDSLQRFQQQQQPTTSSRSADYFRIPDPIRVIPGFDGNK FT KQLIGWLTTVKKTLDLFKNNVAPEIFSVYEQTVINKIEGKARDTICVNGNP FT TTFDEVADILQNVYGDRNNIATYQTQLWNLKQNESLSLHYRKTKEILINMK FT SVARQNTVYASHWEAINLFLEQECLAAFINGLSKTYFGYAQTAQPEDLESA FT YAFLCRFQNAERTKSNTNNGFEKHPSVDKNVKRDNKFSQPKHLENRPFNNM FT KPSSDRTKIVPMDVDQSLRSNMRNKIFSHTAEEDDGKTISEESDSDEETDN FT EEDVNFQITANLNPPS" FT CDS 1605..5108 FT /product="GYPSY39_AG2p" FT /translation="RRCKFSNYCKLKPTELNAINGLPFYTTKLNGSNIKLL FT VDSGANKNFLNPELVPLTQRVKCETITIKNKNGKFKSQECTFITILDKKLK FT FYLFKCHNYFDGILGYESLSDMETLIDTKNHKLILPDRTIDLEIRTLEPPS FT ITLNANNSQIIKLPVSQTKGNIFLPKDIQIKDAIIPSGVYKAKQGYAKVLA FT RNYGDTLTFHWTKPAKTYELDKMIEINHMSTHPYNDHPQNYIANLENLIRT FT DHLNKEEKHKLFEILKQNLGIIHQEKEKLSCTTAIRHRIKTKDDIPIHTKT FT YRYPNIHKAEVNRQIEEMIADGIIQHSISPWTSPIWIVPKKSDASGKEKWR FT IVVDYRKLNEKTIDDRYPIPNIEEILDKLGRSMYFTTLDLKSGFHQIEVDK FT KDRPKTAFSTEKGHFEFIRMPFGLKNAPATFQRAMNNILGDLVGRNCLVYL FT DDIIIFGKSLQQHLDNLNKVLKKLIESNLKVQLDKCEFLRKECEFLGHIVT FT QDGIKPNPNKIEKILHWPIPKTTTHIKGFLGILGYYRKFIKDFSKLTKPLT FT KCLKKGSKITHNDEFINCFNDCKQMLTTDPILKYPDFSRKFILETDASDFA FT LGAVLSQKFEDGKEHPIAYASRTLNETECNYSATEKELLAIVWATKHFRPY FT IFGTVFEIRTDHKPLVWLRQKNDLNRRLLHWKLALEEFEFEIKYKKGTLNG FT NADALSRITENPALTSELNANTASNNTDTMTQHSADTNDDEFIPATEKPLN FT EFRNQIILEENNNDSREAITLFGNYHRITIKTPLFGPAMLINIIKEYASPK FT CVTGICCSEKILQNLQIIYKNYFSRAKSIKLFWTKTILEDVTDEIDQDLII FT ERQHNANHRGIIETKLHIARKYYFPNMKTKITKFINICKICQKAKYERRPY FT KQKYQITATPKKPLEIVHMDIFIISDKHFLTLCDKFSRLTMAIPLQSRNAI FT HILKALTQFFANVGRPCLLVMDQECSFKSTTIKQFLEENNVEFHYTSVGQS FT SSNGTIEIVHRTIRELHNIISLRDSTKNLSTTSKINLAVSIYNDSIHSQTN FT ISPKELFFGFKNEDPIPEDLQERIKQKEELYRVYATKQKEKKSKYIEKLNK FT TREEPELFKDNDTIFERKRNNLKHEERYRETKVFDNLVTNILDRNGRKIHK FT SKLKRKRKT" XX SQ Sequence 6337 BP; 2560 A; 1250 C; 970 G; 1557 T; 0 other; ggcgcagtcg gctaggccca attcacgttc agtgacaagt gttcaagtac agtgcagtga 60 aataacggaa ataaatctgt gtatcgcccg gactttgctc catcgcccga 
tccgtggttg 120 actcgtgacg atcgcgtgaa tttcattatt tcataaagtg taagaacaag tgcacgaacg 180 gttggtttca actatttaga attggaagag aacattcgaa caaaagctag tggcaagact 240 ccccagccag cgttactggg taaactaatc aacgcaggac aatcgctgac ccaacccgtg 300 gacaacgagt gaccaaagaa gaaaaggaag gacaatcgct gacccaatcc gtggacaacg 360 agcgaccaag gaagaaaagg aaggacaatc gctgacccaa tccgtggaca acgagcgacc 420 gaagaagaag aggaaggaca aagcaaagga caggtaacgg aaaactctaa tttcccgaca 480 cgtggataaa aaaggtgagt tacatcctaa agcataagtt cccctactct tataataaaa 540 aggcggtagg caacaaacaa tttttatcat tgaatgccaa aatatacccc tttcccaaaa 600 gttcgttctt tgaacttcac taaggacaag aaaattccag tgaaaatgga acaaaatata 660 cacgcattga ttgaatctat gcgctcgtta gaggaacgcg ttaacacact ccaagcggaa 720 aacgatagtt tacaacgctt ccaacaacaa caacagccaa cgacctcgag tcgcagtgct 780 gactattttc gtatcccgga tccaattagg gtcattcccg gattcgacgg aaacaaaaaa 840 cagttaatcg gttggttaac aacagttaaa aaaacactag acctcttcaa aaacaatgtg 900 gcaccagaaa tattcagtgt ctacgaacag accgtaatta ataagataga gggcaaagca 960 cgtgacacta tttgtgtgaa cggaaaccct actacttttg atgaagtcgc agatattttg 1020 cagaacgttt atggcgatcg aaacaacatc gcaacgtatc aaacacaact gtggaaccta 1080 aaacaaaacg aatctttgag ccttcactat agaaaaacaa aagaaatttt gatcaatatg 1140 aaatcagtag caagacagaa tacggtatac gcttcacatt gggaggcaat taacctgttt 1200 ttagaacaag aatgcctcgc tgcgttcata aatggtctga gcaaaacata ttttggatat 1260 gcacaaaccg cacaaccaga agacttggaa tcagcttatg cctttctatg cagattccaa 1320 aacgctgaaa gaaccaaatc aaacaccaat aatgggttcg aaaaacatcc tagtgtcgac 1380 aaaaatgtaa aaagggataa taaattttca caacctaaac accttgaaaa tagaccattc 1440 aataacatga aaccatcttc agatcgtaca aaaattgttc ccatggacgt cgatcaatcc 1500 ttacggtcca atatgagaaa caaaattttc tcgcatacag ccgaagaaga tgacggaaaa 1560 acaatatcag aagaatccga ttcagacgaa gaaaccgata atgaagaaga tgtaaatttt 1620 caaattactg caaacttaaa cccaccgagt taaacgcaat taacggacta cccttttaca 1680 caacaaaatt aaatggttca aacatcaaac tcttagtaga cagcggagcg aacaaaaact 1740 ttttaaatcc agaattagtt ccactaactc aaagagtgaa atgcgaaaca atcaccataa 1800 aaaataaaaa 
tggtaagttc aaatcacaag aatgcacatt cataaccatt ctagataaaa 1860 aattaaaatt ttaccttttc aaatgtcata attattttga cggaatactc ggatacgaat 1920 cactctccga tatggaaaca ctaattgata caaaaaatca taaacttatt ctacccgatc 1980 gtacaatcga cttagaaata agaacattag aacccccttc aattacatta aatgctaata 2040 attctcaaat tataaagctt cctgtttctc aaacaaaagg aaacattttt cttccaaaag 2100 acattcaaat aaaagatgca ataattcctt caggcgttta taaagcaaaa caaggatatg 2160 ctaaagttct tgccaggaat tacggcgaca cattaacatt tcattggact aaaccagcaa 2220 aaacatatga attagacaaa atgatcgaaa taaatcacat gagcacacac ccatataatg 2280 atcatccgca aaattacata gcaaatctcg aaaatttaat tcgcaccgat catttaaaca 2340 aagaagaaaa acataaactt ttcgaaattc ttaaacaaaa tctaggcatc atccatcaag 2400 aaaaagaaaa actttcatgc accaccgcta ttagacatcg aataaaaact aaagacgata 2460 tacccattca cacaaaaact tatagatacc ccaatattca taaagcagaa gtgaataggc 2520 aaatcgaaga aatgattgct gacggtatta tccaacattc tatatcccca tggacatccc 2580 caatctggat agtccctaaa aaatcagatg caagtggaaa ggagaaatgg cgcatcgtgg 2640 ttgactatag aaaattgaat gaaaaaacga tcgatgatcg ataccctatt cccaatatag 2700 aagaaatttt ggacaagtta ggtagaagca tgtacttcac tacacttgat ttaaaatccg 2760 gctttcatca aattgaagtt gataaaaagg atagaccaaa aacagctttt agcacagaaa 2820 aaggacactt cgagtttata cgaatgccct ttggcttgaa aaacgcacct gctacttttc 2880 aacgagccat gaataatatt ttaggtgacc ttgtaggaag aaactgtctc gtatatctag 2940 atgatattat tatttttggt aaatcactcc aacaacattt ggacaatttg aataaagtgt 3000 tgaaaaaatt aattgaatca aatttaaagg ttcaattaga caaatgtgaa tttttaagaa 3060 aagagtgtga attcttagga cacatcgtaa ctcaagatgg catcaaaccc aatccaaaca 3120 aaatagaaaa aatattgcat tggcccatac caaaaacaac aacccacatt aaaggcttct 3180 taggaatcct tggttattac agaaaattca taaaagattt ttcaaaactg accaaacctc 3240 tcacaaaatg tcttaagaaa ggttctaaaa taacacataa tgacgaattt ataaattgtt 3300 tcaatgattg taaacaaatg ctcactactg atccaatttt aaaatatcct gattttagca 3360 gaaaattcat tttagaaaca gacgcaagtg acttcgccct tggcgctgta ctttcacaaa 3420 aattcgaaga tggtaaggaa cacccaatcg cttacgcttc tagaacgctt aacgaaacag 3480 agtgcaacta ttcagctacc 
gaaaaagagc tactcgccat tgtttgggca accaagcatt 3540 tccgaccgta cattttcgga acagtttttg aaataagaac tgaccacaaa cctcttgttt 3600 ggctaagaca gaaaaatgat ttaaatagaa gacttcttca ttggaaactg gccttggaag 3660 aattcgaatt tgaaattaag tataaaaaag gtactcttaa tggaaatgca gatgcactat 3720 cccgcataac ggaaaaccct gctttaacat ccgaattaaa tgcaaatact gcctctaaca 3780 atactgacac tatgacgcaa cactcagcag atactaatga tgacgaattc atcccagcaa 3840 ctgaaaaacc actaaatgaa tttagaaatc aaataatact agaagaaaat aataatgatt 3900 ctcgagaagc aattacactt ttcggaaatt atcatagaat aacaataaaa acaccccttt 3960 ttggaccagc tatgttgatc aacataatca aagaatatgc atctcccaaa tgtgttactg 4020 gcatttgttg cagcgaaaaa atcttacaaa acttacaaat tatctacaaa aactatttct 4080 ctcgcgcaaa atcgatcaaa ctattttgga caaaaacaat tttagaagat gtaactgatg 4140 aaattgacca ggatttgatt attgaacgac aacataatgc aaaccataga ggaattattg 4200 agaccaaact acatattgct agaaaatatt attttccaaa catgaaaact aaaattacta 4260 aatttattaa catttgcaaa atatgtcaaa aagctaaata tgaaagacgc ccatataaac 4320 aaaaatatca aattactgcc actcctaaaa aacctttgga aatcgttcat atggacatct 4380 ttataattag tgataaacat tttctaaccc tttgcgataa gttctcaaga ctaaccatgg 4440 cgattccttt acaatcaaga aacgctatac atatactaaa agcactaact caattcttcg 4500 ctaacgttgg aagaccttgt ttgcttgtga tggaccaaga atgttcgttt aaatccacca 4560 caatcaaaca atttcttgaa gaaaataatg tagaattcca ttatacaagt gttggtcaat 4620 cctcttcaaa tggaacaatt gaaattgtcc atagaacaat aagagaattg cacaacataa 4680 tctcactacg agattctaca aaaaatctat caactacttc aaaaattaat ttggcagtct 4740 caatttacaa tgattctata cactctcaaa caaacatatc ccctaaagaa ctatttttcg 4800 gcttcaaaaa cgaagatcct atccctgagg accttcaaga aagaattaaa caaaaagagg 4860 aactttatag agtgtatgct acaaaacaaa aagaaaaaaa atccaaatac atagaaaaat 4920 taaacaaaac acgagaagaa ccagaactat tcaaagacaa tgacacgatc ttcgaaagga 4980 aaagaaataa tttaaaacac gaagaaaggt atcgtgaaac gaaagttttc gataacctag 5040 ttacaaatat tttagaccga aatggaagaa aaattcataa atcaaaattg aaaaggaaac 5100 gaaaaaccta acaaaccatt taagttcttt ctcatcctta cggaaatcac agctgaatac 5160 tataaaaaac taaaaatgaa atttggaatg 
attcatttcc tcataatagt tttaacaaat 5220 tcaaaactta taacagcaac aatttataga gactaacctt aactatattc agaatttagc 5280 caataatatt agcctaacca accatcttaa ggacacctta gactataaaa tcctatcagc 5340 ttataaaaaa tgaggaaata taagacctag aaaacaaaag agaggcataa taaatgcagg 5400 aaatcgcaga tcagaaaaca cttagaacat tcagaattta gaacattcag aaaacaaaac 5460 catatccaac attaacgctc aagttagaat taattccgaa atagaaaaat cagtgaacaa 5520 aattacacaa actctaaaaa aaaaaaaatc gaaaatcaaa tgaataatca gataaaagat 5580 cagatcagat caaatgaata atcgtaaaaa ctgaaataga acaagtaaat cttatcctga 5640 acattgataa cattgtacaa ctgtagaaga catagaagaa tagacggatg taaaatttgt 5700 aaccaagcat aaaaacacca ttttcaatgt taacgaagag tgcacgattt gcaacaaccc 5760 acaaccgcta aacgacgagt gtatctcaaa tatcatcaga aatcatccat ctaaatgtac 5820 caccaccacg agctccgagc atacgataat acgagaaatg aaaccaggag tcattctaat 5880 cgataccacc cttggagtgc cagtcattga ttcatgttca aacagccaaa taatagcagt 5940 acccacacta attgaaaccg gcaattgcac cgtaaaaatc ttaaatagca catttactgc 6000 gcatattgat gtgttagacc aagaagacta cttcctaccc ttaactggca gcaaaacaga 6060 aataactctt aacaagccaa gtatacaaga tctgcacaca atgcatatta gtaacataca 6120 tcaactccat acaataacac tccgcttacg cacgcataca attgcaggag ggatattaat 6180 tgttgttcta acaggctttc ttataactgc attttgcatc tacaaacaca aaaccaaaca 6240 tgcagagaag aaaatgtcta ccgaagctat tcacaacatc atcccactgc aaacgtcggt 6300 aagtcgatcg aggacgctcg acgtttaagg agggagg 6337 // ID GYPSY20-I_AG repbase; DNA; ANG; 4274 BP. XX AC . XX DT 05-FEB-2004 (Rel. 9.01, Created) DT 05-FEB-2004 (Rel. 9.01, Last updated, Version 1) XX DE GYPSY20-I_AG is an internal portion of retrotransposon GYPSY20_AG DE - a consensus sequence. XX KW Gypsy; LTR Retrotransposon; Transposable Element; 5-bp TSD; KW AP protease; GYPSY20-I_AG; GYPSY20-LTR_AG; GYPSY20_AG; KW Gypsy clade; RNase-H; gag; integrase; mag lineage; KW reverse transcriptase. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. 
XX RN [1] RP 1-4274 RA Tubio M.J., Costas C.J. and Naveira F.H.; RT "GYPSY20_AG, a member of the mag lineage of the Ty3/gypsy group RT of LTR retrotransposons in Anopheles gambiae."; RL Repbase Reports 4(1), 5-5 (2004). XX DR [1] (Consensus) XX CC GYPSY20_AG is a family of Gypsy-like LTR retrotransposons CC that, according to the amino acid sequence of its reverse CC transcriptase, is CC phylogenetically grouped with representatives of the mag CC lineage of other organisms. CC GYPSY18_AG, GYPSY19_AG, GYPSY21_AG, GYPSY22_AG, GYPSY23_AG, CC GYPSY24_AG, CC GYPSY25_AG, GYPSY26_AG, GYPSY27_AG and GYPSY28_AG are other CC members of this same lineage in Anopheles gambiae. CC The GYPSY20-I_AG consensus was reconstructed after multiple CC alignment of 3 copies. CC The consensus encodes the 1378-aa GYPSY20_AGp gag-pol like CC protein CC (pos. 110-4243). CC The sequence of the LTRs flanking GYPSY20-I is deposited as CC GYPSY20-LTR_AG. XX FH Key Location/Qualifiers FT CDS 110..4243 FT /product="GYPSY20_AGp" FT /translation="MALPNPVVDDPAVASPALPAVSGAAVPVTFHFEPFNP FT ASSKFDRWLNRLQISFRIYHVREADKRDFLLHYMGGPTYDVLCNKLKNAEP FT HTKTYDEIVALLKEHYSPTPLEILENFKFASRKQLEQETLSDYLMHLEKLA FT QTCNFGDYMDKALRNQFVFGIQNRVIQSRLLEVRDLTLTKAKEIAFGMEMS FT HRGTDEMHNSRQKSEVQHIEHGANKTKKSFQSSSQASSSQSSGRLSNKQNG FT GNKRCYRCGDPDHYADKCKHKATICKYCKKAGHLERMCLTKTNEKGTDDAH FT HLEEQPCVMKDVLHLNAIQGIAGKFLLSLWINQKQLTFEVDTGSPVSLINL FT QDKRKYFNNFDISPTNIRLVSYCDNDIGVLGKITVKVVANGEEFTLPLHVA FT ESSRHPLLGRDWLLALNLDFNRVFQPGTHSVSYCSGKNQSTTNALNNLLTK FT FSRVFDERVGKIEGIQATLTVRKNTKPVYIKARPVAFAVRSTVDKEIDHFV FT KEGIWEKVDHSEWATPVVAVRKAGGKVRLCGDYKITLNPNLLVDEHPLPTV FT EELFATVAGGETFSKLDLSQAYLQLEVRPEDRDLLTLSTHRGLFRPTRLMY FT GVASAPAIFQRLMEEILQGIPGVTVFIDDIRVTGSDTKMHLLRLEEVLNRL FT DKYGLRVNREKCDFFSDRIEYCGYMVDKQGIHKLREKIDAIQNMPIPKNKE FT QVRSFVGLINYYGRFFPNLSTILYPLNNLLKDDVPFVWSADCDKSFTLVKR FT EMQSDRFLVHYDPSLPVILATDASPYGVGAVLSHQYLDGTERPLQYASQTL FT TRTQQKYSQIDKEAYSIIFGVRKFHQYLYGRKFILVTDNKPISQIFSESKG FT 
LPTMSAMRMQHYAAFLQAFDYKIRHRRSSEHFNADAMSRLPVSTTDPESEI FT EEPEVVEVNAIQTLPLTVDELSAATLADVNVRELLRALRTGNSVEGKHRFG FT VNQEEFNLHKDCLMRGSRVYIPPALRRKVLEELHSTHFGITRIKSLARSYC FT WWEGIDRDIENLVNDCASCQAAKANPPKVTFHCWETPTEPFQRVHADYAGP FT FMGLYYLILIDAYSKWPMVYVVKNMTTETTIRLCREFFSTYGLPSVFVSDN FT GPQFTSTEFSRFLKLNGITHKLSAPYHPATNGQAERFIQTMKSKLKSLQCD FT RGEVHSEICNILLSYRKMIHPATGFSPSKLVFGRQIRSRLDLMIPSNDPNS FT NEVQSKIRALTTGSKVAAREYVHGNKWEFGTIKERLGKLHYLVKLNDGRTW FT KRHIDQLRSVGAGLSESTKEEISLRREEIGGENFYDNTIAVTPDITTNTYN FT YDNTYTSTDMSLPAIPTAPNLETGIQLPISNETAAIDQPTGSADLPVDQRL FT RRSLRTIKPPQRLNL" XX SQ Sequence 4274 BP; 1272 A; 929 C; 939 G; 1134 T; 0 other; atttggcgac gagaagtacg aagctacgtg aatgtgctag gcgcaataca tatacaaaca 60 agttgctgca tcgtgaattt ggctgtctac tgcgtaattt tacatcacaa tggcgcttcc 120 taacccggtt gtcgacgatc cggcagtcgc cagtcccgct ttgcctgctg tgagtggtgc 180 tgcggtacca gttacatttc atttcgagcc attcaatcct gcttcatcga aatttgaccg 240 atggttaaat cgactacaaa tttcattccg gatttaccac gtgcgtgaag ccgataaacg 300 cgattttctg ctacactaca tgggcggccc tacatacgat gtgctgtgca ataagctgaa 360 aaatgctgag ccacatacaa aaacgtacga cgagattgta gctctactga aggaacatta 420 cagtcctact cctttggaaa tactggagaa tttcaagttc gcgagccgta aacagctaga 480 gcaagaaact ctaagcgatt acctgatgca tttggagaag ctcgcccaaa catgcaattt 540 cggggactac atggacaagg ccctccggaa ccagttcgtt tttggcatcc agaaccgtgt 600 gatacagtct cgattgctgg aagtgcgcga cttaaccttg acaaaggcaa aggagatcgc 660 attcggaatg gaaatgtctc atcgtggaac cgatgaaatg cacaactcac gtcaaaaaag 720 tgaggttcag cacatcgagc atggagcaaa caaaactaaa aaaagtttcc agtcatcgag 780 ccaagccagt tccagtcaaa gttccggtcg cctgtcgaac aagcagaatg gtggaaataa 840 acgatgttat cgttgcgggg atcctgacca ctacgcagac aaatgcaaac ataaagctac 900 gatctgcaaa tactgcaaga aagcggggca tcttgagagg atgtgtctca ccaagaccaa 960 cgagaagggg acggatgacg cacatcacct ggaggagcag ccgtgtgtta tgaaggatgt 1020 gttacacctg aacgcgatcc aaggtattgc tggtaagttt ttgttgagtc tgtggataaa 1080 tcaaaaacag ctaacgttcg aggtcgacac tggttcaccc gtatccttaa tcaacctaca 1140 agacaaacga aaatacttta acaattttga 
catttcccct actaacattc gactcgtgag 1200 ttactgcgat aatgacattg gtgtgcttgg gaaaataacg gtaaaagtag ttgcaaatgg 1260 tgaggaattt acattgcctc tacatgtcgc agaatctagt agacatccgt tgttagggcg 1320 tgattggcta cttgctttga atttagattt caatcgtgta ttccaaccag gtacacattc 1380 agtttcctac tgtagtggca aaaatcagtc taccactaat gcattgaata acttacttac 1440 aaaattttca cgtgtctttg atgaacgtgt tggtaaaatt gaaggaatac aagctacact 1500 tactgttagg aaaaatacaa aaccggtata cataaaagct aggccagtgg catttgcagt 1560 gcgcagcacg gttgataagg aaattgatca tttcgtgaaa gaaggcatat gggaaaaagt 1620 ggaccactca gagtgggcta cacctgttgt tgctgttagg aaagccggag gcaaagtgcg 1680 gttgtgcggc gattacaaaa ttactcttaa cccaaactta ctggtggatg aacatcctct 1740 tccgacggtc gaagaacttt ttgctactgt tgcgggaggg gagacattct caaaattgga 1800 cctttcgcaa gcttacttac aactcgaagt tcgacccgag gatagggact tacttacatt 1860 gagcactcat agaggcttgt ttcgtcccac tcgactcatg tatggagttg cttccgcacc 1920 tgcaattttt caacgtctga tggaggaaat tttgcagggc atacctggcg ttactgtctt 1980 cattgatgac attcgcgtta ctggttctga tacaaaaatg catttactta gacttgagga 2040 agtacttaac agattagata aatatggatt gcgtgtcaat agagaaaagt gtgacttttt 2100 ctctgatcga attgagtact gcgggtacat ggtggacaag caaggaatcc acaaactccg 2160 cgaaaagatt gatgcaatac aaaacatgcc tattcctaaa aataaggagc aagtacggtc 2220 ttttgttgga ctcattaact actatggtag atttttccct aacctcagta ctattttgta 2280 cccactgaat aatttactta aagatgacgt tccatttgtg tggagtgctg attgtgataa 2340 atcgtttaca ttggtaaaaa gggagatgca atccgatagg tttttagtac attatgaccc 2400 gtcacttccg gtaattttag ctactgacgc gtccccatac ggggttgggg cagttcttag 2460 tcatcagtat cttgatggaa ctgaacggcc attacagtac gcatctcaga cccttactcg 2520 aacgcaacaa aaatattctc agatcgataa ggaagcctac tcgatcattt ttggtgttcg 2580 caagtttcat caataccttt acggtcgcaa atttattctg gtaacagaca ataaacctat 2640 cagccaaatc ttttcggaat ctaaaggact tcctactatg tccgcaatgc gcatgcagca 2700 ttacgcggca ttcctacagg cgttcgatta taagattcga catcgccgtt cgtcggaaca 2760 tttcaatgcc gatgctatgt ctcgcctacc ggtttcaact actgaccctg aatcggaaat 2820 tgaagaaccg gaggtagtcg aggtaaatgc aatacaaaca 
ctcccactga ctgtagatga 2880 attgagtgca gctaccctag cggatgtgaa tgttcgcgaa ttgctacgtg ccctaagaac 2940 tggaaactca gttgaaggga aacatagatt tggtgtgaat caggaagaat tcaacttaca 3000 taaagattgc ttgatgcgtg gtagccgagt atacatacca cctgcattgc gaagaaaggt 3060 gcttgaagaa ctccattcaa cacatttcgg tataacacga atcaaatcac ttgctcggag 3120 ttattgttgg tgggaaggca tagacagaga catcgaaaac ctggtcaacg attgtgcttc 3180 ctgtcaggct gcaaaggcta atcctcccaa agtcactttt cattgttggg aaacacctac 3240 ggaaccgttt cagcgtgttc atgcggacta tgctggccca ttcatgggac tttactacct 3300 catattaatt gacgcatact cgaagtggcc tatggtctac gttgtgaaaa acatgactac 3360 ggaaacgaca attcgtttgt gccgggagtt tttcagtact tatggattac cttctgtctt 3420 tgtgagtgac aacggtcctc aatttacctc tactgaattt tcaagatttc ttaaactaaa 3480 cggaattact cataaactta gtgctccgta ccatccagcc actaatggac aggctgaaag 3540 atttatacaa acaatgaaat ctaaactaaa gtcgctacaa tgtgatcgag gggaagttca 3600 cagtgaaatt tgcaatatac tgctctcata ccgtaaaatg attcacccag ccaccggatt 3660 ttcaccttca aaattagtgt tcggtcgtca gatccgttca aggctggatc tcatgatacc 3720 atcaaacgat ccaaactcaa atgaagttca gtctaaaata cgtgcattga ctactggatc 3780 aaaagttgca gctcgagaat acgttcacgg aaacaagtgg gagtttggga caattaaaga 3840 acgtctaggt aaactccact atttagtgaa acttaatgac gggcggacct ggaaaaggca 3900 tatcgatcaa ctacgtagtg ttggtgcagg gctatcggaa tctactaagg aagaaatttc 3960 tttgcgccgt gaagagattg gtggtgaaaa tttctacgac aacactatcg cagtcactcc 4020 agacattact acaaatacat acaactacga taacacttat acatccactg acatgtccct 4080 ccctgctatc ccgactgctc ctaatcttga aacgggtatc caactaccga tttctaacga 4140 aacagctgca attgatcaac ccacagggtc agcggacttg ccggtggacc aaaggttgcg 4200 tcgttctctg cggaccatca agcctccgca aaggctcaac ctataacaac gaattctatt 4260 ttgcgcggaa gagc 4274 // ID GYPSY25-I_AG repbase; DNA; ANG; 4631 BP. XX AC . XX DT 05-FEB-2004 (Rel. 9.01, Created) DT 05-FEB-2004 (Rel. 9.01, Last updated, Version 1) XX DE GYPSY25-I_AG is an internal portion of retrotransposon GYPSY25_AG DE - a consensus sequence. 
XX KW Gypsy; LTR Retrotransposon; Transposable Element; 5-bp TSD; KW AP protease; GYPSY25-I_AG; GYPSY25-LTR_AG; GYPSY25_AG; KW Gypsy clade; RNase-H; gag; integrase; mag lineage; KW reverse transcriptase. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-4631 RA Tubio M.J., Costas C.J. and Naveira F.H.; RT "GYPSY25_AG, a member of the mag lineage of the Ty3/gypsy group RT of LTR retrotransposons in Anopheles gambiae."; RL Repbase Reports 4(1), 15-15 (2004). XX DR [1] (Consensus) XX CC GYPSY25_AG is a family of Gypsy-like LTR retrotransposons CC that, according to the amino acid sequence of its reverse CC transcriptase, is CC phylogenetically grouped with representatives of the mag CC lineage of other organisms. CC GYPSY18_AG, GYPSY19_AG, GYPSY20_AG, GYPSY21_AG, GYPSY22_AG, CC GYPSY23_AG, CC GYPSY24_AG, GYPSY26_AG, GYPSY27_AG and GYPSY28_AG are other CC members of this same lineage in Anopheles gambiae. CC The GYPSY25-I_AG consensus was reconstructed after multiple CC alignment of 8 copies. CC The consensus encodes the 1470-aa GYPSY25_AGp gag-pol like CC protein CC (pos. 210-4619). CC The sequence of the LTRs flanking GYPSY25-I is deposited as CC GYPSY25-LTR_AG. 
XX FH Key Location/Qualifiers FT CDS 210..4619 FT /product="GYPSY25_AGp" FT /translation="MLNPDEMMEEEEHRRLSQNWGGNAVFGNGQQQPSQSN FT VALPAQPPLQNVVGAPSQQGASTSGQDAGMLSQMLQLLQQQMNQQQQLMAQ FT MLKYQQQSVPQPSQPSLIPTNPELIIDALASNISEFRYEAESGATFKAWYE FT RYEDLFLRDASRLDDGAKVRLLGRKLGTAEHARFTSFILPRAPRELTFDET FT VAKLTALFGRTESLLSKRYKCMQITKAPREDLLTFSCRVNRACVDFEFAGM FT NEEQFKCLILVCGLKEEVDSDMRNRLLARIEEKHDVTLEQLSAECQRITNI FT KVDSALIANESGERVLAVNSGGYRQKQFQFNRQQYQSYGQPTQAHARANDN FT TMSVQKPLNACWLCSGPHWKRECPYRSHECADCGRLGHREGHCEVVNRFQR FT RGYNKKGTNVATRVVNINVCNVEARRKYVHILINGRPTKLQLDTASDITVI FT SEGLWKDIGQPSLIRATVKAKAASQEYLELMGEFEALLTIASRTQKAVVRV FT AYANLLLLGADVVESFSLGSIPMDHYCSSIDAESATQMTWEERFPTVFRGM FT GLCTKSSIKLKVKEGSRPIFRPKRPVAYAMLKTVDEELDRLETLNVITPVD FT YSEWAAPIVVVRKANGKIRICGDYSTGLNDLLQSHEYPLPLPEDIFAKLSK FT CRIFSKIDLSDAFLQVQIDKEYRPLLTINTHRGLYHYNRLSPGIKIAPAAF FT QQLIDTMLAGLSGVCGYMDDLIIGGLTDEDHDKTLGQVLKRIEEFGFTLRA FT DKCVFRMCQIKYLGHVIDGRGIRPDPEKISAIQNLPPPTDIAGVRSFLGAI FT NYYGKFIPMMRDLRFPLDSLLKDEKQFKWTKECEAAFMKFKEVLSSELLLT FT HYDPSAEIIVAADASSVGIGATLSHKFSDGSIKVVQHASRALTKAESNYSQ FT IDREGLALVYAVTKFHKMLYGRHFRLQTDHRPLLRIFGSKKGIPIYTASRL FT QRFALTMQLYDFTIEYIQSGMFGNADILSRLIRNHAKPEAEYVIASLNLEE FT DLRSVAINAISNSSPLCFRDVEKSTQADPLLRKVYLYIQEGWPRDATFGSE FT LARFHVRREALSTVEGCILFGERLVIPEDLRAHCLEQLHRGHPGVERMKSL FT ARSYVYWPRLDDEIVQYVAACEACAAAAKTPPQAKPTPWPKPSGPWQRLHV FT DYAGPILGDYFLVVVDAFSKWPEIVKTSTTTSRATVAILRGLFARFGMPLS FT IVSDNGPQFTSQEFKNFCECHGIRQVTTAPFHPQSNGQAERFVDTLKRALK FT KIQTGGTSMDEALDTFLQAYRTTPNPALEWNTTPAEIIIKYPVRTHLELLR FT RPPVVEEETEITAAGLQPGDVVATKKYYQNTWKWISAEVLRRLGTVMYELT FT SRNGRMMRRHIDQIRKRSVKEPHQLVQDSETPTQLPIDILLGEWSLTVPNE FT PLPLSSTTEELRTVPSPQPFSPAQVAPSESNHRKIRSPRRSSRNRKLPRRF FT DAFIL" XX SQ Sequence 4631 BP; 1317 A; 1037 C; 1164 G; 1113 T; 0 other; gtggcgacga gtttagttac attgttttat ttattgaact gtttttatcg aaccgaacgc 60 agagttaatt tattttgttt cgttttaacg tattatttat tcaaagtgtg tcggccgtat 120 acgcgtcaag tcgttatttt atttttttta ttaatcgttt gtgaatcgcg ccaaaagaaa 180 tcggtacaat ttcgcgtaag cgtagaagga tgctaaaccc 
ggacgaaatg atggaggagg 240 aagagcatcg ccgcttatcc caaaattggg gtggaaatgc agtttttggt aatgggcagc 300 agcagccatc gcagagcaac gtagcgttac cagctcaacc accattgcaa aatgtagtcg 360 gggcgccatc gcagcaaggt gcatcgacct cggggcaaga cgccgggatg ctatcgcaaa 420 tgctgcaatt attgcaacag caaatgaacc agcaacagca gcttatggcg caaatgttga 480 aataccaaca gcaatccgtc ccgcagcctt cgcaacctag tttgataccg actaatcctg 540 agcttataat tgatgcgttg gcgagcaaca taagcgaatt tagatacgag gctgaatcag 600 gagcaacgtt taaggcctgg tacgagcgtt acgaggactt atttttgagg gacgcttccc 660 gactcgacga tggggcgaaa gtacggctcc ttgggcgtaa gttgggcacg gcagaacacg 720 ctcgtttcac cagttttata ctacctcgcg cgcctcgtga actaacattt gacgaaacgg 780 ttgcaaagct aacggccctt tttggtagaa cggaatcctt actcagcaaa cgttacaagt 840 gcatgcagat aaccaaagca ccccgggaag atctgctcac gttttcttgc cgcgtaaacc 900 gtgcctgtgt cgactttgag tttgccggaa tgaacgagga gcaattcaag tgccttatcc 960 ttgtttgcgg actcaaagaa gaagtcgata gtgacatgcg caaccggtta ttagcccgta 1020 tcgaggagaa acatgatgtg acgttggagc agttatcagc agaatgccag cgtatcacca 1080 atataaaagt ggatagtgca ttaattgcca acgaatcagg agaacgggtc cttgcagtga 1140 acagcggtgg ctatagacaa aaacaatttc aatttaatcg tcagcagtac cagtcctacg 1200 gtcaaccaac acaggcgcat gcacgggcaa atgataacac catgtcagta caaaagccgt 1260 tgaatgcgtg ttggttgtgc agtggcccgc attggaagcg tgagtgtcca tataggtccc 1320 acgaatgtgc ggattgtgga aggcttgggc atcgcgaagg gcactgtgaa gtcgtcaatc 1380 gattccagcg gcgtggatac aataaaaaag gtactaatgt ggcaacgcgc gtcgtgaaca 1440 taaatgtgtg caatgtagaa gcaaggcgaa agtatgtaca tatcttgatc aacggaagac 1500 caactaagtt gcagctagac acggcatcag atattacggt gatcagcgag ggattgtgga 1560 aggatatcgg acagccatct ctgataaggg ctacggtaaa agctaaggcg gcctcgcagg 1620 agtatcttga actaatgggt gagtttgaag cgttattaac cattgcttct aggacccaaa 1680 aggcagtagt tcgagtcgca tatgctaacc tgttgttact aggagcagac gtcgtggagt 1740 ccttttcact cggatctatc ccgatggacc attattgtag tagcatcgat gcggaaagtg 1800 caactcagat gacatgggag gaacgttttc caacagtttt ccgagggatg ggtctttgca 1860 ccaaatcaag cataaagttg aaagtgaagg aaggcagtcg acccatattt cgtcccaagc 
1920 ggccggtggc atacgcaatg ctgaagactg ttgatgagga gctagatcga ttagaaacct 1980 tgaacgtgat cacgccagtt gattattctg agtgggctgc ccccatagtg gtagtgcgta 2040 aagcgaacgg gaaaattagg atctgcggtg actattctac tggactgaac gatcttctac 2100 aatcacatga gtatccactc cctttgcctg aggatatctt cgctaaattg tctaaatgcc 2160 gcatattcag caaaattgac ctttctgatg cttttctaca ggtccaaatc gataaagagt 2220 atcgcccgct gttgacgatc aacacccacc ggggattata ccactacaac cgtttgtctc 2280 cgggaataaa aattgcccca gcagcatttc aacagttgat cgacacgatg ctggcaggac 2340 tgtcaggagt ttgtggttat atggacgacc ttataatcgg tggcttgaca gatgaagacc 2400 acgataagac tttgggtcaa gtactgaagc gtatagaaga attcggattt acactacggg 2460 ctgataaatg cgtttttaga atgtgtcaga taaaatactt agggcatgta attgatggta 2520 gaggaatacg ccctgatccg gaaaagataa gcgccataca aaacttaccg ccacccactg 2580 acatcgcagg agttcgttct tttcttggag ctataaatta ctatgggaag ttcataccga 2640 tgatgcggga tctaagattt ccactcgaca gccttttaaa agatgagaag caattcaaat 2700 ggacaaagga gtgtgaagcg gcatttatga aatttaagga agtactatca tcagaacttt 2760 tactaacaca ttacgatcca tccgcagaaa taatagtggc agcagacgcc tcatccgttg 2820 gaatcggagc cactcttagc cacaagttct ccgacggtag catcaaagtt gtgcaacacg 2880 cttcaagggc gctgacaaag gcggaatcca actatagcca aatagaccgc gaaggtctgg 2940 cccttgttta tgcggttaca aaatttcaca agatgttgta cggccgtcat tttcgactcc 3000 aaactgatca tcgtccttta ctccggattt ttggatcgaa gaagggcatt ccgatataca 3060 cggctagcag actacagcga tttgccctca ccatgcagtt atatgatttc acaatagaat 3120 atatacaatc cggaatgttt ggaaatgcag atatcctttc gcgactcata aggaaccacg 3180 caaagcccga agccgaatat gtgattgcca gcctaaacct agaggaggat ttaaggtcag 3240 tagctatcaa cgcgatttct aattcctctc ctctttgttt tagagatgtg gagaaaagta 3300 cgcaagcgga cccattgctg cggaaagtct atctatatat tcaggaaggc tggccacggg 3360 atgctacttt tggttcagag ttggcacgtt ttcacgtcag gagagaagcg ctatcgaccg 3420 tcgaagggtg catcctcttt ggcgaaagat tagtgatacc ggaagatctg cgtgcgcatt 3480 gtctagaaca gctccatcga ggccacccag gcgttgaacg tatgaaatct ctcgcacgaa 3540 gctacgtgta ctggccaaga ttagacgacg aaatagtgca gtacgtggct gcttgtgagg 3600 
catgtgctgc agcagcaaag acaccgcctc aagcaaaacc gacaccgtgg ccgaaacctt 3660 ctggtccatg gcaaaggtta cacgtggact atgcggggcc aattttaggc gactactttc 3720 ttgtggttgt ggatgccttc tcaaagtggc cagagattgt gaaaacctcc acaacaactt 3780 cacgggccac ggtagcgata ttacgtggat tattcgcacg ctttggcatg cccctgagta 3840 ttgtcagtga caacgggcca caatttacga gtcaggaatt caaaaacttt tgcgagtgcc 3900 atgggatccg acaagtaaca acggcgccat ttcacccgca atccaacgga caggcggagc 3960 gtttcgtcga cacattgaag cgggctctaa agaaaataca aacgggaggc acatctatgg 4020 atgaagcgtt ggacacattc ttgcaggcgt atcgcaccac gccaaatcca gcgcttgagt 4080 ggaatacaac accggcggaa atcatcatta agtaccccgt aaggactcac ttggagctat 4140 tacgtcggcc tcctgttgtc gaagaggaaa cagaaatcac cgctgcaggg ctccagccag 4200 gagatgttgt tgcaaccaag aagtactacc aaaatacatg gaaatggatc tccgctgagg 4260 tgctacgaag gctgggaacg gtgatgtatg agttaaccag ccgaaacggt cgaatgatgc 4320 ggagacatat agatcagatc aggaaacgat cagttaaaga acctcatcag ttggtgcagg 4380 atagtgagac gcccacacaa ctgcccatcg acattctgtt gggtgaatgg agtttaacgg 4440 taccgaacga gcctttgccg ctaagcagta caacggagga attgcgcacg gtcccttcac 4500 cccagccatt ctccccggca caggttgcac cttcggaaag caatcatcgt aagatacgct 4560 cacctcgccg ttcgtctagg aatagaaagc ttccgcgaag gttcgatgcg ttcatacttt 4620 aaagggggag a 4631 // ID RETRO1_AG_LTR repbase; DNA; ANG; 301 BP. XX AC . XX DT 06-FEB-2003 (Rel. 8.01, Created) DT 06-FEB-2003 (Rel. 8.01, Last updated, Version 1) XX DE Anopheles gambiae long terminal repeat from RETRO1_AG DE retrotransposon - a consensus. XX KW BEL; LTR Retrotransposon; Transposable Element; KW Long terminal repeat; NINJA; RETRO1_AG_I; RETRO1_AG_LTR; KW retrotransposon. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-301 RA Jurka J. and Drazkiewicz A.; RT "RETRO1_AG_LTR: LTR retrotransposon from Anopheles gambiae."; RL Repbase Reports 2(12), 5-5 (2002). 
XX DR [1] (Consensus) XX CC Related to NINJA from Drosophila simulans. 5 bp target site CC duplication. XX SQ Sequence 301 BP; 100 A; 61 C; 74 G; 66 T; 0 other; tgttggagaa cgtaaacatt acagaacgta aacattgggg tgacgggacg actcatgata 60 acgagtagtg gttacacgcg cgagtcggaa agttgaaaag cgtaggcggc gcaattcgct 120 ataaaaaggg ctgggctggc gccgtagggc ctttttcgac ggaaattttc tgtagcgtga 180 acaaccacta gaagtgagta atattttggc gtgggttaat aaaacgcaaa gcaccactta 240 taatattaac acttttcaca cacgacctaa aaaagactca caacacaatc cgaacgctac 300 a 301 // ID RETRO993_AG_LTR repbase; DNA; ANG; 224 BP. XX AC . XX DT 06-FEB-2003 (Rel. 8.01, Created) DT 06-FEB-2003 (Rel. 8.01, Last updated, Version 1) XX DE Anopheles gambiae long terminal repeat from RETRO993_AG DE retrotransposon - a consensus. XX KW Gypsy; LTR Retrotransposon; Transposable Element; IDEFIX; KW Long terminal repeat; MAG; RETRO993_AG_I; RETRO993_AG_LTR; KW retrotransposon. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-224 RA Jurka J. and Drazkiewicz A.; RT "RETRO993_AG_LTR: LTR retrotransposon from Anopheles gambiae."; RL Repbase Reports 2(12), 24-24 (2002). XX DR [1] (Consensus) XX CC Related to MAG from Bombyx mori and IDEFIX from Drosophila CC melanogaster. 5 bp target site duplication. XX SQ Sequence 224 BP; 67 A; 43 C; 63 G; 51 T; 0 other; tgttgtgacg agtagaaaac gcccttaggc actcaggagc tgcgcgcctc ggaacgtgag 60 tgtgttcggg tttagccagg gatcgtgagg acactctcgg ataagcggtc aagcgagtag 120 gaacgcgatt aggaaataag cggaataaaa agtatcctgt tagctgtata agtaaacgcg 180 tgttgtttaa cgtattatat tacgaaccac ccgaaaacgt aaca 224 // ID R6Ag3 repbase; DNA; ANG; 5289 BP. XX AC AB090819; XX DT 09-JUL-2004 (Rel. 9.06, Created) DT 24-SEP-2010 (Rel. 15.1, Last updated, Version 2) XX DE Anopheles gambiae retrotransposon R6Ag3 DNA, complete sequence. 
XX KW R1; Non-LTR Retrotransposon; Transposable Element; KW reverse-transcriptase; gag-like; R6Ag3. XX NM R6Ag3. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RA Kojima K.K. and Fujiwara H.; RT "Evolution of target specificity in R1 clade non-LTR RT retrotransposons."; RL Mol. Biol. Evol. 20(3), 351-361 (2003). XX DR GenBank; AB090819; Positions 1 5289. XX FH Key Location/Qualifiers FT CDS 501..1700 FT /product="R6Ag3_1p" FT /translation="MEARKQKPKKNVEDEEHERLIEEFISKLKKSYKKASK FT AEENEAPRKVSHKAQLERFKNYANNLEIEDLRDGMIAQMIEFMESMIKEMS FT ELKKQLKQKSTQEIEVQTAQPSELAEDAPFVPQTRKGRVPKEARKRDNNAR FT QRSAQRETPKSSGGQSKQPKKKKKKRSLPKPEAVVIEKCENIDLAKVLKGL FT THDDALKDVGDQVAKVRRTQNGDMLLVLKRGKEAVRVEAGIKNALKDKANV FT RTLAPSVMIEITHLDEITLAEEIAEALKQQLEIDVDHKEIKVREARTKGTQ FT KATFRVPLSAKERVLNPGKLKVGWCVCSLREATVQVKCFKCWKLGHKGFEC FT TGQDRSKLCIKCGQEGHKIRECPNAMTCLDCREDMVEPHITGSLRCPNRIA FT RRQQHG" FT CDS 1696..4761 FT /product="R6Ag3_2p" FT /note="endonuclease and reverse transcriptase." 
FT /translation="MVKLIQHNQNHCGAAFNLMWQTAREREVDLFIVADPV FT KNQRHNNNIVYSEDQLAAIVTCGNLPIQKIVNKASRGMLAVEVGGILIVSA FT YAPPSWTVQEFEELLDNIVLTVSGSSKFVVAGDFNAWSSSWANTLGARGES FT QRLRGDTLLAAFAGLEMVLMNNGQDTFVTPERKSAIDLTFVSQSLMETTGW FT EVLPDYMNSDHIGILITIGKEQTPSPRDNAKKGWKTTLYHKELFAAALDRI FT LHEMRVDTPDDLVKALDKACDATMSRLKKTCRWRGVYWWTSVIADLRRKSK FT AASRVAQRAYDTPEFPDKRREYKLARNALKREIKRTKKATWYRLVNMSDII FT LFGEVYVILKRMVGGNRVPKELDPEKLNTIIDELFPSHPVTDWPTHQPTTS FT QENPESVTDEEIRNIGRSLKSRKVPGPDGIPNAALATAMIEEPTIFKKVYQ FT RCLDTGVFPDNWKKQRLVLSIPKPGKRPGESGSSRPICLIDGVAKGLERVI FT LHRLNNHIERVQGLSENQYGFRKGRATTDAIEKVLSIASASRARNRGANRF FT CAVVTLDVKNAFNSASWTAIARSLQRINIPKYLYDIIGNYFRNRVLLYETN FT EGNRERVVTAGVPQGSVLGPTLWNLMYNEVLGLTLYDGASLIGFADDIVLV FT AVGSRIDDLENTIETSINIIRQWMESVELQLNISKTEYILVSSHRSRQESQ FT IIVEGHTIRSSRHLKYLGIMIDDRLEYTQHIKYVAERAVTNTNALVRMMPN FT RSGPRSSRRRIIANTIIAGIRYASSIWAESLKFECRKQWLRRCHRPLVNRV FT ISAFRSTSHDAACVIAGMMPLHILIDEDYRVRQRSITTGVSSKLARIAERP FT YSVEAWQREWSTTTSGSWTRRLIPNIQPWITRRHGNIEFHMSQFLSGHGFF FT RSHLHRMGYVPSPVCPACGDENQTAEHTIFICGMYLLTRLRLEQDLQADFD FT VENAINIMCSDEVTWNRVAEYVHEVMENQYNLQCSYRGNSDRELQNQEAAS FT QETSPERAGISMNEDN" XX SQ Sequence 5289 BP; 1630 A; 1218 C; 1378 G; 1063 T; 0 other; gcccggcaaa gtccgaccat cacctcctcg cggtgccggg cggggtagga gttcgtcctt 60 ggacaggctc tgagctaacg atggcggcac gctttggtag tggatgcaga ccatgctatg 120 gatcaggcta atgtattgga cgtaaaccat agatgaatgt tatcgggctg gcgaaaaact 180 caaactgatc ccttagtaag gacgggtgaa actttcttga aacggcgtga gggtcaatga 240 gtgattcgtt cccgtcctat taaaacctgg cgagtcttcg gtagcgctct aagtttgttg 300 tcggcctagc aaaccgacaa ctataggtgc aggtgagctt cggccctgct taaccctgcc 360 tggctctgag ctcacacctc ataagggtag tgagccaacc ctcgcgtgga gctccctgaa 420 ccaaagccaa ccacggggca tcaacaaggg gactatagtg gcgaacaagg atagcggatt 480 gaagtttgga cccaacagca atggaagcaa gaaaacaaaa acccaagaaa aatgttgaag 540 atgaggagca cgagcggctc attgaagagt ttatctcgaa gctcaaaaag agttataaga 600 aagcctcaaa ggctgaggag aacgaagcac cgagaaaagt ctcgcacaaa gcacaactcg 660 agcgcttcaa gaattacgcg aacaacctgg agattgagga 
ccttcgtgac ggcatgatcg 720 ctcagatgat tgagttcatg gagtccatga ttaaagaaat gagcgagttg aaaaagcagc 780 tcaaacagaa gagcactcaa gaaattgaag tgcaaacggc tcagccaagt gaactggcag 840 aagatgcccc ctttgtgccg caaactagaa aaggccgagt cccaaaagag gcccgcaagc 900 gagacaacaa tgcgcgtcaa agatccgcgc agagagaaac tccgaagagt agtgggggtc 960 aaagcaaaca gcccaagaaa aagaaaaaga agcgatcact tccaaaaccc gaagcagtgg 1020 tcattgagaa gtgtgaaaac atagacctgg cgaaagtact aaaaggacta acccatgacg 1080 atgccctcaa ggatgttggc gaccaggttg ccaaggtacg aagaactcaa aacggcgaca 1140 tgctgctagt tctgaagcga ggcaaggaag cggtgcgggt tgaagcaggc atcaaaaatg 1200 ccttgaagga caaggcgaac gtccgaacgc ttgctccctc ggtgatgatt gaaatcactc 1260 atcttgatga aatcaccctt gctgaggaga tcgctgaagc cctcaagcaa cagctggaga 1320 tagacgtcga tcataaggaa atcaaggtcc gagaagcaag aactaagggt acacaaaagg 1380 ctacgtttag agtgcccctg tcagcgaaag agcgtgtcct caatccaggc aagttgaaag 1440 ttggctggtg tgtatgcagc ctgagagagg ctacagtcca ggttaaatgc ttcaaatgct 1500 ggaaactggg tcacaagggc ttcgaatgta ctggccagga tcgaagcaag ctctgcatta 1560 aatgtggaca agagggacac aagatcagag agtgtccaaa cgctatgacg tgcctcgact 1620 gccgtgagga tatggttgag ccccacatca ctggcagcct cagatgtccc aaccggatag 1680 ctagacgaca acaacatggt taaattaata cagcataacc aaaaccattg cggagcggcc 1740 tttaatctta tgtggcaaac cgctcgtgaa cgagaagtag atttgtttat tgttgctgat 1800 ccggtaaaaa atcaacgaca caacaacaac atagtgtata gtgaggacca gttggcagca 1860 atagtgacat gtggaaacct acccattcag aagatcgtca acaaggcatc gagaggaatg 1920 ttagccgtag aggtgggagg catcctcatc gttagtgctt atgccccccc aagctggact 1980 gtgcaggaat tcgaggaact gctcgataat attgttttga ccgtcagcgg atcgtccaag 2040 tttgttgtgg caggggattt caatgcatgg tcttcaagct gggcaaacac ccttggagca 2100 agaggagagt cacagcgctt gagaggcgat acactattag cagccttcgc cggattagag 2160 atggtgctaa tgaacaacgg tcaagacaca ttcgttacac cagagagaaa gtcagcaatt 2220 gacctaacgt tcgtgagcca atctctaatg gagacgacag gatgggaagt actgccggac 2280 tatatgaatt cagaccacat tgggatactt atcacgattg gcaaagaaca aacacctagt 2340 ccccgagaca atgcgaagaa gggatggaaa accacccttt atcacaagga 
actttttgct 2400 gcggcactag atagaatcct gcatgagatg agagttgaca cgcccgatga cctggttaaa 2460 gccctagata aagcatgtga tgctaccatg tccaggctga agaaaacgtg caggtggagg 2520 ggcgtctact ggtggacgtc cgtgatagca gaccttcgga ggaaaagtaa agccgcaagc 2580 agagttgccc aaagggcgta tgacactcct gaattcccag acaaaaggag ggaatataag 2640 cttgccagga atgcgcttaa gagagaaatc aaaaggacaa aaaaagcgac ctggtacaga 2700 cttgtaaaca tgtctgacat cattctattc ggcgaggtct atgtaatttt gaagcgaatg 2760 gttggaggaa acagagtgcc caaagagttg gaccccgaga aactcaacac cattattgac 2820 gaactatttc cgagccatcc tgtcacagac tggccgacac atcaaccaac gacaagtcaa 2880 gaaaacccgg aaagtgtgac agacgaggaa atccgaaaca tcggaaggtc gcttaaatcc 2940 aggaaagtac caggtccaga tggaataccg aacgctgctt tggcaacagc gatgattgag 3000 gagccgacaa tcttcaagaa agtttaccaa agatgcctcg atactggtgt atttccagac 3060 aactggaaga aacagaggct cgtgctctcg attcccaaac cagggaaacg accgggagaa 3120 agcggctcat cacgtccgat atgtttgatt gatggagtag ctaaaggttt agaacgtgta 3180 atactccacc gactgaataa ccacatcgag agagtacaag ggttatccga aaaccaatat 3240 ggtttcagga aaggaagagc aacgactgat gccatcgaaa aggtcttaag catagcaagt 3300 gcatccagag ctcgaaatcg aggtgctaat cgattctgtg cagtagtgac acttgacgtg 3360 aaaaatgcat ttaatagcgc gagctggacg gcgatagcga ggtctctaca gagaatcaac 3420 atacccaaat acctctatga catcataggg aattacttcc ggaaccgcgt gctgctctat 3480 gaaaccaacg aaggaaatcg agaacgagtc gtaacagctg gagtgcccca agggtcagtg 3540 ctgggaccca ccttatggaa tttaatgtac aacgaggtgc tcggcttaac gctgtatgat 3600 ggagcatcac tcatcggatt cgctgatgat atagttctag tagctgtcgg aagccgaata 3660 gacgatctgg agaacacgat cgaaacatcc atcaacatca ttcggcaatg gatggagtca 3720 gtggagcttc aactaaatat atcgaagacg gagtacatcc tagtaagctc acacagaagt 3780 agacaagaat cacagatcat cgtcgaagga cacacaatta gatcatcgcg ccacttgaag 3840 tatctgggca tcatgattga tgatcgccta gaatacactc agcacatcaa gtatgtcgct 3900 gagagagcgg tgaccaacac caacgcccta gtgagaatga tgcccaaccg atcaggacca 3960 agaagcagcc gacgtcgaat tatagcaaac accatcattg caggcatcag atatgcctcc 4020 tcaatatggg ctgagtcact gaaatttgag tgcaggaagc aatggctccg gaggtgccat 
4080 agaccattag tgaacagagt gataagcgca tttaggagca cctctcacga tgcagcctgt 4140 gtcatagcag gcatgatgcc gctccacata ctaatcgatg aagactaccg agttcgacaa 4200 cgaagcatca caacgggagt aagcagcaaa ctggcgagga tagctgaacg accatactct 4260 gtagaagctt ggcaaaggga atggtcgacc accacttcag gctcctggac aagacgattg 4320 atacccaaca tccaaccatg gatcaccagg agacacggaa acatcgagtt ccatatgagc 4380 cagttcctat caggccatgg gttcttcaga tctcatcttc accgaatggg gtatgtacca 4440 tcacccgtat gtccggcctg cggcgacgag aatcagactg cagagcacac catcttcatc 4500 tgcggcatgt atcttctgac gaggttacga ctggagcaag atcttcaagc cgatttcgac 4560 gtcgaaaacg caatcaacat catgtgcagc gacgaagtaa cgtggaatcg agtcgcagag 4620 tacgtccacg aagtgatgga aaatcagtac aatctccaat gcagctacag aggcaacagc 4680 gacagagaac tccaaaacca agaagcagca agccaagaga cttccccgga acgtgcggga 4740 atcagcatga atgaggataa ctgacatggc cgatacatcg acgaagatcc attcccctcg 4800 gaacgtgcgg gaatcagctg agatgccaaa gaccatacgc acacatgtct ctattctttg 4860 gtcccccgac ttacgagtag agggaccttg atggtgagct tgtctgcggt tgtcagcccc 4920 ggtgtggcca gctggtcaaa caccggactt tgacattata cttaggtgga tggcacactc 4980 actgagatac catgtgcggc gatgaggtgc ctgacaccta ggagagatgg ccctccgggt 5040 ctcgggacct gggcgcgggg tgtaatgttc tatacgcctg acagacacca ctgtcatagc 5100 attgcggcgc cgtggtcgat gttgaccaaa ggaatgctgg ttgaagtaat gctctagcgg 5160 gcgatccggc cagtatttct tgaggcacaa gagagtttaa gtggttaaaa tccatctgca 5220 tacgtaggta ctggtgctct gtctatcgta tgtcctataa aaggttctct cttgtctaaa 5280 cgggaaaaa 5289 // ID SINEX-1_AG repbase; DNA; ANG; 206 BP. XX AC . XX DT 05-MAR-2004 (Rel. 9.02, Created) DT 05-MAR-2004 (Rel. 9.02, Last updated, Version 1) XX DE SINEX-1_AG is a nonautonomous non-LTR retrotransposon - a DE consensus sequence. XX KW SINE; Non-LTR Retrotransposon; Transposable Element; KW Nonautonomous; SINEX-1_AG; nonautonomous non-LTR retrotransposon; KW reverse transcriptase. 
XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-206 RA Kapitonov V.V. and Jurka J.; RT "SINEX-1_AG, a family of nonautonomous non-LTR retrotransposons RT from African malaria mosquito."; RL Repbase Reports 4(2), 44-44 (2004). XX DR [1] (Consensus) XX CC SINEX-1_AG is a family of nonautonomous non-LTR retrotransposons. CC The SINEX-1_AG consensus sequence was reconstructed based on CC multiple alignment of ~2000 copies, which are over 90% identical CC to CC the consensus. SINEX-1_AG copies are usually flanked by ~10-bp CC target site duplications. The genome harbors over 10,000 copies CC of SINEX-1_AG, including those that are 30% divergent from CC the consensus sequence. This family is quite unusual because CC there CC is no similarity of SINEX-1_AG to any known tRNAs or other RNA CC with CC internal pol III promoters. It is possible that SINEX-1_AG CC encodes a CC new type of the internal pol III promoter. XX SQ Sequence 206 BP; 56 A; 56 C; 52 G; 42 T; 0 other; tgggccggtc tggtggtaca gtcgtcaact cgtacgactt aacaacatgc ccgtcatggg 60 ttcaagcccc gaatagaccg tgcccccata cgtaggactg actatcctgc tatggtaaca 120 ataagtcact gaaagccaag ccccacttca ctagtgggta caggcaggcc ttgaccgaca 180 acggttgttg tgccaaagaa gaagaa 206 // ID TRANSIB3_AG repbase; DNA; ANG; 2070 BP. XX AC AAAB01008960; XX DT 29-JAN-2002 (Rel. 7, Created) DT 21-JUL-2005 (Rel. 10.08, Last updated, Version 2) XX DE TRANSIB3_AG is a coding portion of a TRANSIB-like DNA transposon. XX KW Transib; DNA transposon; Transposable Element; KW TRANSIB superfamily; TRANSIB1_AG; TRANSIB3_AG; transposase. XX NM TRANSIB3_AG. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-2070 RA Kapitonov V.V. 
and Jurka J.; RT "TRANSIB3_AG."; RL Direct Submission to Repbase Update (27-DEC-2002). XX DR Genbank; AAAB01008960; Positions 16423 14354. XX CC TRANSIB3_AG is a copy of a young Transib-like DNA transposon. CC The copy encodes TRANSIB3_AGp transposase. XX FH Key Location/Qualifiers FT CDS 1..2067 FT /product="TRANSIB3_AGp" FT /note="transposase" FT /translation="FNFLNHTIHYNRVSNMVSLQVREIVDQFNFGDSQITN FT INKALAYVDTKARNRELDIEAARSEVENIYLRAQLQWDQCNRVRGTFEANN FT AEWLASKFPTQLLSNGASTSNTNSCSLPANYEQSSELAERMKIRFMEHSEE FT SANERTSPTRIARRKAYRLKEYNIMKQISLLLQSPRDLLISTDHSDMLSDE FT EVLAFYIDLGLTEAQYKSMSSLVPSRFPSFNAIRRAEALCVPFHEQIVTTE FT CAVKMEMQCLVEYTVRRILEMQKKVLRDFLTEKQLNVLKLRCVFTYGIDAT FT SGHGTDVLISVLSPIKLHLDDASHHIFWLNVTPQSYRFCRPISLQLAKPSK FT ALVLQTKHSVDEQISNLKPMHIVLEDDKAVDVEFEFVLSMIEGKVLSYMLD FT DSSTTCCLICCAAPSEMMDASILASGFIGQEEPLVYSISPIHCWLTFFELL FT LQLSYRMDFKEWQVPEVQRTTFNERMKTVQQRLYEAFGVRVAEADAGSSSS FT TNTMGYISRRVLADPALASTTLNIDQDLIERFRNILIAINSRHPLHPAKVQ FT QYCSDTYRKYLELYSWSRVPAAVQKVLAHAGQLIVRSPLRLGYVGKKSGEI FT EHNYFTSDRELCARTTSKQDALKDPFVEALICSDPKISSISLSNRIKRKKR FT AAYPDVVASFFIFDESYCETDDDDDSNSGDDDEDDLDSLD" XX SQ Sequence 2070 BP; 614 A; 461 C; 470 G; 525 T; 0 other; tttaatttct taaatcacac gatacactac aaccgagtga gcaacatggt ttcattacaa 60 gtgcgagaga ttgtggacca atttaacttt ggcgattccc aaatcacaaa cataaacaaa 120 gcgttagcct atgttgatac gaaggctcgg aaccgggagc tagacatcga ggctgcccgt 180 tccgaagtgg aaaacatcta cctacgtgct caactacagt gggatcagtg taatcgcgtc 240 cgaggcactt tcgaagcaaa taatgcagaa tggttggctt ccaaattccc gacccagtta 300 ctgtccaatg gtgcttctac atcgaataca aatagctgta gtctgccagc taactacgaa 360 cagtccagtg agcttgctga acgaatgaaa atacggttta tggagcattc agaagagtcg 420 gcgaacgaaa gaacatcacc tacacgaatc gctcgccgga aagcatatcg gctcaaagaa 480 tataacatca tgaaacagat cagcttgttg ctgcaatcgc cacgcgatct tctcatatca 540 acggatcatt ctgatatgct gtccgatgaa gaagtattag cattctacat agacctgggt 600 ctgaccgaag cccagtacaa atcaatgagc agccttgtgc cttcccgctt tccatcgttt 660 aatgccatta gaagagcaga agctttatgt gtgccgtttc atgaacaaat 
tgtgactaca 720 gagtgtgcag tgaaaatgga aatgcaatgt ctggttgagt atacggtgcg aagaatattg 780 gagatgcaga agaaggtgtt gagggatttt ttaacggaga agcagttgaa cgtactcaag 840 ttacgatgtg ttttcaccta cggcatagat gccacatcgg gccatggtac tgacgttttg 900 atcagtgtcc tatccccaat taaacttcac ctggacgatg cgtcacatca cattttttgg 960 ctcaacgtta ctccgcagag ttatcgtttt tgtagaccaa tttcattgca gctagcgaag 1020 ccatcaaaag cattagtcct tcaaaccaag cacagtgttg atgaacaaat ttccaaccta 1080 aaaccaatgc acattgtttt ggaggatgat aaagcagtcg atgtggagtt tgaatttgtg 1140 ttaagcatga tcgagggtaa agttttatca tacatgctgg atgattcgtc gaccacttgc 1200 tgtctgattt gctgcgcagc cccatcagaa atgatggacg cttcaatcct ggcatcgggt 1260 tttatagggc aagaggaacc gttggtgtat agtatttcac ccatccattg ttggctaaca 1320 tttttcgagc tgctgctaca actgtcatat aggatggact tcaaagaatg gcaagtcccg 1380 gaagtgcaac gaaccacatt taacgaaaga atgaaaaccg tgcaacagcg tctctacgaa 1440 gcgtttggcg taagagtagc agaagcagat gcagggtctt cctccagcac aaacacaatg 1500 ggatacatca gccgacgagt gttggccgat cccgctcttg caagtaccac gctcaacatc 1560 gaccaagatc ttatcgaacg cttcagaaac attctaatcg caatcaatag tcgtcatccg 1620 ctgcaccccg caaaggtaca acagtactgc agcgacacct atcgcaaata tctggagctc 1680 tacagctggt cccgagttcc ggcagcagtg cagaaagtct tagcacacgc cggtcaactg 1740 atagtacgct cacctctgcg tttaggctat gttggcaaga agtctggtga gatcgagcat 1800 aactatttca catcagatag ggagctctgt gcaagaacaa cttcaaaaca agatgctttg 1860 aaggatccat ttgtggaagc acttatctgc agcgatccta aaataagttc tatatcgctt 1920 agtaatagaa tcaagcgtaa aaaacgagct gcttatccgg atgtagtagc gagctttttc 1980 atatttgatg aaagctattg tgaaaccgat gacgacgatg attctaatag cggcgacgac 2040 gatgaggacg atttggactc gctcgatgaa 2070 // ID COPIA2-I_AG repbase; DNA; ANG; 4232 BP. XX AC . XX DT 03-APR-2003 (Rel. 8.03, Created) DT 29-JUL-2005 (Rel. 10.08, Last updated, Version 2) XX DE COPIA2-I_AG is an internal portion of the COPIA2_AG LTR DE retrotransposon - a consensus sequence. 
XX KW Copia; LTR Retrotransposon; Transposable Element; 5-bp TSD; KW COPIA2-I_AG; COPIA2-LTR_AG; COPIA2_AG; Copia clade; Salto 7; KW integrase; reverse transcriptase. XX NM COPIA2-I_AG. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 578-2937 RA Parkes J.R., Warren M.A. and Crampton M.J.; RT "Salto, a Ty1-copia group retrotransposon from the malaria vector RT Anopheles gambiae."; RL Direct Submission to Genbankac. AF93,. XX RN [2] RP 1-4232 RA Pavlicek A., Kapitonov V.V. and Jurka J.; RT "COPIA2_AG, a family of copia-like LTR retrotransposons from RT African malaria mosquito."; RL Direct Submission to Repbase Update (31-MAR-2003). XX DR [2] (Consensus) XX CC COPIA2_AG is a young family of autonomous Copia-like LTR CC retrotransposons. CC COPIA2-I_AG, an internal portion of COPIA2_AG, is flanked by 99% CC identical COPIA2-LTR_AG LTRs. The COPIA2-I_AG consensus sequence CC was reconstructed based on multiple alignment of 3 copies. CC The consensus sequence encodes the 1348-aa COPIA2-I_AGp protein CC (positions 189-4232). A partial protein (ac AAL55241.1) was previously CC submitted to GenBank by [1]. 
XX FH Key Location/Qualifiers FT CDS 189..4232 FT /product="COPIA2-I_AGp" FT /translation="MDFSKVGVIRRNNRNYRSWAFKVQMLMMREGTWTYVD FT PGVAPTPVTPEWTEGDSKARATIALLVEDNQHNLIMTKNTAKETWDALKAH FT HHKATLTGKVSLLKEICNANYREGENMEDFLYGMEDHYSRLENSGEKLSAN FT MQVAMILRSLPKAFDALTTALESRSDKELTMDLERAKLIDESEKLYGGKVQ FT EERVLKAKSEAKPGACFFCGQPGHKKRECKEFLNRKSSGEGEKKKKIKPNK FT EQQVKTVRENDASSFTFMVRQPEIRGNDRSWLIDSGASSHMCSDKSAFTVM FT EQSLRSNVTVADGSENRVEGVGDCLIKCAVEYGEIIEITLRGVLYVPTLEG FT NMISIGKLAEKGVRAVFDNTGCKLVYGNTVVAVADKVSDMYWLRIAQDRVM FT KSVVKEHTKNCQHTWHRRLGHRDPAVIGEMKRRDLVSGLEVVDCGIRWTCE FT CCIECKMARSPFPPVAEKTSTEVLDIIHSDVCGPMEETTLGGCRYYMTLID FT DHSRYTFVYFLKKKSEAEDKFREHVKLVQNQFGRKPRIIRSDQGGEYSNKA FT LRKFCADEGIKMEFTAAYSPQQNGVAERKNRSLTEMGRCMLRDAGMHKRFW FT AEAINTTCYLQNRLPSAAVERTPFEIWFGRKPDLTNLRLFGCVGYVLIPSV FT KRKKLDVKAERMTFVGYSGEHKAYRMLNTQTGEIQISRDVRFLEIDDGSKE FT QTYGDPKIEDNPTESVEIEWSLDETKREAKTNVANDTISESEFYGWDCSDD FT GWPRGFWNDNDNNWLRGLWDDDDAEAGAAMPEAVLDAVPEAMPAAVPEVSN FT TPVRRLQRVTAGVPPARYDEEVYLVKESVAEPKTYKEAVSGPQSAEWKIAM FT AEEMQSHQENGTWELAELPPHRKAIGSKWIFKCKADEDGHFVRYKARLVAQ FT GFCQKFGTDYDLVFVPVVKQITFRTMLVLASKRKMLTKHVDIKTAYLHGLL FT KKEIFMRQPQGFESDNPNEVCRLHRSIYGLKQAARVWNTKIDDVLKTMGFI FT QSTADPCLYIREKAGKSIFVLIYVDDVIVICNTEEEFSEVVHVLTLNFTIS FT VMGNLRFFLGIRIRRNDGRYCMDQRAYLERVLERFGMLDAKPSKFPMDPGF FT LKRKEENGRKLDSPKAYQSLIGALLYAAEISRPDIAIATAILGRRVQDPSE FT ADWNEAKRILRYLKGTLDSVLYLGSGGQKLECFVDADWAGDESDRKSNSGF FT VFKFGGGLIGWGCHKQKCVALSSTEAEYVSLAECLQEVKWILKLMADVGEQ FT LDGPVLVNEDNQSCIALTKGDRAERKAKHIDTKFNFVEDMVRDGIVKLQYC FT PTEHMQADLLTKPLQAVKLRQLREAIGIKPFSVEEE" XX SQ Sequence 4232 BP; 1188 A; 859 C; 1239 G; 946 T; 0 other; ggttatgggc ccagctctgt gtggccagtt caattgaaaa gtgcgcgacg cggttcggaa 60 agacagttat tttttcggtg tgaaaaaatt aggacatttc cggaaggtac gcagtgcggt 120 taaggaatcg tgtgtttttt cgtggtgaaa gaaaaaaccc aaccgggaag gtttttgcac 180 gggcaaaaat ggatttttcg aaagtgggcg tcatccggcg gaacaaccga aactatcggt 240 cgtgggcttt caaagtgcag atgttgatga tgcgggaggg tacgtggacg tacgttgacc 300 cgggtgtcgc gccgacaccg gtaactccgg 
agtggacgga gggtgattcg aaggcgcggg 360 cgaccattgc tttgttggtt gaggataacc aacacaatct catcatgaca aagaacacag 420 cgaaagagac atgggatgcg ctcaaggcac accaccacaa agccactctt accgggaaag 480 tttcgttgct gaaagagatt tgcaacgcaa actatcgtga aggtgagaat atggaagatt 540 ttttatacgg catggaggat cattattctc ggctggagaa ttcgggtgaa aaactctcgg 600 cgaacatgca ggtggccatg attttgcgga gccttccaaa agcatttgac gcacttacca 660 cagctttgga aagtcgttca gataaagagc taacgatgga tcttgagcgg gcaaagctga 720 tcgacgaaag tgagaagctg tacggcggaa aggtgcagga ggagcgagtg ctgaaggcga 780 aaagtgaagc aaaaccaggc gcgtgtttct tttgtggtca acctggccat aagaaacgag 840 aatgcaaaga gttcctgaat cggaagagca gcggggaagg tgaaaagaag aaaaagatta 900 agccgaataa agaacaacaa gtgaaaacag tgcgcgaaaa cgacgcaagt tcgttcacgt 960 tcatggttcg tcagcctgaa attcgcggta acgatcggtc gtggctaatc gactcgggtg 1020 caagttcgca catgtgtagt gacaaaagcg cgttcacggt aatggaacaa agcttgcgtt 1080 caaatgttac cgtcgcggat ggcagcgaaa atcgcgttga aggcgttggc gattgcctga 1140 tcaagtgtgc ggttgaatac ggtgaaataa ttgaaatcac gctacggggt gtgttgtatg 1200 ttcctacgct ggaaggaaac atgatttcaa tcggtaaact cgcggaaaaa ggtgtgcgtg 1260 cggtttttga caacaccggg tgcaagctcg tttacggaaa tacggtcgtc gcggtcgcgg 1320 ataaagtgag cgatatgtat tggttgcgaa ttgcacagga tcgagtgatg aaatcagtgg 1380 taaaggagca cacgaaaaac tgccaacaca cttggcatcg tcgtcttggg cacagggatc 1440 cagctgtcat cggtgaaatg aagcggcgcg atttggtgtc ggggctagaa gtggtcgact 1500 gcggtatccg ctggacctgc gaatgctgca tcgaatgcaa aatggcacgc tcgccatttc 1560 caccagttgc ggaaaaaacc tcgacagaag tgctggatat aatccatagt gatgtgtgcg 1620 gcccaatgga ggaaacgacc ttagggggat gccgttacta tatgacccta atagacgatc 1680 atagtcggta tactttcgtc tattttctca aaaagaaatc ggaggccgag gataagtttc 1740 gcgagcatgt aaaattggtt caaaaccaat ttggccggaa accgcgaatc attcgctccg 1800 atcagggagg agaatactcc aataaggcgc ttcggaagtt ctgtgcggac gaagggataa 1860 agatggagtt tactgcagca tattcacccc agcaaaatgg agttgcggag cggaagaacc 1920 gatcgctaac ggagatgggt cggtgtatgc ttcgggatgc aggtatgcat aagcgatttt 1980 gggcggaagc aatcaacacc acttgctact tgcaaaatcg attgccgtct 
gctgcagtag 2040 agcgtacgcc attcgagatc tggttcggca gaaaaccaga tttgaccaac ctgcgactgt 2100 ttggatgtgt tgggtacgta ctgattccgt cggtgaaacg aaaaaagtta gacgtcaagg 2160 cggagcgtat gacttttgtc ggctattccg gcgagcataa ggcgtatcgg atgctaaaca 2220 ctcaaacggg agaaattcaa attagtcggg atgtccgttt tcttgagatt gatgacggat 2280 ccaaggagca gacatacggt gatcccaaaa tagaggataa tccgactgaa agcgttgaaa 2340 tcgagtggtc tctcgatgaa acgaaacggg aagctaaaac taacgtggcc aatgatacaa 2400 tctccgaatc tgaattttac ggttgggatt gttcagacga tggctggcca cgaggttttt 2460 ggaacgacaa tgataacaat tggcttcgcg gactgtggga cgatgacgac gctgaagctg 2520 gagctgcgat gccggaggcg gtgctggatg ctgtaccgga ggcgatgcca gcagctgtgc 2580 cagaggtgtc gaatactccc gttcgtcgtt tacagagggt gacagctggc gttccaccgg 2640 caagatatga cgaagaagta tatctggtga aggaaagtgt agcagaacca aaaacgtata 2700 aggaagctgt gtccggtcct cagagtgctg aatggaaaat agcgatggca gaagaaatgc 2760 agtcccatca ggaaaatgga acgtgggagc tagcggagct gccgccacac cggaaggcta 2820 tcgggtcgaa atggatcttc aagtgtaagg cagatgaaga cggtcatttc gttcggtata 2880 aagcacggct ggtggcgcag ggtttctgcc agaaattcgg gacggattac gacctggtgt 2940 ttgtccccgt cgtaaagcag attactttcc ggacgatgct ggttctggcg agtaaaagga 3000 agatgttaac gaagcacgtt gacataaaga cggcgtatct acatggtctt ctcaagaagg 3060 agatttttat gcgccagcca cagggattcg aaagcgataa cccgaacgaa gtatgcaggc 3120 tgcatcgcag catttacggg ctcaagcagg cagctcgtgt ctggaatacg aagatcgacg 3180 acgtactgaa aactatgggt ttcatccaat caacggcgga cccatgtttg tacatacgcg 3240 aaaaagcggg taagtccatc tttgttctca tttacgttga cgatgtgatc gtcatatgta 3300 acacggagga agaattttct gaggtggtcc acgtcttgac actgaatttc acgatcagcg 3360 tcatgggtaa cctaagattt tttctcggca tacgaattcg gcgtaacgat gggcgttact 3420 gtatggacca acgagcttat ttggaacgag ttctggagcg tttcggcatg ctggatgcta 3480 aaccgtccaa attcccgatg gatcccggct tcttaaaacg aaaggaggag aatggcagga 3540 agttggattc gccaaaagcg tatcaaagtc tcataggagc tctgttgtac gctgcagaga 3600 tcagcagacc cgatattgca atcgccacag ccattctggg caggagagtg caagatccat 3660 cagaagcaga ttggaacgag gccaaacgga tactacgtta cctcaagggt acactggata 
3720 gtgtattgta ccttggaagc ggcggacaaa agctggagtg ttttgtggac gccgattggg 3780 caggcgacga gagcgaccgc aaatccaact cggggttcgt gtttaagttc ggcggcgggc 3840 tcatcggatg gggctgtcat aagcagaagt gtgtggcact atctagtacc gaggccgaat 3900 atgtttccct tgccgagtgt ctacaggagg taaagtggat actgaaactg atggcggatg 3960 ttggcgagca actggatggt ccagttctgg tcaacgaaga caatcaaagc tgcattgcgc 4020 tgactaaagg agaccgagcc gaacgcaaag caaagcacat cgatacgaaa tttaatttcg 4080 tggaggatat ggttcgggac ggcatcgtga aactgcagta ctgcccaacc gaacacatgc 4140 aagctgattt gcttaccaaa ccgttgcaag cagtgaaact tcgacaactt agggaagcga 4200 tcggaataaa accattcagt gttgaggagg ag 4232 // ID GYPSY69-LTR_AG repbase; DNA; ANG; 441 BP. XX AC . XX DT 09-JUL-2004 (Rel. 9.06, Created) DT 09-JUL-2004 (Rel. 9.06, Last updated, Version 1) XX DE GYPSY69-LTR_AG is an LTR of retrotransposon GYPSY69_AG - a DE consensus sequence. XX KW Gypsy; LTR Retrotransposon; Transposable Element; KW 5-bp TSD GYPSY69_AG; GYPSY69-I_AG; GYPSY69-LTR_AG; Gypsy clade; KW mag lineage. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-441 RA Tubio M.J., Costas C.J. and Naveira F.H.; RT "GYPSY69_AG, a member of the mag lineage of the Ty3/gypsy group RT of LTR retrotransposons in Anopheles gambiae."; RL Repbase Reports 4(6), 178-178 (2004). XX DR [1] (Consensus) XX CC GYPSY69-LTR is a long terminal repeat of GYPSY69_AG (its CC internal portion is deposited as GYPSY69-I_AG). 
XX SQ Sequence 441 BP; 185 A; 58 C; 113 G; 85 T; 0 other; tgtagagaat aatagtaata ataataacaa taataataat ttagaaatta aatttaaaat 60 aaaatgattg ttaccattaa agtgaagctt agaattgatc gcgagaggga aaggaaaaga 120 gagaaatagc gatagaacga gaagcgtatc gacaagtgag agactcgcgg cttgccgaca 180 agtgaacgtg agagactgca cggtgagatg ctacacggtg agaagagttc aacatagaaa 240 cgcgttttca agagaggagc aactcggtga gatcgcatgt gagagcggca ggatggaaac 300 gaaacgcgaa aaggcgatca gtttttggac agacgtagaa aagacaagtc gcaataaaaa 360 gagtgtccca gcagtaaaaa aaaaaaaaaa aaagtgttaa aatgtataac atcaatgatg 420 acaagggctg cggttacgac a 441 // ID BEL3-I_AG repbase; DNA; ANG; 5476 BP. XX AC . XX DT 12-MAR-2003 (Rel. 8.02, Created) DT 12-MAR-2003 (Rel. 8.02, Last updated, Version 1) XX DE BEL3-I_AG is an internal portion of the BEL3_AG LTR DE retrotransposon - a consensus sequence. XX KW BEL; LTR Retrotransposon; Transposable Element; 5-bp TSD; KW BEL3-I_AG; BEL3-LTR_AG; BEL3_AG; Bel clade; PHD zinc finger; KW integrase; reverse transcriptase. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-5476 RA Kapitonov V.V. and Jurka J.; RT "BEL3_AG, a family of Bel/Pao-like LTR retrotransposons from RT African malaria mosquito."; RL Repbase Reports 3(2), 12-12 (2003). XX DR [1] (Consensus) XX CC BEL3_AG is a young family of Bel/Pao-like LTR retrotransposons. CC BEL3-I_AG, an internal portion of BEL3_AG, is flanked by CC BEL3-LTR_AG CC LTRs. The BEL3-I_AG consensus sequence was reconstructed based on CC multiple alignment of 4 copies; they are less than 1% divergent CC from CC the consensus sequence. There is no tRNA-like PBS at the 3' end CC of CC BEL3-I_AG. CC The consensus sequence encodes one protein: a 1749-aa BEL3-I_AGp CC (positions 204-5450). 
BEL3-I_AGp is composed of the PDH domain CC (aa positions 12-59), reverse transcriptase (positions 740-940) CC and CC integrase (positions 1440-1610) domains. XX FH Key Location/Qualifiers FT CDS 204..5450 FT /product="BEL3-I_AGp" FT /translation="MQEESAVIADFTCAVCDKADSVDSLLQCDFCDKWYHY FT ECAHVDKTVETRAWWCAECEVKTKLASGKKDDEIARLKREMEALKATTEKA FT LALIREKDAEVARLSKSSERTSFSPGESLPCSTAKRISVNEEGDLSQSQIA FT ARQAVRYELPSFNGNPEEWPIFLSTFRRSSRTFGFTEDENILRLQGALRGK FT ALRTVQGRLRHADNLEEILSALEKSYGRPDVLVNTLLEQIRESPPIKSERL FT DSFIEYGDLVAEICSTIKASGTSDRLYDAALLQELVDRMPAYLRWSWGMHS FT QELKSVTMSEFGAWIQKATDGAMAVTPPQLKKKTTTRQVHAQHVETHQPQP FT RRHRECALCNSDTCGTIAECRVFNRLSVADRWDKVRTLKLCKRCLGKHYGP FT CSKRDDCGVQGCVAKHHRKLHRVTSEERVEINHHGTRSDGTLLRYVPVKLH FT GESGPICTHALLDEGSTVTLMEQELAGQLGVSGVLDPLCLQYSAGERRDER FT DSERVAVQVSSAEENASAFSMADVRTVSRLSLPIQSVDVNELKRKYKHLEA FT IPAASYEAVSPRLLIGIDHYRLTRPLKTIEGQPGQPTATKTRLGWLIFGKC FT TDNANDTSIVQPESSYHVCDCQGETSRADRMMAAYFEVEGYGPAKEPLLSK FT EDQRAMSILQNNTKHVDGRYTTGLLWRSDNVFMPENRQMALSRMECLERKM FT SRDTSLAEKINAILEDYLEKGYARPIRADELKTFYPRKWYLPVFPVTNPHK FT PNKVRLVWDAAAEVRGISLNKKLLTGPDLLTPLQAVLFRFREYRVAVAADI FT REMYHQVRICDDDVHSQRFLWRWGNTNAEPQEFVMLRMTFGAACSPSTAQF FT VKNENAEKYRSLYPRAVRCIHEEHYVDDMLTSVETEPEAIELAYQVSLIHN FT NAGFSLHNWLSNSIRVVTAVKGTESTLKEMDFEPCLKPEKVLGMWWDTTTD FT SFGFKLSRVRHLELARKDKPPSKRQMLRTLMSIYDPLGLIAGVLFYLKVLL FT QEVWRLHLGWDDEVPEEIQHKWDAWMERLPELESFIIPRCYRQLASLTESS FT LQLHVFVDAGADGYAAVAYFRFECHGRIEVSLVGSKAKVAPLKYLSVPRLE FT LQAAVMGCRIASSITSAHRETISGSYFWTDSTDVIDWINADHRKYSIFVAH FT RVAEVLDTTNVDDWRWLPTKLNVADEATKWTNLQHHLASERWFSGPEFLQL FT PEAEWNIPRRVPSETSEEVRKKDRLKLVGIHIARPIFIDYERFSRWTRLVR FT TMAYVCRYVNIITKTKSPSTGPLNRDEIQRAETVILRDVQRNAFTDEYAIL FT WKARENSTTPSWKSPIPRSSLLFKRSPYMDEDGLLRLSGRIDRCRYVDPGR FT KRPILLPRRHRVSELIVDDVHRRYKHGSQETVVNEVRQRFDIPALRSVCRH FT VRLQCRTCTLLYAKPASPEMCELPAARLAAFSRPFSYTGIDYFGPMVIVNG FT RKTEKRWGVLFTCLTVRAVHIELVQSLSTSDCLMAVRSFMARRGTPIEIVS FT DRGTNFVGADRELKEAAERVDSAILNEFGSPDPVWKFNPPAAPHFGGSWER FT MIQSVKRMLSRTLTERHPTEAVLSAALIEVENMLNSRPLTHVPVDGEDEEP FT 
LTPNHFLLGSSAGMKPLVKPDDSPAGLKQNWRAVQAKMNELWKKWIKTYLP FT TLVRRTKWFESCKPIETGDVVLIVDENSPRNCWPRGRVERVVPSKDGVIRR FT VVIKTAKGTMLERPVVKLVSLNVAPRVIV" XX SQ Sequence 5476 BP; 1433 A; 1269 C; 1551 G; 1223 T; 0 other; ttctaaaaat tcagctaaaa ttctatttaa aaactcagtc ttataattct tctaaattcc 60 tctcacacga gctgtgcaaa ttcggctcgt tctctacttg ctctattata ccacgcggta 120 agcgctgaga acaaaggact gttcggtaag gtacacgcac atacaaatac acagccaacg 180 catccgaatc gtgtagtgca aaaatgcaag aggaaagtgc agtgatcgcg gactttacgt 240 gtgcggtgtg cgataaggcg gattcggtag attcgttatt gcagtgcgat ttttgcgata 300 aatggtacca ttacgagtgt gcgcacgtgg ataaaacggt agagacgagg gcatggtggt 360 gtgcggagtg cgaagtgaag accaaactag cgagcggcaa gaaagacgac gagatagcgc 420 gcttgaagag agagatggag gctcttaaag ccacgacgga gaaagcgtta gctttgatac 480 gggagaagga tgcggaagta gctcggctca gtaagagcag cgagcgaacc tcgttttcgc 540 ccggggaatc gcttccttgc tcaacggcta aacggatttc ggtcaacgaa gaaggtgacc 600 tgagccaaag ccaaatagca gcccgtcaag cagtgcgcta tgaactccct tcattcaatg 660 ggaatcccga ggaatggccg atatttctgt ctacctttcg aagatcgtct cgtacgtttg 720 ggtttaccga agacgaaaat atccttcgat tgcaaggagc gttacgggga aaagcactac 780 gaacggtgca aggccgtctc cggcacgctg acaatctgga ggaaatattg agcgcgcttg 840 aaaaatcata cgggcgaccg gatgtgttag tgaatacgtt gctcgaacaa attcgcgaat 900 caccgccgat taagtccgag cggctcgata gttttatcga atacggcgat ttagttgctg 960 aaatatgttc aactataaag gcaagcggaa cttctgacag attgtacgat gcagcgctgc 1020 ttcaagagct ggtggaccgt atgccggcat atctgcgctg gagctggggc atgcatagtc 1080 aggagctgaa aagtgtgacg atgagcgagt tcggcgcctg gattcagaag gcgaccgatg 1140 gagcaatggc ggtaactcct ccgcagctga agaagaagac gacaacgcga caagtgcatg 1200 cgcaacacgt ggagacgcat cagccccagc ccaggaggca tcgagagtgc gcgttgtgta 1260 acagtgatac gtgcggtacg atcgcggagt gtcgggtgtt taaccggttg agcgttgctg 1320 acaggtggga caaggtgcgc accttgaagt tgtgcaaacg atgtttgggc aagcactacg 1380 gcccgtgctc gaagcgtgac gactgtggtg tccaaggatg cgtcgctaag catcaccgga 1440 agttgcatcg tgtcaccagc gaggaacgcg tggagataaa tcatcatgga acgcgctccg 1500 atggtacatt gctgcgttac gtgccggtaa 
agctgcacgg tgagagtggt ccaatttgca 1560 cgcatgcgct gttagacgag gggtcgacgg tgacgttgat ggagcaggag ctcgccgggc 1620 aacttggggt tagcggcgtt cttgacccgt tgtgtttgca gtacagtgcg ggagagcgac 1680 gcgatgaacg tgattcggaa agggtagcgg tgcaagtctc cagtgctgaa gaaaatgcat 1740 ccgcattttc gatggccgac gtgcgtacag tcagccggtt atcgctccct atccaatcgg 1800 tcgatgtgaa cgagttgaag cggaaatata agcatttgga ggcgattcca gctgcctcgt 1860 atgaggctgt ttctcctcgt ttactaatcg gtatcgacca ttacagattg accagacctt 1920 tgaaaactat agaaggacag ccaggacaac ctacagctac gaagacgcgt ttgggatggc 1980 tcatttttgg caaatgcacg gataacgcta acgacacatc cattgtgcag ccggagtcta 2040 gctaccacgt atgcgattgc caaggagaga cctcacgggc agatcgtatg atggcagcgt 2100 acttcgaagt ggaaggttat ggtcctgcga aggagccttt gttgtcgaag gaggatcaac 2160 gtgcgatgtc gattctgcaa aacaacacaa aacacgtcga cgggcggtac accacgggct 2220 tactctggcg gagcgacaac gttttcatgc cggaaaaccg tcaaatggct ctatctagga 2280 tggagtgtct ggagcgtaag atgagccgag atacgagcct tgcggagaaa attaacgcga 2340 tactagagga ctacttggaa aagggctatg ccagaccgat aagagcggat gagctgaaaa 2400 ccttctaccc taggaaatgg tatcttccag tgttcccagt gacaaatcct cacaagccta 2460 ataaagttag gttagtatgg gatgcggcag ctgaagttag aggtatctcg ctgaacaaga 2520 agctgctgac tggccctgac ctgttgacgc cgctgcaagc cgtgctattc cgtttccgtg 2580 aatatcgagt cgcggtggca gctgacatcc gggaaatgta ccaccaggta cgcatttgcg 2640 atgatgacgt ccacagtcaa cggttcctat ggagatgggg aaacacaaat gcagagccgc 2700 aggagtttgt catgctgagg atgaccttcg gtgcagcatg ttctccaagt acggcgcaat 2760 tcgtaaagaa cgagaatgca gagaaatatc gttccctgta cccgcgtgca gttcgctgta 2820 tccatgaaga acattacgtg gatgacatgc ttacaagtgt ggaaacggaa ccagaagcga 2880 ttgagctggc gtatcaggtg agcttaatac acaataatgc tggattttct ctccacaatt 2940 ggctctcaaa tagcatcaga gtcgtgacag cggtaaaagg cactgagtca accctcaagg 3000 aaatggattt cgaaccgtgt ctaaagccag agaaggtgct gggaatgtgg tgggacacca 3060 caacagatag tttcggcttt aaactatccc gtgtaagaca cctggagctg gcacgcaagg 3120 acaaaccacc gtccaagagg cagatgctgc ggactttgat gtcgatctac gacccgttag 3180 gtttgatcgc aggcgtgctt ttttatttga aagtacttct 
tcaggaagtc tggcgcctac 3240 accttggctg ggatgacgaa gttccggaag agatccagca taaatgggac gcctggatgg 3300 aacgactgcc agaattggaa agtttcatca taccacgttg ctaccgacag ctggcgtcgc 3360 tcaccgaatc atctctacag ctacacgtgt tcgttgatgc gggtgcagat ggttacgcag 3420 cggtcgcgta cttccgtttt gagtgccatg gacgtatcga ggtgtcgtta gttggatcca 3480 aagctaaagt ggcgcctcta aagtatcttt ccgtgccccg cttagagctg caggctgctg 3540 tgatgggttg cagaatagcc tcgtctataa ctagtgctca tcgagaaact attagtggaa 3600 gctacttctg gacagactcg actgatgtta tagactggat aaacgcagac caccgtaagt 3660 actcgatttt cgtcgcacat agggttgctg aggtgctgga cacgacgaac gtcgacgatt 3720 ggcgatggct tccaactaaa cttaatgtgg cggacgaagc gaccaaatgg accaacctgc 3780 agcatcatct cgcctccgaa cgatggttta gtgggcccga gttcctgcaa ctacccgaag 3840 cagaatggaa catacctcgt cgagtaccat cggaaacgtc cgaggaggtg cggaaaaagg 3900 ataggctgaa gctggtcggc atccacatag cgcgtccgat tttcatcgac tacgagagat 3960 tttcccgatg gacgcggctg gtcaggacga tggcttacgt gtgtcggtat gtaaacatta 4020 ttaccaaaac taaatctcca tcgacaggtc cgcttaaccg cgacgaaatc cagcgggctg 4080 aaacggtaat tctgcgagac gttcagagga acgcttttac cgatgaatat gccatactct 4140 ggaaggcgag ggaaaactca acaacaccgt cgtggaaaag tccgattccc agaagtagtt 4200 tgttgttcaa acgaagcccc tatatggacg aggacggctt gctgagattg agcggaagga 4260 tcgatcgttg ccgttacgtc gaccccggga ggaagcgacc tattctgcta ccgagacgac 4320 atcgtgtttc cgagttaatc gttgacgatg tccaccgtcg ttataagcac ggcagccaag 4380 aaactgtagt gaacgaagtg agacaacgct ttgatatacc ggctctccga tctgtgtgta 4440 gacacgtacg cctgcaatgc cgaacatgca cattgttata cgcaaagcca gcgtccccgg 4500 aaatgtgcga attacctgct gctcgtcttg ctgcattctc caggcccttt tcatatacgg 4560 gcattgatta tttcggtccg atggtaatcg tgaacggccg aaagacggaa aaacggtggg 4620 gtgttttgtt cacgtgcttg acggttcgag ccgtccacat cgagcttgtg caatccctgt 4680 ctacgagtga ctgcttgatg gcagttcgga gcttcatggc acgccgcgga acaccaatcg 4740 aaatagtctc cgatcgtgga acgaatttcg tgggcgccga tcgtgaatta aaggaagctg 4800 cggaacgtgt tgattcggcg attttaaatg aattcggatc gcccgatccg gtgtggaagt 4860 ttaacccccc cgcagcccca cactttggag gatcatggga aaggatgatc 
caatcggtca 4920 agagaatgct gtcgcgcacc ctcacggaga ggcatcctac ggaagcggtc ctgtcggcag 4980 cattgatcga agtcgagaac atgctcaatt ctcggccact tacacacgtt ccagtcgacg 5040 gtgaggatga ggaaccgtta actcccaacc attttttgtt aggttcttcc gcaggaatga 5100 agcccttggt gaagcctgat gattcgccag caggcttgaa acaaaactgg agagctgtac 5160 aggctaaaat gaacgagctc tggaagaagt ggatcaaaac ctacctcccc accctggtta 5220 gaaggacgaa atggttcgag tcgtgcaagc ctatcgagac cggagatgtt gtcttaatcg 5280 tggacgaaaa cagcccgcga aactgttggc cgagaggaag ggtggaacgc gtagtcccat 5340 ccaaagacgg agtcattcgt cgggtcgtca tcaaaacagc gaaaggaacg atgctggaga 5400 ggccagtggt gaagctggta tcgctgaacg tcgcgccgag ggttatcgtt tgacgacgtc 5460 gaaacgcctg gtggac 5476 // ID Ag-Jen-1 repbase; DNA; ANG; 4320 BP. XX AC . XX DT 29-OCT-2010 (Rel. 15.1, Created) DT 29-OCT-2010 (Rel. 15.1, Last updated, Version 2) XX DE A Jockey clade non-LTR retrotransposon family from Anopheles DE gambiae. XX KW Jockey; Non-LTR Retrotransposon; Transposable Element; Ag-Jen-1. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-4320 RA Biedler J. and Tu Z.; RT "Non-LTR Retrotransposons in the African Malaria Mosquito, RT Anopheles gambiae: Unprecedented Diversity and Evidence of Recent RT Activity."; RL Molecular Biology and Evolution 20(11), 1811-1825 (2003). XX RN [2] RP 1-4320 RA Kojima K.K. and Jurka J.; RT "Jockey clade non-LTR retrotransposons from Anopheles gambiae."; RL Direct Submission to Repbase Update (24-SEP-2010). XX DR [2] (Consensus) XX CC [2] Consensus update. This consensus is generated from 4 CC sequences with >99% identity. 
XX FH Key Location/Qualifiers FT CDS 131..1300 FT /product="Ag-Jen-1_1p" FT /translation="MRGRGRHKGKRLRPIPRNRNRSYDDSRVRSDRSENSY FT ALLSECEDTPAKLQDAVPKTTTTKNSTSATVTNKVPPIVVKSSSVHFVHKC FT VIEAGVVRYSTKQTNNGTNVQVTNTSDYRSVIKHLQNSNVAYHTYQLDEEK FT STKIVLYGLHDLPTDEVMEILKEENLQPTSIKKLQIRQKRYDDQAVYLLHF FT PAGTISMNLLRTIPALDHCVVKWSYFSKRAGPMQCKRCQLYGHGANNCNRV FT ARCNKCADDHNSSVCPLTAGSSSSDGKVPEHKLKCANCGGNHTATFSGCPK FT RPAANLPKEPKRTTSFNMEKNSFPALPQTKTIQGWKQLEQNPVKFVNFRQN FT ETASGELFNSSEIIPIVGEVFQKLQKCKTKQDQITVIFEVVAKFCFP" FT CDS 1303..3984 FT /product="Ag-Jen-1_2p" FT /note="apurinic-like endonuclease and reverse FT transcriptase." FT /translation="MDCPDNFNISYWNANGIANKKMDLIYYLSRYDIDAVL FT IVETFLKPVHAFSIPNYIVYRVDRLSGPKGGVLIAIKCNFRHQLLKHNKLG FT IIETLSVKVLTRLGEFTLIAAYHPGKNTNLVTFQQDIKKLSARNDNYFICG FT DFNARHKSWNCVSTNAAGKLLFAEAQKGKFSIHFPDKPTYIPIDPTRNTST FT LDLMLTKNLYTISKLRTITALTSDHLPVTFSIKHSSANFVSNKGIYNYKKA FT DWNIFKNIINNSIDLTTIHPNDIINATQIDDMLHNLTSLITTAKQSAIPKA FT IPGKYNIIIPDYIIALIKLRNKYRKKWQRHRRNITFRNYYHALKTEIEQQM FT SLARNRAWSDTLKSINTQQNNTNNLFKFVKIVKNKQHTIQTLKLGNNIFLT FT ATEKCNALKTHFENSHYLTYNTNSPMERKVTKNNRIFFARNQNHSLDPGYL FT VKPKLLMNIIKNLKNKKSAGCDEINNTEIKMLPKKAVLYLKIIFDSCLKLG FT YFPSQWKKAKIVAIPKPGKDHSNPGNYRPISLLSCIGKLFEKIIARNIRSH FT TDINNIIPQSQFGFQPEKSTTHQIHRLKNYIIENRNRKRSTGLVLLDTEKA FT FDSIWHDGLLYKLIQYNFPTYIIKIIQSFLSNRKSSVHIGHTKSNFFKPIA FT GVPQGSILSPILFNIFISDIPSTNNCTNYLYADDLAIAATARSPSSIIKAL FT NGTLKAYAKFCNRWKLKTNANKSEAIFFTRNTSLRKLPSNEVSFMQGNITW FT KDQVKYLGVHLEKRMTFKTHIEKQLEKVDKLLKILYPFLNRKSKLNLQNKK FT LVYTAYIRPILTYAAPAWISCAYSHRKKLQVRQNKILKMIMNLPPWHNTND FT LHKRSNLVKMEEFIGNLNVNYANKCRQSSISIINQLVEQRI" XX SQ Sequence 4320 BP; 1636 A; 850 C; 682 G; 1151 T; 1 other; ggcagtcgca atccagctcc gatcgagagc agacgtaaat ttgctgatct ctaatctgat 60 aagcaatcga tagcgataga atcacaaaca aatactaaca agcagcaggt gtcccgttcc 120 taccaacaag atgaggggga gaggaagaca caaaggtaaa aggttaagac caataccaag 180 aaacagaaat cgttcatacg acgattcacg cgttagatcg gatcgttccg aaaacagcta 240 cgcgctacta agcgagtgtg aagatacacc agcaaagctt caggacgctg 
tcccgaaaac 300 cactactaca aaaaactcca ccagtgcgac ggtaacgaac aaagtgccac ctattgtcgt 360 taaaagttcg tccgtacatt ttgtgcataa gtgtgtaata gaagcaggtg tagtgaggta 420 cagcaccaaa cagacaaaca acgggaccaa tgttcaggtc actaacacaa gcgattaccg 480 aagcgtgata aaacacctac aaaacagcaa tgttgcatat catacttatc aactagatga 540 agaaaaatct acaaaaatcg ttttgtatgg cctacacgat ctcccaaccg atgaagtgat 600 ggagatctta aaagaagaga atttgcaacc aaccagtata aaaaagttac agatcaggca 660 aaaacgatac gatgatcaag cagtatacct actgcacttt ccggcgggta caatatccat 720 gaatcttctc agaaccatcc ctgctctaga tcactgtgta gtgaaatgga gctatttttc 780 gaaaagagcc ggaccgatgc agtgtaaacg ttgccaactg tacgggcacg gtgcgaataa 840 ttgtaatcgc gtcgctagat gcaacaaatg tgctgacgat cataattcgt cagtgtgccc 900 actcacagct gggtcatctt caagtgatgg caaagtgcca gagcacaaac tgaaatgcgc 960 gaattgcggc ggtaaccata cagctacttt cagcggatgt cctaaacgtc ccgctgccaa 1020 tctacctaaa gaaccgaaac gtacaacttc tttcaacatg gaaaaaaata gctttcctgc 1080 tctaccacag acaaaaacaa ttcagggatg gaaacaactt gaacaaaacc cagtcaaatt 1140 tgtaaatttc agacaaaatg aaaccgcgtc aggcgaactt tttaattctt cggaaataat 1200 tccaattgta ggcgaagttt ttcagaagct acaaaagtgc aaaacaaagc aggatcaaat 1260 aacagtgatc tttgaggtag ttgctaaatt ttgtttcccg tgatggattg cccagacaat 1320 tttaatattt cctactggaa tgctaatggc attgccaaca aaaagatgga tttaatctac 1380 tatttaagcc gatatgatat tgatgctgta ttgattgttg agacatttct gaaaccagtg 1440 cacgcttttt ctattccgaa ttatatagta tatagagttg atagattatc gggaccaaaa 1500 ggtggtgttc taatagcgat caaatgcaat tttagacacc aattgctaaa acataataaa 1560 ctagggataa ttgaaacatt aagtgttaaa gtgctcacta gactaggaga atttacttta 1620 atagcagcat atcacccagg taaaaatacc aacttagtta cctttcaaca agatatcaag 1680 aaactttcag cgcgaaacga caactatttc atctgtggcg attttaatgc gcgacataaa 1740 tcatggaatt gtgtctctac aaatgcagcc ggaaaactcc tttttgctga agcacaaaaa 1800 ggtaaatttt ctatacattt cccagataaa ccaacataca ttcctatcga ccccacaaga 1860 aatacttcaa cgttggactt aatgctaaca aaaaatctgt atacaatatc gaaattacga 1920 actatcactg ctctcacttc ggatcatttg ccagttacat tcagtataaa acattcatca 1980 
gctaattttg tttccaacaa aggtatttat aattacaaaa aagcagactg gaatattttc 2040 aaaaatatta taaataatag tatcgacttg accacaatac acccaaatga cataattaac 2100 gctacacaaa tcgatgatat gctacataac ttaacctcac ttattacaac agctaaacaa 2160 agtgccattc cgaaagcaat tcctggaaaa tataacatta tcattccaga ttacattatt 2220 gctctaatta aactaagaaa taaatacagg aagaaatggc aaagacacag acgtaatatt 2280 acctttcgga attattatca tgcacttaaa acagagatag agcagcaaat gtcactagcc 2340 cgtaacagag cttggtctga tacattaaaa agtataaata cacaacaaaa taatacaaat 2400 aacctattca aatttgtaaa aatagttaaa aacaagcaac atactattca aacgctaaag 2460 ctaggaaaca acatcttcct aacagccact gaaaaatgta acgccctcaa aacacatttt 2520 gaaaactctc attacttaac ttacaacaca aatagtccaa tggaaaggaa agttacaaaa 2580 aacaatagaa ttttctttgc gagaaaccaa aatcactcct tggatccagg ttacttggtt 2640 aaaccaaaat tactcatgaa cataatcaaa aatcttaaaa ataaaaaatc agcaggctgt 2700 gacgaaatta acaacacaga gattaaaatg cttcctaaaa aggcagtact ctacctcaaa 2760 ataatatttg atagttgtct taaactagga tactttcctt cacaatggaa aaaagcaaaa 2820 atagtggcta ttccaaaacc cggaaaagat cattctaatc ctggaaacta cagaccaata 2880 tcccttctga gttgcatagg taaacttttt gagaaaatta ttgcacgcaa tatacgttct 2940 catacagaca ttaacaatat catcccacag tcacaatttg gttttcaacc cgaaaaatca 3000 acaactcatc agatacatag actcaaaaat tatattatag aaaatcggaa tagaaaaaga 3060 tctactggtt tagttttgtt agatactgaa aaagctttcg attccatttg gcatgacggc 3120 ctgttatata aattaattca atataatttt cctacgtata ttataaaaat aattcagtcc 3180 tttctcagca atcgaaaaag ttcagttcat attgggcaca ctaaatcaaa tttctttaaa 3240 cccatagcgg gagtaccaca aggtagtata ttatcgccaa tactatttaa tatatttatt 3300 tctgacattc cctctacaaa caactgtaca aattatctat atgcagacga cctagcaatc 3360 gcggctacgg ccagaagtcc gagctcaata atcaaagcct taaatggcac attaaaagca 3420 tacgccaagt tctgcaatag atggaagctt aaaacaaacg caaataaatc tgaggcaatc 3480 ttcttcacga gaaacactag tttacgcaaa ttacctagta atgaagttag ctttatgcaa 3540 ggaaacatta catggaaaga tcaagttaaa taccttggag tacatttaga aaaacgcatg 3600 acatttaaaa cccacataga aaaacaatta gaaaaagtag acaaattact aaaaatttta 3660 tatcctttcc 
taaataggaa atcaaaacta aatttgcaaa acaaaaaatt agtttacacg 3720 gcttatatac gcccgattct gacgtacgca gcgccggcat ggattagttg tgcgtactct 3780 catagaaaaa aacttcaagt tagacaaaac aaaatattaa aaatgatcat gaacttacca 3840 ccttggcata acacaaacga tttacacaaa agaagtaatt tagtaaaaat ggaagaattt 3900 ataggaaatt taaatgtaaa ttacgcaaac aaatgtcgcc agagctcaat tagcataatt 3960 aatcaactag ttgaacaacg tatctaagca taaaatatga gtttataatt ttctgtacaa 4020 agctctcctc ttctctatct agttaaggat taggtctagc aaatgtaggt taagatatag 4080 gttagatata tatgtkttta ggttaagtct aaaattgcat gggttttttt ttgcactttt 4140 cccatatact cagatttaaa atcagcaaac aacatatatt ctactcagct atgctataaa 4200 aattcccgat tattgaggag agcacattgc gtacccaatg ttaaagaacg ttcgattcta 4260 tatgtatttt tttttcttta tatccacaat aaacatctct actactacta ctactactca 4320 // ID BEL16-I_AG repbase; DNA; ANG; 5510 BP. XX AC . XX DT 03-APR-2003 (Rel. 8.03, Created) DT 18-JUL-2005 (Rel. 10.08, Last updated, Version 2) XX DE BEL16-I_AG is an internal portion of the BEL16_AG LTR DE retrotransposon - a consensus sequence. XX KW BEL; LTR Retrotransposon; Transposable Element; 5-bp TSD; KW BEL16-I_AG; BEL16-LTR_AG; BEL16_AG; Bel clade; RING Zn-finger; KW integrase; peptidase; reverse transcriptase. XX NM BEL16-I_AG. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-5510 RA Kapitonov V.V., Pavlicek A., Drazkiewicz A. and Jurka J.; RT "BEL16_AG, a family of Bel/Pao-like LTR retrotransposons from RT African malaria mosquito."; RL Repbase Reports 3(3), 39-39 (2003). XX DR [1] (Consensus) XX CC BEL16_AG is a young family of Bel/Pao-like LTR retrotransposons. CC BEL16-I_AG, an internal portion of BEL16_AG is flanked by CC BEL16-LTR_AG CC LTRs. The BEL16-I_AG consensus sequence was reconstructed based CC on CC multiple alignment of 18 copies; they are ~1% divergent from CC the consensus sequence. 
CC The consensus sequence encodes a 1736-aa BEL16_AGp Bel-like CC protein CC (pos. 231-5438). CC BEL16_AGp is composed of the peptidase A16 (pos. 145-304), RING CC Zn-finger (pos. 335-400), reverse transcriptase and CC integrase domains. XX FH Key Location/Qualifiers FT CDS 231..5438 FT /product="BEL16_AGp" FT /translation="MSVSEDFAGFSDASGSNNTMQYTQAEDLDMIILKRER FT DRVVQSLTRIDTFLAQYKESDFPELTPRLDLLNERWREFQALARNIGAKDY FT SEDNDVLYGEIEDKVMLLKGKLLGKLRAGAEIPSVKQERVCDDYNSVRLPQ FT LTLPQFSGKYDEWLPYHDMFVVTVHENEKLSQVEKMLYLKGSLKGEALKVV FT DTLQACNSNYDVAWDALKKRYSNEYILKKRHVNAMLQWPRMKVMNTVGIHG FT LIDCFERNLQILKQLGEVTEQWGCLIIQIIISKLDESTQQKWERHVEESEQ FT KTVTDLLNFLRTQTRIMDAFAVDRPMAAGKSTSERRVASNVAAEAKCAKCD FT GSHMVENCDSFRSLTLPRRREVVEAKKLCLNCLRQGHFQAKCWSRARCNIC FT NRKHHSLLHGENVSETIGPSVREELPSTSGTQNVVVNASSNRECTSVLLST FT AIVSVRACGNKWLSARALIDSGSQVNLMTKGLAARLKLPQYESKTALSGVG FT HSKVDITTSVTTVIRSKSCNYQERMQFLVLPRISSYRPVTGNQISRQNLPM FT NFVLADPNFDSDAEVDLLLGSEFYATFLKPDNRGKIRLELPALPTFISTVF FT GWVATGKVPLASEGSNYVTCGTCTRLDDLIERFWIIEEIREPLQHSQEERD FT CEAHFVQTHQRDSEGRYIVKLPFKCDLQNQLGPSSAIARKRFLQLERRFNR FT DPWLKQKYTAVINDYIDKGILVKVAANPDSEEAHGHYFLPHHPVIKASSTS FT TKVRPVFDGSASTDGGKSLNDLLMTGPVIQENLLALLLKFRMRSVALVADI FT KQMYLQVKVHPDDTRFQRVLWRGSSVESIEVYELQRVTFGLAPSSFLAIRV FT LQQLAIDEGENFPLARQALLEDFYVDDYIGGASSEEEAVRLQAELTLLLRK FT GGFHLTKWNSNKPDVLSSVSAEDRATSNVKMFEVPEEPIKTLGIAWLPESD FT QLYIDSNIQMNNESWSRRKVYSLVARIYDPLGLVAPVTSWAKINMQSLWLA FT TDDWDEEIPAVMQERWYAFQSQLGLLKEVKFSRHAVVHNPVAVQLHCFSDA FT SEAAYGACVYVRTIGSSGEVVVELLAAKSRPAPLKRVSLARLELCGALLAA FT RLQKVVRQALRIPDVETFMWTDATIVLHWIRAPSHSWATYVANRVSEIQEL FT THGYKWMHVKGVDNPADIVSRGAMPNELLASKLWFHGPGWLQLSEEEWKKN FT ASGVLAIPEEELLERRKSSLVAAVSSESDDWCDRFSNYDKLLRITAYCMRF FT IRCCQRKLDPKHKGVLLVSELAEAKIRLVKREQRIYFAAEIKELSAGQTVR FT PKSSLKTLGAFLDGDGLLRVGGRLHRAKAMQVCSRFPLALPKKSRFTRLMA FT EYYHRLALHGGPTATLSALRREFWPIQGRSLVNSVCRGCLVCFRMNPALVQ FT QPPGQLPVSRAMPARPFSIVGVDFCGPIYLKPVHRRAAAEKAYISIFVCFS FT VKAVHIELVESLSTHAFLAAFRRFVARRGLPSEVYSDNGLNFQGASKVIDD FT 
FYTLMNSDSAVEDISRYAVGAGVKWHFIPPHAPNFGGLWEAAVKAAKRVLL FT KVVGDRQLAFGEMSTVLAQVEAQLNSRPLTPLSEDPEELDVLTPGHFLIGA FT PMNALPEPDVGDVPINRLKRYEELRRVVQNHWARWRREYFSELHNEHQRGK FT AVVELKVGQMVLLKEDGKTPHHWPMGRIAEVFPGPDGVVRVVSIRTRNGLY FT KRPANRISLLPFERVN" XX SQ Sequence 5510 BP; 1469 A; 1034 C; 1539 G; 1468 T; 0 other; ttggtgccgt gaccaggatg gtctaatatt tgcggtttgg tttgactggt taaagtgaag 60 aactttgtgt gatttctttt ggcaaaaatc gcgtaacgcg tgtgaaatct gaaagttccg 120 tatcgagcgt tggtggattt ctgaataaga aaaaatcgcg aaagcgtgag aaatcattgt 180 gctagtgggt gttctttgcg tgtgtatgca ttttgtgtcg cgtattagaa atgtcggtta 240 gtgaagattt tgccggtttt tctgacgcgt cgggttcgaa caatacaatg caatacacac 300 aagcagaaga tctggatatg attatcctaa agcgcgagcg cgaccgagtc gtgcaatcgt 360 tgacaagaat cgacacgttt ttggcgcagt acaaagagag tgatttccca gaattgacgc 420 ctcgtctcga tctattgaac gagcgttgga gggagtttca ggctttggct cgaaatattg 480 gcgcaaaaga ttacagtgaa gataacgatg tgctttatgg tgagattgaa gacaaagtca 540 tgctattgaa aggaaaatta ttgggaaagc tgcgggctgg agcggagata ccgagtgtga 600 aacaggaacg cgtgtgcgat gattataata gtgttcgttt gcctcagttg acgcttcctc 660 aattttccgg aaaatatgac gaatggctcc cgtatcacga catgtttgtt gttactgttc 720 atgagaacga gaaattgtcg caggttgaaa agatgcttta tttgaagggt tctctgaaag 780 gtgaagcgct gaaggtggtg gatactttgc aagcctgcaa ttcgaattat gacgtagctt 840 gggatgcttt gaaaaagcga tattctaacg aatatatttt gaaaaagcgt catgttaacg 900 cgatgctaca atggccacgg atgaaagtca tgaacacggt gggtattcat gggcttatcg 960 attgtttcga aaggaactta caaattctaa agcagttagg tgaagtgacc gaacagtggg 1020 gatgtttgat tattcagata atcatttcaa agttggacga aagcacccaa caaaagtggg 1080 aaaggcacgt tgaagaaagt gagcaaaaaa cggtgacgga tttgttaaac tttttgcgca 1140 cacagacgcg cataatggac gcgtttgcag tggataggcc aatggcggcg ggcaaatcta 1200 caagtgaacg tcgtgttgcg tctaatgtgg ctgcagaagc aaagtgcgca aaatgtgatg 1260 gatcgcacat ggtggaaaat tgtgactcgt ttcggagttt aacgttgccg cgtcgtcgtg 1320 aagtggttga agctaagaag ctttgcctta attgtttgag gcaagggcat tttcaagcca 1380 agtgttggtc acgtgcgcga tgcaatattt gcaatcgtaa acatcattcc cttctacacg 1440 
gtgaaaatgt aagtgaaacg atcggtccgt cggttcgtga agaacttcca tctaccagtg 1500 gcacgcaaaa tgtggtggtg aacgcgtcat cgaacagaga gtgcacttct gtgttattat 1560 caacggcgat agtgagtgtg cgtgcgtgtg gtaacaaatg gttgtcagca agagcattga 1620 tcgatagtgg gtctcaggtg aacctgatga caaagggctt ggctgcgcgg cttaagttac 1680 cgcagtacga aagtaaaacg gcattatcgg gagttggaca ttcgaaagtg gacataacga 1740 cgtcggtaac gactgtcata cgttccaaaa gttgtaacta tcaagaacgt atgcagtttc 1800 tagtgttgcc gagaatttct agctacaggc cggtaactgg aaatcaaatt agcaggcaga 1860 atctcccgat gaattttgtg ctcgcggatc ctaatttcga tagtgatgct gaagtggatt 1920 tattgttggg ctccgaattt tacgcgactt ttttgaaacc ggacaaccgt ggtaaaatta 1980 ggctcgagct accagcgctc ccaacattta ttagcactgt ttttggatgg gttgcaactg 2040 ggaaagttcc gctggcttct gaaggtagca attatgttac ttgtggcacg tgtactaggt 2100 tggacgattt aattgagcgt ttttggatta ttgaagaaat acgtgagccg cttcaacata 2160 gtcaggaaga aagggattgc gaagcacatt ttgtgcaaac tcatcagcgg gatagtgaag 2220 ggagatatat agtgaagctg ccatttaagt gtgacttaca gaaccaattg ggaccgtcca 2280 gtgcgatagc gagaaaacgg ttcttgcagt tagaacgacg tttcaatcgg gacccatggt 2340 tgaaacaaaa gtatacggcg gttatcaacg actacatcga caaagggatt ttagtcaagg 2400 tggctgcgaa ccctgattct gaggaagcgc atggtcatta tttcttaccg catcatccgg 2460 taatcaaggc gtccagtacc agtacaaagg tacgacctgt gtttgatgga tcggcctcta 2520 ccgacggtgg taagtctctt aatgatttgt taatgactgg tcctgtgatt caggagaact 2580 tgttggcgtt gttgctgaaa tttcggatga ggagcgtggc attagtggca gacataaaac 2640 agatgtatct gcaagtcaag gtgcatcccg acgacactcg ttttcaacgt gtattatggc 2700 gaggctcatc tgttgagtcc atcgaagtct acgagttgca gcgggtcaca tttggacttg 2760 ccccgtcttc ctttctggcc attcgagtac tgcagcaatt ggcaattgat gaaggggaaa 2820 actttccctt ggcgagacag gcgttgttag aggacttcta tgtcgatgac tacattggtg 2880 gtgcctctag cgaagaagaa gcagttaggt tgcaagctga gctgacgcta ttgttgagaa 2940 agggcggatt tcatctaact aaatggaatt ctaacaaacc agatgtttta tctagcgttt 3000 cagcggaaga cagagcaaca tccaacgtta aaatgtttga agttccagag gagccaataa 3060 aaactctagg tatcgcgtgg ctaccagaat cggaccaact gtacatagac tcgaacattc 3120 agatgaacaa 
cgagagctgg tcccgtagaa aggtttactc tttggtagca cgtatatacg 3180 accctttggg gctggtggct cctgtgacat cttgggccaa gataaatatg caatcgttgt 3240 ggttggcaac tgatgactgg gatgaagaaa taccggctgt catgcaagaa cgatggtatg 3300 cttttcaatc acaactcggg ttgctgaagg aggttaagtt ttcgcgccat gctgttgtgc 3360 ataatcctgt tgctgttcaa ctacattgct tttcggatgc atctgaagcg gcttatgggg 3420 catgcgtgta tgttagaacg attggtagca gcggggaagt ggtagttgag ttgcttgctg 3480 caaagtctcg tcctgcgccg ctgaaaagag tcagtttggc ccggttagaa ctttgtggag 3540 cattactggc agcaaggttg cagaaagtgg tacgccaagc gttgagaatt ccagacgtgg 3600 aaacctttat gtggactgat gcgacaatcg tgttgcattg gattcgagca ccatcccatt 3660 cttgggctac gtacgtagcg aatagggtat ccgaaattca ggaattgacg catggctaca 3720 aatggatgca cgtgaagggc gttgacaatc ctgccgatat tgtatcgcgt ggagctatgc 3780 cgaacgagct gttagcatcg aagctgtggt tccatggtcc cgggtggtta caactatcag 3840 aggaagaatg gaagaagaat gccagcggtg tgttggcaat tcccgaagag gagttattgg 3900 aacgaaggaa gagctcattg gtggccgcag taagtagcga gagcgatgat tggtgtgata 3960 ggttttccaa ctatgacaaa ttactgcgga tcactgcgta ttgtatgaga tttattcgtt 4020 gttgccaacg aaagctggat cctaaacaca aaggtgtttt gttggtgagc gagctagcag 4080 aggcgaaaat tcgactggtg aaaagagaac aacggatata ctttgcggct gagatcaagg 4140 agttgtctgc tggacaaacg gtacgtccca aatcatcact gaagacatta ggagcttttt 4200 tggacggtga tggtttgctc cgagttggtg gccgcttgca tcgcgctaaa gccatgcaag 4260 tttgtagcag atttccgttg gcgctaccca agaagtcacg atttactagg ctaatggcag 4320 aatattatca tcgattggca cttcatggtg ggccaactgc aacattgagc gcactcagga 4380 gagaattttg gccaattcaa ggacgatctt tggtcaatag tgtttgcaga ggctgtctgg 4440 tatgcttcag gatgaatccc gcgttagttc aacaaccacc aggacagcta cccgtgtcgc 4500 gtgctatgcc agctcgacca ttttcgatcg taggggttga tttctgtgga cccatttact 4560 tgaagccggt gcatcgccga gcagcagctg aaaaagcata tatttcaatc tttgtgtgtt 4620 tttcagtaaa ggctgttcac atcgagcttg tggagtctct atcaactcat gcatttctag 4680 cggcgtttcg tcggtttgtg gcaagacggg gtttgcccag cgaggtctat tccgacaacg 4740 gtctcaactt ccaaggagcg agtaaggtga tcgatgactt ctacacgttg atgaacagcg 4800 attcggcggt ggaggatata 
tcgaggtatg ctgttggcgc tggcgttaag tggcacttca 4860 tcccacccca tgcaccgaac tttggcggcc tttgggaggc agcggtaaaa gcggcaaaac 4920 gcgtcctact gaaggttgtt ggtgatcggc agctggcgtt tggggagatg tcgacggtac 4980 tggcacaagt ggaagctcaa ctcaacagca gaccgcttac accgttgtcg gaggatccgg 5040 aagaacttga tgtattgacg ccggggcatt ttctaatcgg ggctccgatg aacgctctac 5100 cggagcctga cgtgggtgat gtaccaatca atagattgaa gcggtatgag gaattgcgta 5160 gagtggtaca gaatcattgg gcgcgttggc gtagggaata ttttagcgaa ctacataacg 5220 aacatcaacg cggcaaggca gtagtagagc taaaggtagg acaaatggtc ctgttgaaag 5280 aggatgggaa gactcctcac cattggccaa tgggacggat tgctgaggta tttcctggcc 5340 cagatggcgt agtgagagtc gttagtatca ggactaggaa cggcttgtat aagaggccag 5400 cgaataggat tagtcttctt ccgtttgaga gagtgaatta gatatcataa agtcaggcat 5460 tttgtgaagt aatggaaaga ggtaaatttg gtaaatttag gtggccgcta 5510 // ID TC1N-1_AG repbase; DNA; ANG; 521 BP. XX AC . XX DT 13-DEC-2002 (Rel. 7.11, Created) DT 13-DEC-2002 (Rel. 7.11, Last updated, Version 1) XX DE TC1N-1_AG is a nonautonomous DNA transposon - a consensus DE sequence. XX KW Mariner/Tc1; DNA transposon; Transposable Element; Nonautonomous; KW TC1N-1_AG; mariner/Tc1 superfamily; nonautonomous DNA transposon. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-521 RA Kapitonov V.V. and Jurka J.; RT "TC1N-1_AG, a family of nonautonomous mariner/Tc1-like DNA RT transposons from African malaria mosquito."; RL Repbase Reports 2(11), 25-24 (2002). XX DR [1] (Consensus) XX CC There are several hundred copies of TC1N-1_AG in the genome. CC They are ~98% identical to the consensus sequence. CC TC1N-1_AG copies are flanked by the TA target site duplications. CC This element has 182-bp terminal inverted repeats. CC Classification: a nonautonomous Tc1-like DNA transposon. 
XX SQ Sequence 521 BP; 150 A; 103 C; 117 G; 151 T; 0 other; cagggtttcg aatcatataa gggaatagtt caaccactta ggggaatgtt gcaaatcggt 60 aggggaatgt tgcatgcaag ctgtggatgt ttgcaatcac ttaggggagt tttgcaatcg 120 attaggggaa tgttgcaagc gtactgtgga gctgtcatca gactgtgggt gccggtgtaa 180 aaagctacga attgataccg atgatgtttt tttaaaaaca tcgtttggag ttgttttcgt 240 gtgggattct gctgtacaca tgataaacta aataaatcct gctgcggagc cataattaaa 300 catgataagt tagtaagatt gacgaaaata ttttaaggta tttacagcgg cacccacagt 360 ctgatgacag ctccacagta cgcttgcaac attcccctaa tcgattgcaa aactccccta 420 agtgattgca aacatccaca gcttgcttgc aacattcccc taccgatttg caacattccc 480 ctaagtggtt gaactattcc cttatatgat tcgaaaccct g 521 // ID BEL12-I_AG repbase; DNA; ANG; 5887 BP. XX AC . XX DT 03-APR-2003 (Rel. 8.03, Created) DT 18-JUL-2005 (Rel. 10.08, Last updated, Version 2) XX DE BEL12-I_AG is an internal portion of the BEL12_AG LTR DE retrotransposon - a consensus sequence. XX KW BEL; LTR Retrotransposon; Transposable Element; 5-bp TSD; KW BEL12-I_AG; BEL12-LTR_AG; BEL12_AG; Bel clade; RING Zn-finger; KW integrase; peptidase; reverse transcriptase. XX NM BEL12-I_AG. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-5887 RA Kapitonov V.V., Pavlicek A., Drazkiewicz A. and Jurka J.; RT "BEL12_AG, a family of Bel/Pao-like LTR retrotransposons from RT African malaria mosquito."; RL Repbase Reports 3(3), 31-31 (2003). XX DR [1] (Consensus) XX CC BEL12_AG is a young family of Bel/Pao-like LTR retrotransposons. CC BEL12-I_AG, an internal portion of BEL12_AG is flanked by CC BEL12-LTR_AG CC LTRs. The BEL12-I_AG consensus sequence was reconstructed based CC on CC multiple alignment of 20 copies; they are less than 1% divergent CC from CC the consensus sequence. CC The consensus sequence encodes a 1726-aa BEL12_AGp Bel-like CC protein CC (pos. 582-5759). 
CC BEL12_AGp is composed of the peptidase A16 (pos. 130-250), RING CC Zn-finger (pos. 358-410), reverse transcriptase (pos. 700-900) CC and CC integrase (pos. 1450-1600) domains. XX FH Key Location/Qualifiers FT CDS 582..5759 FT /product="BEL12_AGp" FT /translation="MPAADKRVKMFNLKRVEIMNTLQDFEEFTKSFDATID FT AYQIPSRLEQLEELVSEFTELRKAFNETVDDSEAFDIMQKDRREFNKRSHE FT VRAFLLKNSSHSGASSGLNTTQVNTTISAGTQNHLRLPKVDLPSFDGEITK FT WLTFKDRFSSMVHDSTEMPEVLKLQYLLSALKGDAAHQFEHMQITADNYYV FT TWEALLKRYDNSKVLKREYFKAFYSLEKMKTDSTEELARIVNEANRLVRGL FT ERLNEPVDKWDTPLTSLLFYKLDSKTLVAWEQYSVDFKTDEFTNLVEFLEQ FT RVNILKSSAQNICNQYSANSIMVTGRQARRDGRNVALPVQQTNNTFKGYLK FT CPLCNEQHPLHVCERFERASVINREEIVRKHGLCFNCLRKGHSARECRSTY FT VCQQCKRKHHSKLCKIGRLSEVEVVPSTSRLTATAQANCSKKTVILSTAQI FT IILDVNDQPYKVRALLDNGSQLNFITERVAQELRLKRARVSEQIAGVGGAI FT MRVAGSVVGTIRSLTTEYTTCLEFLILPKIATDLPSETMDVRGWKLPKDVR FT LADPTFHERGSIDMLIGADTFVEMIKAKKIKLDHELPTLLETELGWIVSGA FT YKHNNLNQSMACTIVSQGGENDIASLMNTFFNIEEVQDQNLWNVEERECED FT HFQATTRRDENGRYVVRLPLKAERELGESKEVALRRLIGLERRFEREPKVK FT EAYEAFMQEYITLGHMSVRENENSSDGYYMPHHAVFKQDSTTTKCRVVFDG FT SCKTSNGRSLNDILKVGPTIQQDTTDILLRWRRRAIAVVGDVEKMYRQVWV FT HEEDRKFQRILWRSHSSEKIKTYELNTITYGTASAPFLAIRTLNQVLEDNK FT EKYPLAASRINDFYVDDFISGADSENEAKQLCEETKAALAMGGFPLRKWAS FT NCPHILPSETEIDNIQRVIELKSREGAVSTLGLVWNPILDTLGVKISEPET FT CEIYTKRSIIRTIAKIYDPLGIVDTVKAKAKQFMQRVWSLKKENGDSYGWD FT EEIPQQMRQEWEVFERQLTHLQEVQVPRCVTIVGARNIQIHGFCDASEEGY FT GACVYVRSTNGEEIVSRLFVSKSKVTPLATKHTIARLELCAAHLLGKLLVK FT LKRATEDPYETFCWTDSSTVIYWLKSSPSRWKTFVANRVSQIQNATKEFEW FT RHVPGIHNPADAVSRGRNPAEVVEDKLWWHGPDWLVKDPEHWPKNIESGNT FT CETAKEEKQTKTTLTCMVKEESFINKLCERVGSFTKLKRIVAYCHRFFDRK FT RIHRKSYFELRELKRAEKTIIRLVQNEVYATEYECIKQGQQVVRKSPLRVI FT RPILDKDNVMRVGGRLSNADIKDEQKHPVIIPGKHRIAELIADKYHKILRH FT AGAQLMINTMQLRFWIVGARNVAKRTVFNCVKCTRCRPKLIQQPMADLPEQ FT RVRQARPFSISGVDYAGPIMVKGTHRRAVPTKGYISIFVCFVTKAVHIELV FT SNLTSSAFLAALRRFVARRGHVTELHSDNGTNFRGANNKLRELYKLLNSDT FT HQDEVVGWCAERDMKWKFTPPAAPHFGGLWEAAVKSMKFHLKRVLGTGHLT FT 
FEDLSTLLAEIEACLNSRPITAISEDPNDMEALTPGHFLVGNHLQTVADVD FT IADVPTNRLNHWRLIQKHMQHIWNRWHREYLSTLQKRAKWNKNAISIEPGR FT LVILQEDNVAVSKWPMARVVDLHPGKDGVTRVVTLKCANGKEIRRPIHRIA FT PLPIES" XX SQ Sequence 5887 BP; 1982 A; 987 C; 1423 G; 1495 T; 0 other; tttggtcctt cgaaccggat ccgaattacg gatattcttt gctagttttt tgtggtttat 60 gctatttggt taccgagcgc caagtgggag tgcaaagaaa tattggattt cttagcagaa 120 cggtgtgttc tccactccat tcgtgttgtt gctactgctg caggagccgg acactcatcg 180 ttggtttgtg tgtgtgagag agagaggacg tgaagccacg aaaggagctg cgtatctgtg 240 ggctgtcggt aaacagcaaa agacgaactg tgcgacgacg aaaggtgcga agttgtaata 300 cggtgaatag ctgaatttag tgcgtgcgtg aactttccat taataacata tttgtggata 360 attagtgtca gcgcagaaca ttttcgaacg cgtgagtgtc gttcaaccgt agtagtggag 420 tgggtgtgcg tgagtgacta tcctagcgcc atacttggca cccaaagctg cgcgcgtatt 480 agttcaggtg tcgacaactt ccaggggtac ggaactcatt gctgtgcgtt ttattaaaaa 540 ttgtgttaag aagtgcctaa agaacaatta ttgggagcaa aatgccagca gcagataaac 600 gagtgaaaat gttcaattta aagagggtag aaattatgaa cactttgcaa gatttcgaag 660 agtttacgaa atcctttgat gcaaccatcg atgcatatca gatacctagt cggttggaac 720 agttagaaga gttggttagt gagttcacgg aattacgtaa agcattcaac gaaacggtag 780 atgattcgga agcgttcgat atcatgcaaa aagatcggcg tgaatttaac aaacggtctc 840 acgaagtaag ggcattttta ttaaaaaata gttcccattc tggggcgtcg agtgggttga 900 acactacaca ggttaacaca actattagtg caggaactca aaatcatctg cgccttccta 960 aggttgacct tccaagcttt gatggtgaaa taacaaaatg gcttacgttc aaagacagat 1020 tttcgtctat ggtgcatgac tcgacagaaa tgcctgaagt gttaaaattg caatatttat 1080 tatcggcgct taagggtgat gctgcgcatc aatttgaaca catgcaaata acggccgata 1140 attattatgt gacatgggaa gctttgttaa aacgttatga taattctaag gtgttaaaaa 1200 gggaatattt caaggcattt tattctctag aaaaaatgaa aaccgactcg acggaagaat 1260 tggcacgtat cgtgaacgaa gcaaatagat tagtcagagg gttagaacgt ttgaacgagc 1320 ctgtcgacaa gtgggacact ccgttaacaa gtttattgtt ttacaaattg gacagtaaaa 1380 ctttagtggc gtgggagcag tactcggtgg atttcaaaac agatgaattc acaaatttag 1440 tggaattttt ggaacagcga gtgaacattt taaagagctc tgcgcaaaat atttgcaatc 1500 
aatattcggc taattcgatc atggtgaccg gcaggcaggc gagaagagat ggtaggaatg 1560 tggcattacc agtacagcaa acgaacaata catttaaagg gtatctcaag tgtccactgt 1620 gcaacgaaca gcatccgttg catgtgtgtg agagattcga aagagcgtca gtgataaatc 1680 gagaggagat agtaagaaaa catggcttat gttttaattg cttgcgaaag ggacactcag 1740 cacgtgagtg tagatcgacg tatgtgtgcc agcagtgtaa aagaaagcac cattcgaaac 1800 tgtgtaagat aggaagatta tctgaagtgg aagtggttcc gtcaacgtca agattaactg 1860 ctacggctca agcaaattgt tcgaagaaaa cagttatatt gtctaccgcg caaattataa 1920 ttctagatgt taacgatcag ccatacaaag tgagagcatt actcgataac ggctctcaat 1980 taaatttcat cacggagaga gtggcacaag aactcagatt gaagagagcc cgcgtgagtg 2040 aacagatagc tggtgtgggt ggagctatta tgagagttgc aggatcagtt gtgggtacca 2100 ttcgatcact caccactgag tacacaacat gcttagaatt tttaattttg ccaaaaattg 2160 ctaccgattt accatccgaa acaatggacg tacgaggttg gaagttacca aaagatgttc 2220 gattagcgga ccctacattc catgaaaggg gctcaataga tatgttgata ggggcagaca 2280 cctttgttga aatgataaag gcaaaaaaga taaagcttga tcatgagtta ccaacactac 2340 ttgaaacgga attaggttgg attgtgagtg gtgcatataa gcataataat ttaaatcaat 2400 caatggcatg cacaattgtt agtcaagggg gagaaaacga catagcttct ttgatgaaca 2460 cattttttaa tatcgaagaa gttcaagatc agaatttgtg gaacgttgag gaacgagaat 2520 gcgaagatca ttttcaagca acaacaaggc gtgatgagaa tggaagatac gtggtgcgat 2580 taccactcaa ggcggagagg gaattgggag agtccaagga agtagcctta cggcggctga 2640 ttggacttga gagaagattt gagagggaac cgaaggtgaa ggaagcatat gaagcattta 2700 tgcaggaata tatcactttg gggcacatga gtgtcagaga aaatgaaaat agtagtgacg 2760 gttactatat gccgcaccac gctgttttca agcaagatag caccacgaca aagtgtcgtg 2820 tagtttttga tggatcgtgc aaaacgtcaa atggtcgatc tctcaatgat atattaaaag 2880 taggtccaac aatacagcaa gacactacgg atattttatt aagatggcga cgtagagcca 2940 tagcagtggt cggtgatgtt gaaaaaatgt accgacaagt gtgggttcat gaggaggatc 3000 gaaagttcca acgaatactt tggagatcac attcaagcga aaaaataaaa acatatgagc 3060 ttaatacaat aacgtacgga acggcatcag cgccatttct tgctatacga accctaaatc 3120 aggtgctaga agacaataag gaaaaatacc cactagcagc atcgcgtata aatgactttt 3180 acgtggatga 
ttttatttct ggtgcggatt cagagaatga agcaaaacaa ttgtgcgaag 3240 aaaccaaggc agcgttagca atgggtgggt ttcctttacg caaatgggct tctaattgtc 3300 cccatatatt accatctgaa accgaaattg ataatataca aagggtaatt gaattgaagt 3360 caagagaggg tgcagtatca acattaggac ttgtgtggaa tccgatctta gacactctag 3420 gtgtaaaaat tagtgaacca gaaacttgtg agatatatac aaaaagatcg attataagaa 3480 caatcgcaaa aatctatgat ccattgggga ttgtggatac agttaaagca aaagcaaaac 3540 aattcatgca aagagtatgg tcattaaaaa aagaaaatgg tgactcatac gggtgggatg 3600 aagaaattcc acagcaaatg agacaagagt gggaagtgtt tgagaggcag ttaacacatt 3660 tacaagaagt acaagtaccg agatgcgtaa cgatagtagg agcacgtaat attcaaatac 3720 acggattttg tgatgcttct gaagagggtt atggagcttg cgtatatgtg agaagcacga 3780 atggagagga aatagtttcg cgattatttg tatcgaaatc aaaggtcacc ccattagcta 3840 caaaacacac aatagctaga ttagaactat gcgcagctca tttattagga aagctattgg 3900 tgaaactcaa aagggccaca gaagatccat acgaaacatt ttgttggaca gactctagca 3960 cagtaattta ttggttgaaa tcgtctccaa gtcgttggaa aacattcgtg gcgaatagag 4020 tatcacaaat acaaaatgca acaaaagaat ttgaatggag gcatgtgcct gggattcata 4080 atccagcaga tgcggtttcg agaggtagaa atcccgcaga ggttgttgag gataagcttt 4140 ggtggcatgg accagattgg ctagtcaaag acccagaaca ttggcctaaa aatatagagt 4200 caggaaacac ttgtgagaca gcgaaagaag aaaaacaaac gaaaactaca ttaacatgta 4260 tggtgaaaga ggaaagtttt ataaacaaac tatgcgagag agtaggttca ttcacaaaac 4320 taaaaaggat tgtcgcatat tgtcatcgtt tcttcgatcg taagcgaatc catcgcaaat 4380 cttattttga gttgagggaa ctaaaacgag ctgaaaagac aatcattcga ttggttcaaa 4440 atgaagtcta tgcaactgaa tacgagtgta tcaaacaagg gcaacaagta gtgcgaaaat 4500 caccattgag agtgattaga ccaatactgg acaaagataa tgtcatgaga gtaggaggtc 4560 ggttgtcaaa cgccgacata aaagacgaac aaaaacatcc tgttattatt ccaggaaagc 4620 acaggattgc agagttgatt gccgacaagt accataagat acttcgtcat gctggggctc 4680 aactgatgat aaacactatg cagttaaggt tttggatagt gggagcgcgc aatgtagcga 4740 aacgtacagt tttcaactgt gtgaaatgta ctcgttgtag accaaaactg attcagcagc 4800 caatggctga tcttccagag cagagggtga gacaagctag accgttctca attagcggtg 4860 tggactacgc aggaccgata 
atggtaaagg gcacacaccg acgggcggtg cccacaaaag 4920 gctatatttc aatatttgtt tgtttcgtaa caaaagcagt tcatatcgaa cttgtatcaa 4980 atctaacctc ttctgcattt ttagctgcac tgcgtcgatt cgttgcgagg agagggcatg 5040 ttacggaatt gcattcggat aacggcacaa acttccgagg tgcgaacaat aagttgcgcg 5100 aactgtataa attactaaat tctgatacac accaagacga ggttgtagga tggtgcgccg 5160 aacgagacat gaagtggaag tttacacccc cagctgcacc acattttgga ggtctgtggg 5220 aggccgcggt gaaatctatg aaatttcatt taaagcgcgt gttaggtaca gggcatttaa 5280 cgtttgaaga tttatcaacc ttattagccg aaatagaagc atgtctaaat tctcgaccaa 5340 ttacggcaat atcagaagat ccaaatgata tggaagcact taccccaggg cattttttgg 5400 tagggaatca cttacaaacg gtagcggacg tagacatcgc agatgtgcca acaaacagat 5460 taaaccattg gagactgata caaaaacaca tgcaacacat ttggaatcgt tggcatcgcg 5520 aatatttaag tacattgcag aagcgagcaa agtggaacaa aaatgcgata tcgattgagc 5580 caggaagatt agtaattcta caagaagaca atgttgcagt atctaaatgg ccgatggcaa 5640 gagtagtgga tttacatcca ggaaaagatg gtgttacacg agtagtaacg ttgaaatgcg 5700 caaatggcaa ggaaattcgt aggccaattc atagaatagc tcctttacct atagaatcgt 5760 aaattgaaat caataattgg aattatgtgg aattaagatg aggaatttta agaaatcaaa 5820 taggaatatg aaaatgaatt caatcattac tgaatattaa aaaacattcg tttttggtga 5880 ccgggaa 5887 // ID Ag-Outcast-6 repbase; DNA; ANG; 6411 BP. XX AC . XX DT 20-JUL-2009 (Rel. 14.07, Created) DT 29-OCT-2010 (Rel. 15.11, Last updated, Version 2) XX DE Outcast is a non-LTR retrotransposon - consensus. XX KW Outcast; Non-LTR Retrotransposon; Transposable Element; KW Nonautonomous; Ag-Outcast-6. XX NM Outcast. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-6411 RA Biedler J. and Tu Z.; RT "Non-LTR retrotransposons in the African malaria mosquito, RT Anopheles gambiae: unprecedented diversity and evidence of recent RT activity."; RL Mol. Biol. Evol 20(11), 1811-1825 (2003). 
XX DR [1] (Consensus) XX FH Key Location/Qualifiers FT CDS 1633..5223 FT /product="Ag-Outcast-6_1p" FT /translation="MEHLNILQTNIQSIRKNRDELTHILHDQKYHVACLQE FT TWLKNEDKITIKGFNTIRTNREDGYGGSCILIKKGIKYKPIKLIDESDEIQ FT ITTILIHSGNLIIISIYIAPNTTKQIIKDTLAKITHNTQNYTNIIIAGDFN FT AHHTYWGDDRIDQKGNTIIEEIDNSNLIILKNNTYTYVPTDHNKRQTSIDL FT TIISKKIHNVINKTILEKHIGASNHKIIKITIKKHSTEPIQHTIINMNKVI FT TNIKHMKGKDIINIKHFTKKVYKIIQNSKQKINFTLKSWWNNEIKKALQDK FT NIARQTFNSTKLIEHAIEFRKRTAIFKLKIKQAKEKQTERNMEKINKDTSS FT KELWNLLGNISNINTTNKESNLIHNEETYAKEFMNLNFTNNKKSKFRTFFS FT TNLPIDTELINSDIWKNILKQKKNTTPGLDKITYQMLRNINDESLTQIITD FT GNKMWENGKINKELKQIKIIAIPKPKKNINDPNNYRPIALIPTITKVLNSA FT VLLKLNKYIETKNILPEKSFGFRNKRTINQCINYFNNEINHNQRLNRISGA FT IFIDLEKAYNNVSTKILVEQMIKSDIPPQIIKWTYSFLRNRTLIIESNKKT FT YKMLVTDGLPQGDVMSPTLFNLYTKDIHKAINQTNTKNTIIQYADDFVILS FT SGVNERELQNNLQNALNAFAIETEQLKFNINSNKTKFMIFGKSHHTIQLNI FT HNNSIEHTNTYKYLGTVIDPKLNFHKHIELLRNKATNRLNFLKIISTQKNN FT INPKNSLKIYRATIRNTMETGASYTLNSNKNKYKTMNSTINQALRKATGCT FT KTTPINTLHAIAAEIPFNIRSRFIVRKELAKDLVYSPIIRQQLQIHKRTKY FT KKKKKTIHETTYEKDNKILKQLYIAKNNDITTHIKINTKIKDTIETKTKTN FT TQILKQITIENMNKYENNRPTIYTDGSIAGNKVGIGIYIKNKPQHYYYSYR FT LKNFTSITTAELIAIEKALLLAQENNIDNPVIFTDSLTSCNILQKAMTENK FT IEEICYNIIKTATNIKADIVWIPSHVGIDGNDRADELAKKGTLTNWFYKNK FT IRYTDATQYYKKEMEEETKKWYTDLGRNKGKKFMQFQNVFKTELWHKQVNL FT NGNEVKTINKILAGHDLSEFWLHKMKIVENGTCEKCLVPETGRHKIFECQK FT YKRHKDISLDTLYEKWMETSGNTCKKIIEFIKENNITL" XX SQ Sequence 6411 BP; 3004 A; 1161 C; 837 G; 1409 T; 0 other; gatgactcct acacaaaccg ttgcacagag cagatcaatc ctatcaatag tttaaaaaaa 60 aaaaaaccct ttttcaaaca gaaccgataa atacccaatt ttggagtggc gaaaataaaa 120 acagtttaaa aacaaagtgt tgtgttgttc ctcttttcaa agtatcctta ctaaaaaaaa 180 atagatacat acatacagag tggaaaaaaa atcaagcaaa gtggtagaga ggggagagga 240 agatgggggt catagacccg ggaggaaata agacattcgg gtcttatttc agcatcttgg 300 gggaaacatc aaacaaaaac accaaaaaaa gaagaaaaat aagaggtgag cgtatcaatt 360 taaatttcag tgaaacagtg aagaatgaag aggtgtttat ggtgttggag agtaaaactc 420 ccggcaaaag tgtggcatgc tacaacccat 
ttttgttcgc aaaggcaact gaaatggcga 480 ttggatgtag gccattacaa acgtccatgc ttagagacgg caaaatactg atgcgtgtta 540 aaaatgaaac cgaagcaaaa aaactaaaaa atattaattt aaaacacggt gattgtgcca 600 tcgaagtgga tgtttacgaa cacaaaactt taaatcaaag caagggaata atcagaagtg 660 atgcgtgcag gtttttgagt gaagaagagc ttctagaagg tcttcaacct caaaatgtat 720 cagaggtata cataatgaag agaaaaagcc aagatggtgt tctcagaaac acaagaaccg 780 ccatcataac tttcaaatct acagtgcttc ccagaaacat agaaataggt tacttcaccg 840 aaaaagtaga attatttatc ccaaatccaa tgcggtgcat gaaatgcatg ctctttgggc 900 atacaaaaaa taattgcaat cgacaaaaaa tttgcgccaa atgtggagaa aactttcatg 960 agaattgcac aaaccctcta aaatgcacac aatgcggtga aaaccactca tctttagaca 1020 aagattgtcc agtatggaaa gacgaggtgg aaataaaaag aattcaaaca gaaaaaaaaa 1080 taacaataaa agaagcaaga aaaattcgaa gggaaacagt gccagctatt cctagaattt 1140 atataagaga aaaattttca tcaatagtga gaggcaatca aacgcctata atgaacgaaa 1200 aacgaaaaat agacgaaaca agcgaaaata atagtaacac accaaaatca actgaaaata 1260 caccaacaaa aattaacagc acaacaaata acaagtacct agataataac acagaaccca 1320 aaaaaaatac gaatttaaca acaattacca atgatttaga aaaccacaca gaaaaggaaa 1380 aacacacaga aatcgaaaac acagaaacac aaacgaatcc aaacgcaaca gatgaaagtg 1440 ccaccaatat cacttactat agtgatactg aaatgtatct caatgagata gacattcaag 1500 gaacaatcga aatagactac aaagaaacat cacagtaaaa attacaaaaa ttaaaataaa 1560 aaccctactt atatagccca ttacacaaac aatcctaata atacttaaca ttttctctat 1620 aaataattaa acatggaaca cctaaatata ctacaaacta acatacaaag catacggaaa 1680 aatagggacg aactaacaca catactacat gatcaaaaat accatgtagc ttgcttgcag 1740 gaaacatggc tcaaaaatga agataaaata acaattaaag gttttaacac aatacgaaca 1800 aacagggaag atggatatgg aggaagctgc attctaatta aaaaaggaat taaatacaaa 1860 ccaattaaat taatagacga aagtgatgaa attcaaataa caacaatact gatacactct 1920 ggaaatctaa ttataatatc aatatatata gctccaaata ctaccaagca gataataaaa 1980 gacaccttag caaaaataac acataacaca caaaattaca caaacataat aatagctggc 2040 gattttaatg cacatcatac atactgggga gatgatagaa tcgatcaaaa aggaaacaca 2100 ataatagaag aaatagacaa ttcaaacctc attatcctaa 
aaaataacac atacacttac 2160 gtcccaacag atcacaacaa gagacagacg tccattgatt taacaattat atcgaaaaaa 2220 atccataatg taatcaacaa aactatttta gaaaaacaca taggggcaag caaccacaaa 2280 ataataaaaa taacaattaa aaaacactcc acagaaccga tacaacatac tataataaac 2340 atgaataagg taataacaaa cattaaacat atgaaaggaa aagatataat aaacataaaa 2400 catttcacaa agaaagtata caaaatcata caaaatagta aacaaaaaat caattttaca 2460 ctaaaatcat ggtggaataa cgaaataaaa aaggccctgc aagacaaaaa tatagcaaga 2520 caaacattca acagtacaaa acttattgaa catgccatag aattccgcaa aaggacagca 2580 atcttcaaac ttaaaataaa acaagcaaaa gaaaaacaaa cggagagaaa catggaaaaa 2640 ataaacaaag acacaagtag taaagaacta tggaatctgt taggaaatat tagcaacatc 2700 aataccacca ataaagagtc aaacctaatt cacaatgaag agacctatgc taaggaattt 2760 atgaatttaa atttcacaaa caataaaaaa tccaaattta gaactttttt cagtacaaat 2820 ctaccgatag atacggaact tataaattca gatatatgga aaaacatact gaaacaaaaa 2880 aaaaatacaa cacccggttt agacaaaatt acataccaaa tgctaagaaa cataaacgac 2940 gaatcactaa cacaaatcat aacagacgga aacaaaatgt gggagaatgg taaaataaac 3000 aaagaactaa aacaaattaa aataatagca attcctaaac caaagaaaaa tataaatgac 3060 cccaacaact atagacctat agcacttata ccaacaataa ccaaagtcct aaactcagcg 3120 gtattattaa aattaaataa atatattgaa accaaaaata tactcccgga aaaatccttt 3180 ggattcagaa acaaaagaac cataaaccaa tgcattaact atttcaataa cgaaataaat 3240 cacaaccaaa gactaaatag aataagtgga gcaatattca tagatttgga aaaagcatat 3300 aataatgttt caacaaaaat cctcgtggaa caaatgataa aatcagacat cccacctcaa 3360 attataaaat ggacctactc ctttcttaga aatagaactt taataataga aagcaataaa 3420 aaaacataca aaatgttagt gacagatgga ctaccacaag gagatgttat gtcaccaaca 3480 ctttttaact tatataccaa agatatacat aaagctataa accaaacaaa caccaaaaat 3540 actataatac agtacgcgga tgatttcgtt attctcagca gtggagtaaa cgaaagagag 3600 ctacaaaata acctacaaaa cgctttaaat gctttcgcaa tagaaacaga acaactaaaa 3660 tttaacataa actcaaacaa aacaaaattt atgattttcg gaaaatcaca tcacaccata 3720 caactcaaca tacacaataa tagcatagaa cacacaaaca catacaaata tctaggaaca 3780 gtaatagatc caaaactcaa tttccacaaa cacatcgaac tactacgtaa 
caaggctaca 3840 aacagactaa atttcttgaa aattataagc acgcaaaaaa ataacataaa ccccaaaaac 3900 agcctaaaaa tataccgggc aacaatcaga aacactatgg aaacgggagc ttcatacact 3960 ctcaatagca acaaaaacaa atacaaaaca atgaattcaa ctatcaacca ggcactaaga 4020 aaagctacag gttgcaccaa aactaccccc ataaatactc tgcatgccat tgcagcagaa 4080 atacctttca atatcagaag tagatttata gtacgtaagg aactagccaa agatttagtc 4140 tactcaccaa tcattagaca acaactacaa atacataaaa gaacaaaata caaaaaaaag 4200 aaaaaaacaa tacatgaaac aacgtacgaa aaagacaata aaattctaaa acaattatac 4260 attgcaaaaa acaacgatat aaccacacac atcaaaataa atacaaaaat taaagacacg 4320 atagaaacaa aaacaaaaac aaacactcaa atcctgaaac aaattaccat tgaaaacatg 4380 aacaagtacg aaaataacag acccaccatc tacactgatg gaagcattgc cggaaataaa 4440 gtaggtatag ggatatacat aaaaaacaaa ccacaacatt actactatag ctatagatta 4500 aaaaatttca cctccataac caccgccgaa ctgatagcaa tcgaaaaggc cttactttta 4560 gcacaagaaa acaatataga caacccagta attttcacag acagtctaac atcatgtaac 4620 attctacaaa aagcaatgac agaaaacaaa atagaggaaa tctgttacaa tattataaaa 4680 acagcaacaa atattaaagc agatattgta tggattccat cacatgtagg cattgatgga 4740 aatgatagag cagatgaact ggcaaaaaaa ggcactttaa caaattggtt ctataagaac 4800 aaaatcagat acacagacgc tacccaatat tacaaaaaag aaatggaaga ggaaacaaaa 4860 aaatggtaca cagacctggg caggaataaa gggaaaaagt ttatgcaatt ccaaaatgtt 4920 ttcaaaacag aactatggca caaacaggtt aacctgaacg gaaatgaagt aaaaacaatt 4980 aacaaaatac tggcaggaca tgacctttcc gaattttggc ttcataaaat gaaaatagta 5040 gaaaatggaa cctgtgagaa atgtctagtc ccagaaacag gacgacacaa aatttttgag 5100 tgtcaaaaat ataaaagaca taaagacatc tccctagaca ctttgtacga aaaatggatg 5160 gaaaccagtg gaaatacctg caaaaaaatt atagaattca taaaagaaaa caatattaca 5220 ttatagtact tcatcacaca tatcttcaaa taaaaagaaa aattcaacaa ttttttttac 5280 acatttccta tatcttcaaa actcacaaca tagaacctgc actcaacact ccagcaaaaa 5340 tgatatagca ataggataag tacttactat ttatttatct atttattata tttattacat 5400 ctattgcatt tattatattt attctattta ttatatttat tatatttatc atatttatgt 5460 gttcaaagta tcaattttgt tcgtataatg aaacgttatt ccatgttttt ttttccttgt 
5520 tagtcaatta tttaaaacta tttattcaat catttctcag caccctattg atttatttat 5580 ttatttatat actgtttttg gcatgactta tatatataca caaaaaatca cctaactgca 5640 cattccaaat ccaaacacca caccacaaaa aaaataacaa taacaataat aaaatttaat 5700 aaattcaata ataaatccga ttttacacgc atcagggggg cctcctaagt ttgggtactc 5760 cgtttgttct tgcaaacatg agaaatccaa agacatccat ttccatcact agaggataga 5820 aataagatgc caaaaggacc aactgaaagg gccttgtatc aaatcccccc cccccaaaaa 5880 aaaaaaaaaa aatccaaaat aaaaaaaaaa ataaaaaaaa aaaaaaccct tttggaacct 5940 tcacaaaaca agccatttca aaaataataa taataaaaat aaaaataaac aacccaatac 6000 aagaacaaga accggtacaa aaaaaaaaaa aacaaaattc ctcatcctct cccaagaaat 6060 aaaatatata tatatataca aacaccacaa aaaaaaaaga agaaaatcaa atccactaaa 6120 ccacaagcga acttaaagat ggaacaacgc tagcttttga atacaaactg aaaaggaaac 6180 acaaaccatc aaaaagaaaa aacgattggt gatatcacga aaaccaacac actgtgcatc 6240 aatcagaaag atctcaacca caacatataa acatcaacga aacagcaata taaacatcaa 6300 aggagaatgg atgatttccc ctccaaaaac tggtagtcaa acctctgatt gtaccttcaa 6360 aaacggatgg tataggaggc ctcaacctac caaccctatt aagaagaaga a 6411 // ID GYPSY3-I_AG repbase; DNA; ANG; 4326 BP. XX AC EAA13694; XX DT 02-MAY-2003 (Rel. 8.04, Created) DT 02-MAY-2003 (Rel. 8.04, Last updated, Version 1) XX DE GYPSY3-I_AG is an internal portion of the GYPSY3_AG LTR DE retrotransposon - a consensus sequence. XX KW Gypsy; LTR Retrotransposon; Transposable Element; 5-bp TSD; KW GYPSY3-I_AG; GYPSY3-LTR_AG; GYPSY3_AG; Gyspy clade; integrase; KW reverse transcriptase. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-4326 RA Pavlicek A., Kapitonov V.V., Drazkiewicz A. and Jurka J.; RT "GYPSY3_AG, a family of LTR retrotransposons from African malaria RT mosquito."; RL Direct Submission to Repbase Update (30-APR-2003). XX DR [1] (Consensus) XX CC GYPSY3_AG is a family of Gypsy-like LTR retrotransposons. 
CC GYPSY3-I_AG, an internal portion of GYPSY3_AG, is flanked by CC GYPSY3-LTR_AG LTRs. The GYPSY3-I_AG consensus sequence was CC reconstructed based on multiple alignment of 6 copies; they are CC less than 1% divergent from the consensus sequence. The A. CC gambiae CC genome contains about 20 copies of GYPSY3_AG. CC Some copies of GYPSY3_AG are 100% identical to each other. CC They can be active retroelements. CC The consensus sequence encodes the 1410-aa GYPSY3_AGp protein CC (pos. 80-4309), with reverse transcriptase (pos. 526-694) CC and integrase (pos. 1077-1223) domains. XX FH Key Location/Qualifiers FT CDS 80..4309 FT /product="GYPSY3_AGp" FT /translation="MNNENLETVINQMAQLLQQMAAFKTRSPDEILESLSK FT TIEEFRFDEENNITFEKWYVRFKDLFQNDASSLNDEAKVRLLLRKLETSAH FT SRYVNYILPKQPHENSFEETVNILKKIFGKQYSVFHKRYQCLQIVKSALED FT IITYGGRVNKACEDSEFENLKLDDFKCLIFICGLQAPEFSDIRARLLSRIE FT NATPEAKVNIQTLMAEFQRLINLKADTTMIENQSRSKHSVHAVSEKKPYRL FT PQPSRFDNSKDQRQPKSDSNTVPRTPCWQCGKMHFVRDCNFTDHLCKVCKN FT VGHKEGYCKAMRSKSSNNNTYQKSEKNQNQNQKKSQGIFVNHVTKNTAKRK FT FISIIINGITTSLQLDTASDITVISKTTWQHLGQPKLSQASIEASNASGEQ FT LKLIGEFECEVILNAITEKCVCFVTSSPVLNVIGIDWIDRFNLWAIPFDVL FT CKKVSSACPKDRIVQLQSKYPTVFDDSLGHCTKTKVKLFLKPNVKPVFCPK FT RPVPFNTIALVDAELSRLQSLGIITPIDFSEWAAPIVAIRKPNGKVRICAD FT YSTGLNEALESNHYPLPTPEEIFAQLNGSVVFSIVDLSDAYLQVEVEDESK FT HLLTINTHRGLFQFNRLAPGVKSAPGAFQRLVDGMIADIPGVRTFLDDAII FT FGKTWEAHKQSLDTFLQRLKEYGFHVKLEKCHFYQTEIVYLGHVVDRNGIR FT PDPEKLKTIASIPAPTNISELRSFLGAVNFYGRFVRNMHELRHPLDQLLKK FT DTKWKWNSDCQTSFEKFKKVLQSDLLLTHYDPNLPIIVAADASSTGVGAVI FT FHKFPNGYLKAIQHASRTLTSAEQGYGQPEKEALALTYGVTKFHKYLLGRK FT FTLLTDHKPLLSIFGSKKGIPLHTANRLQRWALMLLNYDFTIEYVSTTEFG FT CADMLSRLIDRCKQPEEDYVIASISLEEDIQNVMHESLKQVPVSFADIQKA FT TRLDETLQAVLKFIREGWPNEASSIKNQDIRSYYTRKESLTHVDGCILFHN FT RVVVPNIYRKKVLQQFHRGHPGMVRMKSISRSFVFWPGMDVDIENFVRRCT FT SCCTAGKAPIKETPESWPVPEKPWSRVHVDYAGPVDGVYFLVVVDPYTKWP FT EVYATKTTTAKTTIKFLTQSFATFGVPETIVSDNGTQFTSFEFQAFCKQLG FT ICHIRTAPYHPQSNGLAERFVDTLKRTLRKIRAGGETLDEALQTFLQVYRT FT 
TPTPDKSPAELMFQRPIRTVQSLLLPPVSRSRNEKPEAGRKFFDPGEAVYA FT QVHRNNSWEWKPASIVERVGRVNYNVFLEDNQRIIRSHANQLKKRLQEETS FT CGNRNCHDTGNLLTIFFDEFELGTPQAVENSIEMTEQEHYYSADEESAEEI FT EGENWQTSSSTVTTQATPPMPASSAQPVKAERPRRIRRPPARFEPYW" XX SQ Sequence 4326 BP; 1395 A; 905 C; 893 G; 1133 T; 0 other; ttttggcgac gaggattatt actgaacccg gaacccgaag ctcacgagat tctgaagcta 60 acgagaatta gcgtcaagga tgaacaatga aaaccttgaa accgttatca accaaatggc 120 tcaacttctt caacagatgg ctgcttttaa aactagaagt cctgacgaaa tccttgaatc 180 tttatcgaaa accatcgagg agtttcgttt cgatgaagaa aacaacatca cttttgaaaa 240 atggtatgta cgtttcaaag acctttttca aaacgatgcc agcagtttga atgacgaggc 300 gaaagtgcga cttttgctga ggaagctgga aacgtccgct cacagccggt atgtgaatta 360 cattcttccg aagcaaccgc atgaaaattc gtttgaagaa accgtcaata ttctcaaaaa 420 aatcttcgga aagcaatatt cggtgttcca caaaaggtac cagtgcttgc agatagtgaa 480 atctgcatta gaagatatca tcacttacgg cggaagagtg aataaggcat gcgaggattc 540 tgagttcgag aatttgaaat tagacgattt caaatgtcta attttcattt gcggactgca 600 agctccggaa ttctcggata tcagagcaag actgctatct cgaatagaaa acgcgacgcc 660 ggaggctaaa gtgaacattc aaacactaat ggcagaattt caacggctta tcaatctgaa 720 agcggataca acgatgatcg aaaatcaatc aagatcgaag cattccgtgc atgcagtttc 780 agagaagaaa ccgtatcgtt tgccacagcc ttcccgattc gacaattcaa aagatcagcg 840 tcaaccgaaa tcagattcga atactgtccc tcgtactcct tgttggcaat gtggaaaaat 900 gcattttgta cgggattgca attttacaga tcatctctgc aaagtgtgca aaaatgtagg 960 ccacaaggaa ggttactgta aagccatgag atcaaaatct tcaaacaaca acacatatca 1020 aaagagtgag aagaatcaaa atcaaaatca aaagaaatct cagggaattt tcgttaatca 1080 tgtcacgaaa aacactgcta aaaggaaatt tatctctatc ataatcaatg gcataactac 1140 atcactccaa ctggacaccg caagtgacat aacagttatt tctaaaacaa catggcaaca 1200 tttgggtcaa ccgaaacttt ctcaagcatc cattgaagca tcgaatgcat cgggagaaca 1260 actcaagctt atcggtgaat tcgaatgcga ggttatcctt aatgcgatta cagagaaatg 1320 tgtttgtttt gttacatcat caccagtcct aaatgttata ggtatagatt ggatagatag 1380 gtttaatcta tgggcaattc ccttcgatgt actgtgtaag aaagtttcat cagcatgtcc 1440 aaaggatcga atagtacagc ttcaatcgaa 
atatccaaca gttttcgatg attcgcttgg 1500 acattgcaca aaaacgaagg tgaaactgtt tctgaagccc aatgtaaaac cagttttctg 1560 tccaaaacga ccagtaccat tcaatactat cgctcttgtg gatgcggaac tatcaagact 1620 acaatcacta ggcatcatca caccaataga cttttcggag tgggctgctc cgatagtagc 1680 cattcgcaag cctaatggta aagttcgaat ttgtgcggat tattcgacgg ggcttaacga 1740 agctttagag tcaaatcatt atcctctacc tactcctgaa gaaatttttg cacagttgaa 1800 cggaagtgtt gtattcagca tagtcgattt gtccgatgca tacctacaag tagaagtaga 1860 agacgaatct aaacaccttc ttacgatcaa cacacataga ggtctttttc agttcaaccg 1920 tctcgcaccg ggagtcaaat cagcacctgg agcttttcaa agactagttg acggtatgat 1980 tgctgacatt cctggcgtta gaacatttct agatgatgct ataattttcg gtaaaacttg 2040 ggaggcacac aagcaatcat tagacacatt tcttcagcgt ttaaaagaat atggcttcca 2100 cgtcaagctg gaaaaatgtc atttctatca aactgaaatt gtctatttgg ggcatgtggt 2160 agatcgcaac ggcatacgtc ctgacccaga gaagctgaaa accatcgcat ctattccagc 2220 accaacgaac atttccgaat taagatcgtt cttaggagct gtgaatttct acggccgatt 2280 tgtgagaaac atgcatgaac tcagacaccc gttagatcaa ttgctaaaga aggatacgaa 2340 gtggaagtgg aattcggatt gtcagacatc gttcgaaaaa ttcaagaaag tactgcaatc 2400 tgacttactg ttgacacatt atgatccgaa tcttcctatc atcgttgcag cagatgcgtc 2460 tagcaccggc gttggcgcag tcatttttca taaatttccg aacggatatt tgaaagccat 2520 tcaacatgct tcaagaacat taacgtctgc agaacaaggc tacggacaac cggaaaaaga 2580 agctctggct cttacttatg gggtaaccaa atttcataag tacttacttg gtcggaaatt 2640 tactctttta accgatcata agcctttact atccatattt ggttcaaaaa agggaattcc 2700 tttgcacaca gcaaaccgac tccaacggtg ggcattgatg ttattaaact acgattttac 2760 catcgaatat gtttcaacca ccgagtttgg gtgcgctgac atgctatcaa ggctaatcga 2820 ccgttgcaag caaccagagg aagattacgt tatagcttcg atttcacttg aagaggatat 2880 tcaaaacgtc atgcacgaat cattgaaaca agtaccagta tcatttgctg acattcaaaa 2940 agcaacaaga ttggacgaaa cactgcaagc tgttttgaaa ttcatccggg aaggctggcc 3000 gaacgaagcg tcgtcaatca aaaatcaaga cattcgttcg tattacacac gaaaagaatc 3060 gttaacgcat gtagatggat gtatcttgtt tcacaacaga gttgttgttc caaacattta 3120 caggaaaaag gtacttcaac aattccatcg cggtcatcct 
ggaatggtgc gaatgaaatc 3180 catttctcga agtttcgttt tctggccggg aatggatgtc gatatcgaaa actttgtccg 3240 gcgttgtact tcatgttgca ctgctggcaa agcaccaatc aaagaaacac ctgaatcttg 3300 gcctgtgccg gaaaagccat ggtcacgggt acacgtggac tatgccggtc ctgtagatgg 3360 cgtgtatttt ttggttgtag ttgatccata cacaaaatgg ccagaagtat atgcaaccaa 3420 aactacaaca gcaaagacaa ctatcaagtt tttaacgcaa tcattcgcaa cgtttggtgt 3480 tccagaaact atagtctctg ataatggtac ccagttcacc agctttgaat tccaagcatt 3540 ttgcaaacaa ttgggtattt gtcacatccg tactgcacca taccatccgc agtcaaacgg 3600 tttggcagaa cgtttcgtgg acacactaaa gcgtacattg cgaaaaattc gagcaggagg 3660 agagacttta gatgaagctt tgcagacttt tttgcaagtt tatcgaacca cacctacacc 3720 agataagtct ccagctgaat tgatgtttca gcggcctatt cgaacagttc agtcattatt 3780 actacctcca gtttcacggt cacgaaatga aaagccagaa gctggtagaa aattcttcga 3840 tcctggagaa gcggtttatg cccaagtcca ccgcaataat tcctgggagt ggaaaccggc 3900 atcaatagtc gaacgagtcg gtagagtgaa ttacaacgtg tttttggaag acaaccagcg 3960 aattattcgc tcacatgcaa atcaactgaa gaagcgtcta caggaggaaa catcgtgtgg 4020 aaatagaaat tgtcacgaca ctggaaattt gttaactatt ttctttgacg aatttgaatt 4080 gggaactcca caggcagttg aaaactcaat tgaaatgact gaacaagaac attactattc 4140 agctgatgaa gagtccgctg aagaaattga aggagaaaat tggcaaacat catcgtccac 4200 agttactact caagccactc caccgatgcc tgcatcttct gctcaacccg tgaaagcgga 4260 aagacctcga agaattcgta ggccaccagc caggtttgaa ccgtattggt gaattaaggg 4320 gggaga 4326 // ID GYPSY57-I_AG repbase; DNA; ANG; 4597 BP. XX AC . XX DT 09-JUL-2004 (Rel. 9.06, Created) DT 09-JUL-2004 (Rel. 9.06, Last updated, Version 1) XX DE GYPSY57-I_AG is an internal portion of retrotransposon GYPSY57_AG DE - a consensus sequence. XX KW Gypsy; LTR Retrotransposon; Transposable Element; 5-bp TSD gag; KW AP protease; GYPSY57-I_AG; GYPSY57-LTR_AG; Gypsy clade; RNase-H; KW integrase GYPSY57_AG; mag lineage; reverse transcriptase. 
XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-4597 RA Tubio M.J., Costas C.J. and Naveira F.H.; RT "GYPSY57_AG, a member of the mag lineage of the Ty3/gypsy group RT of LTR retrotransposons in Anopheles gambiae."; RL Repbase Reports 4(6), 153-153 (2004). XX DR [1] (Consensus) XX CC GYPSY57_AG is a family of Gypsy-like LTR retrotransposons that, CC according to the amino acid sequence of its reverse CC transcriptase, RNase and integrase domains is phylogenetically CC grouped with representatives of the mag lineage of other CC organisms. GYPSY19_AG, GYPSY20_AG, GYPSY21_AG, GYPSY22_AG, CC GYPSY23_AG, GYPSY24_AG, GYPSY25_AG, GYPSY26_AG, GYPSY27_AG, CC GYPSY28_AG, GYPSY55_AG, GYPSY56_AG, GYPSY58_AG, GYPSY59_AG, CC GYPSY60_AG, GYPSY61_AG, GYPSY62_AG, GYPSY63_AG, GYPSY64_AG, CC GYPSY65_AG, GYPSY66_AG, GYPSY67_AG, GYPSY68_AG and GYPSY69_AG, CC are other members of this same lineage in Anopheles gambiae. CC The GYPSY57-I_AG consensus was reconstructed after multiple CC alignment of 3 copies. The consensus encodes the 1483-aa CC GYPSY57_AGP gag-pol like polyprotein (pos. 135-4583). The CC sequence of the LTRs flanking GYPSY57-I is deposited as CC GYPSY57-LTR_AG. 
CC GYPSY57_AGP: CC MFNPEGDAEMDEQRRLSLNGARALNPGFVQPTQSAPMQPAQPPSQNPVASSAQAGLAMNGSETG CC MFLQMLNLMQQQMQQQQQQLSQQQQMMAHMMQQTPHTTAQPQQQAQQAQQYAQQPAAPRNPELI CC MDALAGSINEFRFEAESGVTFPAWYARYEDLFVQDASRLEDSAKVRLLIRKLGTAEHARYCNFI CC LPRVPRDLTFDDTVSKLKALFGSAESLLSKRYKCLQIVKAHNEDLMTFACRVNRACVDFQFSAM CC TEEQFKCLMLVCGLREEGDAEIRTRLLARIEDKADLTLEQLSAEAQRITSLKVDSAMIATEGNE CC RVFAMRGDYKQRYAQQTQRSFSKQQHPQQADRSAPAAKPPGPCWLCGDHHWIRDCSYRSHKCED CC CGKYGHREGHCNTAGRKKRKNNFYKRSSVATRVVSVNIFSVQQNRKYVTIHIGQKSARLQLDTG CC SDITIIGRKTWKQLGMPTLSPAQVHAKTATGAHIQLDGEFEADITINGRTETAIIRVIPSDLQL CC LGADLIDIFSLAAMPMNQFCNRIESDSAKWEKRFPAVFHGTGLCTKANIKLQLKENVRPIFCPK CC RPVAYAMQATVEKELDRLQALKVITPVDYSEWAAPIVVVRKGNGSIRICGDYSTGLNSALRSYE CC YPLPLPDDIFAKLAQSRFFSKIDLSDAFLQVEIDAKYRPLLTINTHRGLYYYNRLPPGIKIAPA CC AFQQLIDTMLAGLKGTSGYMDDVIVGGKTEREHDENLLNLFRRIQEFGFTIRADKCAFKMQQIE CC YLGFIVDSRGLRPNPAKIDIIQKLPTPTNISEVRSFLGAINYYGKFVSKMRDLRYPLDMLLKDE CC SKFLWTRECDRAFSKFKEILSSDLLLTHYDPNAEIVVSADASAVGLGATISHKYADGSIKVVQH CC ASRALTQAETRYSQIDREGLAIIFAVTKFHKMLYGRHFCLQTDHRPLVRIFGSKKGIPVYTANR CC LQRFALTLQLYDFSIEYVPTTKFGNADILSRLIREHAKPEPEYIIASAELEEDVSFVATNSINA CC FPLNFRDVARATNTDAVLKKVHQYVMEGWPRNLSYGADLARFFNRSEALSSVRGCILLGERVVI CC PTRLREKCLKQLHEGHPGMQRMKALARSYVYWPSLDEDISECVKTCHACAVAAKSPPHTKPLSW CC PATKQPWERIHIDYAGPINGEFYLVIVDANTKWPEVIKANSTSTAGTIAILRGVFARFGFPITL CC VSDNGPQFSSTAFTDYCCSKGIQHVTSPPFHPQSNGQAERFVDTFKRSVKKIEEGGASHGEALD CC IFLQAYRSTPNAALENHRSPAEAMFNRKIRTAMELLKPPPTTETLPEEHGRGFKSGDCVYAKWY CC SRNSWKWVPAQVVRIIGNVIYEVKTDDGRMHRRHVNQLRKRDIGSSEQSTNLAARTQLPLDLLI CC EGPRDVVPVEHPTVISSPLLPSSSDGVRGMTSSPPLPSPRPVPVQRRRRQQPVQSPRRSNRSRR CC FPRRLDGYLLQ. 
XX SQ Sequence 4597 BP; 1309 A; 1130 C; 1135 G; 1023 T; 0 other; gtggcgacga gttgaaaaaa aaaagacttt ttcagtgcag tgaagtttag tacgaaaggc 60 attgttaaag tgttcgcgtg tgaagtgcat cctttaccgg aaaagagagg caatttcatc 120 gaagcggcat taaaatgttt aatcccgaag gcgatgccga gatggatgag cagcgacgtt 180 tgtctctgaa cggtgcaagg gctctgaacc cgggctttgt tcagccaacg caaagtgcac 240 cgatgcaacc agcacaaccg ccatcgcaaa atcccgttgc ttcatcggca caagcaggac 300 tggccatgaa cggttctgaa acgggaatgt ttttgcaaat gctgaatctt atgcagcagc 360 aaatgcagca gcagcagcaa cagctatctc agcagcaaca aatgatggca cacatgatgc 420 aacaaacacc acacacgaca gcgcagcccc aacaacaagc ccagcaggct caacagtacg 480 cacagcagcc cgcagcacca aggaatccag agttgattat ggatgctttg gcaggaagca 540 tcaacgagtt tcggttcgag gcagaatccg gcgtgacgtt ccctgcatgg tatgcacgtt 600 acgaggattt gttcgtgcaa gacgcctccc gactagagga ttcagcgaaa gtgcggctgc 660 tcattcgcaa attgggcaca gcggaacacg cgcggtactg caacttcatc cttccccgtg 720 ttccccgcga cctcacgttc gacgatacag ttagcaaact gaaggcccta tttggaagcg 780 cggagtccct acttagcaag cgatacaagt gtttgcaaat tgtgaaagct cataacgaag 840 acctgatgac attcgcatgt cgggttaatc gcgcttgtgt tgatttccaa ttcagcgcta 900 tgaccgagga gcagttcaaa tgcttaatgc ttgtttgtgg gttgagagag gaaggcgatg 960 ctgaaatccg cacacggctt cttgctcgca tcgaggacaa ggcagacctc acgctggagc 1020 agctttctgc cgaagctcag cgcatcacaa gcttgaaggt ggacagcgcg atgatagcca 1080 ccgagggaaa cgagcgagtt ttcgctatgc gcggtgacta caagcagcgg tacgcgcagc 1140 agacacaacg cagcttcagc aaacagcagc atcctcagca agcagacaga agcgcaccag 1200 cagcgaaacc accgggtcca tgctggttat gcggcgatca tcattggata cgagattgtt 1260 cgtaccggtc gcataagtgc gaggattgtg gcaagtatgg gcaccgcgaa gggcattgta 1320 acactgcagg gcgaaagaaa agaaaaaata atttctacaa gcgctcttca gtagcgacaa 1380 gggtagtttc cgttaatatt tttagcgtac aacaaaaccg gaaatatgtg actatccaca 1440 ttgggcagaa atcggccagg ttacagctcg acactggctc ggacattacg atcataggac 1500 gaaagacctg gaaacagctc ggcatgccaa ctctatctcc agcgcaagtg cacgcgaaaa 1560 cggcaactgg tgcacacatt cagctggatg gcgaatttga agcggatatc acgattaacg 1620 gaaggactga aacggcgata ataagagtca 
ttccgtcaga tttgcagcta ttaggagccg 1680 atctaattga tatcttttcg cttgcagcca tgccgatgaa ccagttttgc aacagaatag 1740 agagcgattc tgcaaaatgg gagaagcgat ttccagctgt tttccacggt acaggtttat 1800 gcacaaaggc caatataaag ctgcaattga aggaaaatgt ccgtcctatt ttctgtccga 1860 agcgtcccgt agcttacgct atgcaggcaa cagtcgagaa agagctagac aggctacaag 1920 cactcaaagt cattacgcca gttgactatt ccgagtgggc cgcaccgatt gtcgtcgtta 1980 gaaaggggaa cggtagtatt cgcatttgtg gggattactc caccggacta aattcagctc 2040 ttcggtcata cgaataccca ttaccgcttc cagacgatat tttcgctaag ctagcccaaa 2100 gcagattctt tagcaaaatc gacctttcag acgccttttt acaggtcgaa attgatgcca 2160 aataccgacc cttgctaacc ataaatacac atcgcggctt atattattac aatcgcctgc 2220 cgcccggcat caaaatagct cctgcagcat tccagcagct aatagacacc atgcttgctg 2280 gactaaaagg aacatccgga tatatggatg acgtaatcgt tggtggcaaa acggaacgcg 2340 agcacgatga aaatctgctg aatctattcc gccgaataca ggaattcgga ttcactatcc 2400 gtgcagacaa atgtgcgttc aaaatgcagc agatagagta tctgggtttc atcgtcgata 2460 gtcgtggcct tagaccaaac ccagcaaaaa tcgacataat tcagaagctg ccaacaccaa 2520 caaatatcag cgaggtacgt tcattcctag gcgcgatcaa ctactatgga aagttcgttt 2580 cgaaaatgcg cgacctacga tacccgctag atatgctatt gaaagacgag agcaaatttt 2640 tatggactcg ggaatgtgat cgagccttca gcaaatttaa ggaaattctc tcgtccgatt 2700 tactgctgac acattatgac ccgaacgccg aaattgtggt atctgcagat gcatcagcag 2760 ttggacttgg tgcaacaatt agccataaat acgccgatgg ctcgataaag gttgtacaac 2820 atgcatcaag ggcgctaacg caagccgaaa caaggtacag ccaaatagac cgcgaaggtc 2880 ttgccattat ttttgcggtc actaagttcc acaagatgct gtatggtcgt catttttgcc 2940 tgcagacaga ccaccgtcca ctagtaagaa tcttcggcag taaaaaaggt attccggttt 3000 acacggcaaa cagactgcaa cgattcgcac tgaccttgca gctgtatgac ttcagcatcg 3060 agtatgttcc cactactaag ttcggtaacg ccgacatact atcgagactg ataagggaac 3120 atgccaaacc ggaaccggag tacatcatag ccagcgccga actcgaggag gatgtaagtt 3180 ttgtagcaac aaattccatt aatgcatttc ctcttaattt tagagacgtt gccagagcta 3240 caaatactga cgcagtcttg aagaaggttc accagtacgt catggaaggg tggcctcgaa 3300 acctgagcta cggtgcagac ctggcacgct ttttcaatcg 
aagcgaagcc ctttcttcgg 3360 ttcgaggatg cattttgctc ggggagaggg tagtgatccc taccagatta cgcgagaaat 3420 gtctgaagca gctacacgaa ggccatcccg ggatgcaaag gatgaaggca cttgcccgga 3480 gttatgtata ctggcctagc ttggatgagg acatttccga gtgcgttaag acttgccatg 3540 catgtgcggt agccgcaaaa tcgcctcctc atacgaaacc attatcctgg ccagccacta 3600 aacagccttg ggaaagaatt cacatcgatt atgcagggcc cattaacggc gagttttatt 3660 tggtgatagt cgacgcaaat actaaatggc cagaggttat aaaggcaaat tcaacatcga 3720 ccgctggaac aattgctata ttaagaggcg tatttgctcg ttttggtttc cccattacac 3780 ttgtgagcga taacgggccg cagttttcga gtactgcatt taccgattac tgttgtagca 3840 aaggcataca acatgttaca agcccaccgt ttcatccaca gtcgaatggg caagcagaac 3900 gtttcgtcga cacctttaaa cggtcggtca aaaagataga agaaggtgga gcgtcccacg 3960 gagaagctct cgatattttc ctccaagcgt accgatcaac gcccaacgcc gctctcgaaa 4020 accatcgctc gcctgcagag gcaatgttta atcgaaaaat aagaacagcg atggaactgc 4080 taaagcctcc gccgaccact gaaacgctgc ctgaggaaca tggtcggggc ttcaagagtg 4140 gtgattgcgt ttacgcgaag tggtattcgc gcaactcttg gaaatgggtt cccgcgcagg 4200 ttgttcgcat tatagggaat gtgatatatg aggttaagac ggatgacggc cgtatgcatc 4260 gccgtcacgt aaaccagctg cggaaacggg acatcggcag ttcagagcaa tcaactaatc 4320 tcgcggctcg tacgcaattg ccgctagatc tgctcatcga aggaccaaga gatgtcgtgc 4380 cagttgagca ccctacagtg atatcctcac cgctcctacc atcatcatcc gacggggtcc 4440 gtggtatgac ctcctcaccg ccgttaccat ctccacgacc agtacctgtc caacgaagaa 4500 gacgtcagca gccggtacaa tcgccacgcc ggtccaacag gtccagacga ttcccgcgta 4560 ggcttgatgg gtacctgctg caataaaaag ggggaga 4597 // ID Clu-38_AG repbase; DNA; ANG; 422 BP. XX AC . XX DT 03-SEP-2010 (Rel. 15.09, Created) DT 03-SEP-2010 (Rel. 15.09, Last updated, Version -1) XX DE Putative non-autonomous DNA transposon: consensus. XX KW DNA transposon; Transposable Element; nonautonomous; Clu-38_AG. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. 
XX RN [1] RP 1-422 RA Fernandez Medina R.D., Struchiner C.J. and Ribeiro J.M.C.; RT "Transposable elements from Anopheles gambiae."; RL Repbase Reports 10(9), 1440-1440 (2010). XX DR [1] (Consensus) XX CC 3bp TSD. >96% identical to consensus. XX SQ Sequence 422 BP; 140 A; 82 C; 79 G; 121 T; 0 other; aggcagcggt ctaatcttgt tgcgcgcaac attgttgctc ggcaacatat cgctgaggtg 60 gtctatgaat gttgctggca acatttcgct ttgacattgt ttagcgtaaa aaattgccaa 120 ttaacagctc agagtttatt ttttatcatt cacttgttga ataaagtgtt gaaaaaataa 180 aatatacacc aaaaaatgta ttataaatgc ttaaaatcca aatttagcgg atgtcaacaa 240 aacgcacact gctgctgtgt catttcatac acccgattac catgaccaag gtaaatagaa 300 aatgttgcca gacaaaaatc aaattggttt gaatttggcc ggcaacaaag ttgcgctagc 360 tgtcaaaatc catacatttt gccagcaaca atgttgcgcg caacaagatt agaccagtgc 420 ct 422 // ID GYPSY65-I_AG repbase; DNA; ANG; 4780 BP. XX AC . XX DT 09-JUL-2004 (Rel. 9.06, Created) DT 09-JUL-2004 (Rel. 9.06, Last updated, Version 1) XX DE GYPSY65-I_AG is an internal portion of retrotransposon GYPSY65_AG DE - a consensus sequence. XX KW Gypsy; LTR Retrotransposon; Transposable Element; 5-bp TSD gag; KW AP protease; GYPSY65-I_AG; GYPSY65-LTR_AG; Gypsy clade; RNase-H; KW integrase GYPSY65_AG; mag lineage; reverse transcriptase. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-4780 RA Tubio M.J., Costas C.J. and Naveira F.H.; RT "GYPSY65_AG, a member of the mag lineage of the Ty3/gypsy group RT of LTR retrotransposons in Anopheles gambiae."; RL Repbase Reports 4(6), 169-169 (2004). XX DR [1] (Consensus) XX CC GYPSY65_AG is a family of Gypsy-like LTR retrotransposons that, CC according to the amino acid sequence of its reverse CC transcriptase, RNase and integrase domains is phylogenetically CC grouped with representatives of the mag lineage of other CC organisms. 
GYPSY19_AG, GYPSY20_AG, GYPSY21_AG, GYPSY22_AG, CC GYPSY23_AG, GYPSY24_AG, GYPSY25_AG, GYPSY26_AG, GYPSY27_AG, CC GYPSY28_AG, GYPSY55_AG, GYPSY56_AG, GYPSY57_AG, GYPSY58_AG, CC GYPSY59_AG, GYPSY60_AG, GYPSY61_AG, GYPSY62_AG, GYPSY63_AG, CC GYPSY64_AG, GYPSY66_AG, GYPSY67_AG, GYPSY68_AG and GYPSY69_AG, CC are other members of this same lineage in Anopheles gambiae. CC The GYPSY65-I_AG consensus was reconstructed after multiple CC alignment of 3 copies. The consensus encodes the 1331-aa CC GYPSY65_AGP gag-pol like polyprotein (pos. 689-4681). The CC sequence of the LTRs flanking GYPSY65-I is deposited as CC GYPSY65-LTR_AG. CC GYPSY65_AGP: CC MEPNKLFGNLPPLSLAGVPLTERRTKWTTWKRGFEIWLRTAKIVDGTEKKNLLLTCGSLELQEI CC FFSIPDADVEADVDQGVDPYEIAILKLDEYFAPQRHEAHERFLFWSMKPEPEETLEKFVMRAQA CC HGSKCNFGATAAESAGIAIIDKVLQFVPANLRVKLLQEKNLSLDEVIKQINSYETSRVANEQMS CC GRSKSFEIDNQHIHHIKTLCRFCGRTHGQQSCPARDKTCAKCGKRGHFAVVCYSNAGQTSNRLH CC TNQRMQKRPFGGHFNDSELNKKTAKFARKIHAIQDDGNEPLECELVEMVSSANDSDELIWAKVG CC GILIEMQIDSGVQSNIIDDRTWASMRNNQVKIIGEARSPDRKFKAYAQTDCLEIITMFEAEIAI CC ADGLKELRTISKFYVVKNGPQPLLGKITAKQLGVLYVGLPSQHESINQVETFRAFPSIRGVSIH CC IPIDKTVTPVAQRLRRLPLPMLDKVDSKLNELVAKDIIEKVSEPSSWVSPMVIVVKDCGSIRLC CC IDMRQVNKAILRETHPLPTIEDIRWRMNGAKYFSRLDIKDAFHQLLLDEESKTLTTFITHRGLY CC RYKRMMFGISCAPEKFQKILEQVLADCPNTVNFIDDIIVTGKTETEHDLALEKVMEKIEEYGIL CC LNQSKCVFKLTEIEFVGQRFNQNGMLPALNKIEAIRSFRPPKNCEEVRSFLGLITYVGTFIPNL CC ATVSFPLRELTKNNAEFVWERDQNKAFNELIRLVSNVERLAHFDPHLKTRVVADASPVGLGAVL CC LQFQNEQPKVIAYASKSLTSTEQRYAQTEKEALALVWAVERFQIYLFGIRFELETDHKPLEAIF CC SPTSSPCLRIERWVLRLQAFSYDVIYRKGKTNIADPLSRLSSPTEVSEFDPDSTVYIRSVMENA CC AIDVQEVEIASSNDAEMRALKECLERGAWNYTDELLKPYQAFRLELGTVGDLVVRGSKLVIPKA CC IRQRMLELAHEGHPGRTKMQQRLRCTCWWPGMDEAIARLVSSCSGCQLVSQPDKPEPMKRRPLP CC HKPWVDVAIDFLGPLPTGDYLLVIIDYFSRYKEVEIVKKITAIETSERLERIFVRLGYPRTITL CC DNGRQFVSNYFDNYCKQRGILLNKTTPYWPQENGLVERQNRSLLKRLKISQAQNGDWKKDLGAY CC LSMYYATPHSTTGKTPSELMFNRNIRTKIPSLGDISTGSALPMSDYRDRDTCMKEKGRVTEDQR CC 
RKAKPSDIIVGDRVLLRNTMPGNKLSTPFGPQIAKVIEKQGSRVTVQDEQNGKLYDRNSSHLKE CC YKDPDDHRDGCERVVDNQEEGVGFNEPTTEEIPRTLSRPKREIKRPARFLS. XX SQ Sequence 4780 BP; 1616 A; 902 C; 1148 G; 1114 T; 0 other; aattggcgac gaggacagga ttaattttct ttattcctat ttgcaacacg agattgaagc 60 atttgaattg aaataaaata gggtaagtta tcagagaagg acgaaagaaa tggggaaaag 120 aaagattaag agcagaaatg aaataaagcg ttgaaagcat catagtgtta gtgctggctt 180 gggttatgcg gagtacaaag gaagacaaaa ggataagtga atgctctgtg tcgagaaaac 240 agaatggccg cctcggggga aaaaaaaaac aaaagagatg tgtagaaaga gctgtgtcaa 300 gcaaacagca gtgaagcgtt atggaagcga aaggcaaaat tataagatct ataaaaagcg 360 ctgtgtcttg aaaacagcaa ggccgcgtct tgcaagcgga tgaaataatt gaagatgcct 420 ttaatgagcc gcgtctggaa gcggtagtgg aaaatgaaga agaggaagaa agagaatcaa 480 ataagtacgc gtctagagag cggaaaaaaa agcgaagaaa aggaaggaag agaatttaat 540 aaatgcgcgt ctagagagcg aggaaaaaaa ggaaatactc gatattaata gttatgactc 600 gggaaagagt catgaatctg tcatgaaata gtgaactaat tgaatacttg atcgtattca 660 caacctgctt catttcaggg aatcagcgat ggaaccaaac aaattatttg ggaatttacc 720 accactgtcg ttagcaggcg ttcccctaac cgagcggcga acaaaatgga cgacgtggaa 780 acgggggttc gaaatttggt tgcgtacggc gaagatagtg gacggaacgg agaaaaagaa 840 tttactttta acttgtggtt ctctcgagct gcaagaaata tttttcagca ttccagatgc 900 agatgtagag gcggacgtag accagggcgt agacccatac gagattgcta ttcttaaatt 960 ggatgaatat ttcgctccac aaagacacga agcacacgag agatttttgt tttggtcaat 1020 gaaaccggag ccggaagaaa cgctagagaa atttgtaatg cgtgcccaag ctcatggctc 1080 caagtgcaat ttcggggcaa ctgcggcaga gagtgcaggg attgcaatca ttgacaaagt 1140 gcttcagttt gtaccggcaa atttgcgagt caaacttctg caagaaaaaa acttgtctct 1200 ggacgaagtc attaaacaaa taaattcgta tgaaacatcc cgagttgcca acgaacaaat 1260 gagcggtagg agcaaatctt ttgaaataga taaccaacac atccatcata taaaaacatt 1320 gtgtcgattc tgtggtcgta cccatgggca acaatcttgc ccggcacgag acaaaacgtg 1380 cgcaaaatgt ggcaagcgag gtcatttcgc tgttgtatgc tactcaaatg ctggacaaac 1440 ttcaaataga ctacatacga atcaaagaat gcaaaagagg ccgttcggag ggcactttaa 1500 tgattcggaa ctaaataaga aaacggcaaa gtttgcaagg aaaattcacg 
caatccagga 1560 tgacggaaat gaacctcttg agtgtgaatt agtggagatg gtatcttcgg caaacgattc 1620 agacgaattg atctgggcga aagtaggcgg tatccttatc gagatgcaaa ttgattctgg 1680 agtccaatcg aatatcatag acgatagaac gtgggcgtca atgaggaata accaagtcaa 1740 gatcatcggc gaggcacgga gtccggatcg gaagttcaag gcatacgctc agaccgattg 1800 tttagaaata attacgatgt ttgaggcgga aatagctatt gcggacgggt tgaaagaact 1860 ccgtacaatt tccaaatttt acgtagttaa aaatggtcca caaccacttt tgggaaaaat 1920 caccgcaaaa cagcttggag tattatatgt tggattacct agccagcatg aatcaatcaa 1980 ccaggttgaa acatttcgcg cgttcccctc aattcgtgga gtcagcatcc acattccaat 2040 tgataagacg gtgacacccg tagctcagcg actccgacga ttgcctttgc ccatgctaga 2100 taaagttgat agtaaactaa acgagttggt cgctaaagac ataatagaga aagtatcgga 2160 accgagcagc tgggtttcac ccatggtgat agttgtaaaa gattgtggaa gtattcggct 2220 ttgcatagac atgcggcaag ttaataaagc aattttgcga gaaacgcatc cactaccaac 2280 catagaagac atacgttgga gaatgaatgg tgctaaatac ttttctcgcc ttgacataaa 2340 ggatgctttt catcaactac tgctcgatga agaaagtaaa actctcacaa cttttatcac 2400 ccaccgtgga ctttatcgtt ataaacgaat gatgtttggg atttcatgcg ctccagagaa 2460 attccaaaaa atattggaac aagttttggc cgactgtcca aacacagtca attttatcga 2520 cgatatcata gttacgggaa aaactgagac cgaacacgat ttagcactag agaaagtgat 2580 ggaaaagatt gaggaatatg gcattttact taatcaatct aagtgcgttt tcaaactcac 2640 agaaatcgaa ttcgttggcc aacgttttaa ccagaacgga atgctccctg cactaaacaa 2700 aattgaagca ataagaagct ttcggccacc caaaaattgt gaagaggttc ggagttttct 2760 gggattaatt acttacgtgg gaacgttcat tccaaactta gctacagttt cattcccact 2820 acgcgaatta acaaagaaca acgccgaatt cgtttgggag cgagaccaga acaaagcatt 2880 taacgaactg atccggttag tttctaacgt cgaaagatta gcccattttg atcctcatct 2940 aaaaacccga gtagtggcag atgcgtcccc agtagggtta ggagccgtcc tccttcagtt 3000 ccagaacgag cagcctaaag taatagctta tgccagcaag agcctaacga gtaccgagca 3060 acggtatgca caaacagaaa aggaggcact agcgttagta tgggccgtgg aacggttcca 3120 aatctatcta ttcggaatac gattcgagct ggaaacggac cataaacctc tagaggctat 3180 ctttagcccc acatcttctc cttgtttgcg gattgagcgt tgggtcttaa ggctgcaggc 
3240 ttttagctat gatgtaattt atcggaaagg aaaaacgaat atagctgatc ctctgtctcg 3300 gttgtcaagt cctactgaag tatcagaatt tgatccagat tctacagtat atatccgcag 3360 cgtgatggaa aatgctgcaa tcgatgtgca agaggtagag attgcttctt cgaatgatgc 3420 cgaaatgcgg gctcttaagg agtgtttgga aagaggtgcg tggaattata cagatgaact 3480 actgaaacct taccaagcat ttcgattgga actcggaaca gttggagacc tagtagtccg 3540 cggtagcaaa ctagtaatac cgaaagctat ccggcaaaga atgctagaac tggcacacga 3600 agggcatcct ggacgcacta aaatgcaaca acggctgcga tgtacttgct ggtggccagg 3660 aatggatgaa gctattgctc gactggttag cagctgctca ggctgccaat tggtcagcca 3720 accagataaa ccggagccca tgaagcgaag gccactacca cataagcctt gggtagatgt 3780 agcaatcgat tttttgggcc cactaccaac tggtgattat ctactagtaa tcattgatta 3840 cttcagtcgg tacaaagagg tcgaaattgt taagaaaatt acagcgattg aaacatctga 3900 acgacttgaa cgaatatttg tgcgtcttgg ctatccgaga accataacac tcgacaatgg 3960 aaggcaattt gtgagcaatt attttgacaa ctattgcaaa cagcgaggaa tattgcttaa 4020 caagacaacc ccatattggc cgcaagaaaa tggtctggtt gaacgccaga acaggtcttt 4080 gctaaaaagg ctgaagataa gccaggctca gaacggggat tggaaaaaag atttgggggc 4140 gtatctctcc atgtactacg ccactccaca cagcaccaca gggaaaacac cgagtgagtt 4200 gatgttcaac cgaaatatcc ggacaaaaat accatcattg ggagacatca gtaccggatc 4260 tgcattacca atgtctgatt acagggatag agacacatgt atgaaggaaa agggaagagt 4320 gacagaagat caacgacgca aagcaaaacc gtctgatata atagtaggag atagagtcct 4380 attgaggaat acaatgccag gcaacaaact gagtacaccc tttggaccac agattgcaaa 4440 agtaattgag aagcaaggtt ctcgcgtaac agtacaggat gagcaaaacg ggaaattata 4500 tgacaggaat tcgagccacc taaaagagta caaggaccca gacgatcata gggatggttg 4560 tgagagggtg gtggataacc aagaggaagg agtaggtttt aatgaaccaa cgaccgaaga 4620 gataccgaga acactctccc gtccgaaacg agaaattaaa cgtcccgcta ggttcctatc 4680 atagagtaag agattgggtg aaataatcag atgaaatcaa atagtaaata aataatttgg 4740 ggaattgtat ttatcgttca ttattcggaa aaaagggaga 4780 // ID GYPSY4-LTR_AG repbase; DNA; ANG; 270 BP. XX AC . XX DT 08-MAY-2003 (Rel. 8.04, Created) DT 21-SEP-2005 (Rel. 
10.1, Last updated, Version 2) XX DE GYPSY4-LTR_AG is an LTR of the GYPSY4_AG LTR retrotransposon - a DE consensus sequence. XX KW Gypsy; LTR Retrotransposon; Transposable Element; 4-bp TSD; KW GYPSY4-I_AG; GYPSY4-LTR_AG; GYPSY4_AG; Gypsy clade; GYPSY34-I_AG; KW GYPSY34-LTR_AG. XX NM GYPSY4-LTR_AG. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-270 RA Pavlicek A., Kapitonov V.V. and Jurka J.; RT "GYPSY4_AG, a family of LTR retrotransposons from African malaria RT mosquito."; RL Repbase Reports 3(4), 78-78 (2003). XX DR [1] (Consensus) XX CC GYPSY4-LTR is a long terminal repeat of GYPSY4_AG (its internal CC portion is deposited as GYPSY4-I_AG). XX SQ Sequence 270 BP; 74 A; 73 C; 53 G; 70 T; 0 other; tgtaacgtcc ggactaatat cgcccactgt cactcaaccc gaaccccgaa cgcagcggta 60 taaacgcatt ataccttgac cgctggagca cccggtgtgc tagatgaact gtcatagaat 120 aaagctctct tcttggcgcg acattgaact gaacagacgt aagccactga cttctgcgta 180 taattatttg tgtgctcttc cgaattgtgc taaatcttat taaaacggcc aattaacctt 240 ccgccaaccg taaaacgctt ggtcgttaca 270 // ID GYPSY44-I_AG repbase; DNA; ANG; 5554 BP. XX AC . XX DT 16-APR-2004 (Rel. 9.03, Created) DT 16-APR-2004 (Rel. 9.03, Last updated, Version 1) XX DE GYPSY44-I_AG is an internal portion of retrotransposon GYPSY44_AG DE - a consensus sequence. XX KW Gypsy; LTR Retrotransposon; Transposable Element; 4-bp TSD gag; KW AP protease; GYPSY lineage; GYPSY44-I_AG; GYPSY44-LTR_AG; KW Gypsy clade; RNase-H; reverse transcriptase; KW integrase GYPSY44_AG. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-5554 RA Tubio M.J., Costas C.J.
and Naveira F.H.; RT "GYPSY44_AG, a member of the Gypsy lineage of the Ty3/gypsy group RT of LTR retrotransposons in Anopheles gambiae."; RL Repbase Reports 4(3), 80-80 (2004). XX DR [1] (Consensus) XX CC GYPSY44_AG is a family of Gypsy-like LTR retrotransposons CC that, according to the amino acid sequence of its Reverse CC Transcriptase, CC RNase and Integrase is CC phylogenetically grouped with representatives of the GYPSY CC lineage of other organisms. CC GYPSY39_AG, GYPSY40_AG, GYPSY41_AG, GYPSY42_AG, GYPSY43_AG, CC GYPSY45_AG, CC GYPSY46_AG and GYPSY47_AG are other CC members of this same lineage in Anopheles gambiae. CC The GYPSY44-I_AG consensus was reconstructed after multiple CC alignment of 5 copies. CC The consensus encodes the 433-aa GYPSY44_AG1p gag-like CC polyprotein (pos. 385-1683) and the 1051-aa GYPSY44_AG2p CC pol-like polyprotein (pos. 1638-4790). CC The sequence of the LTRs flanking GYPSY44-I_AG is deposited as CC GYPSY44-LTR_AG. XX FH Key Location/Qualifiers FT CDS 385..1683 FT /product="GYPSY44_AG1p" FT /translation="MFNGSQSFVQPAPNILELGKVPDFVRDLRNFDGQPNE FT LNNWIDDVESIMNLCERCRVGETSLMYELIQKTIRRKICGQAADVLNSNNI FT TSKWSEIKETLLLYYCDKRDLKTLDFELTTVKRGKSEPLNSYYGRVNELLA FT SLVTQVQTQDSYKNNASVHIDFFREKALDAFIRGLDKPLSILLKTASPTSL FT AKAYQFCLEYENMDCRSYISGNRMNNTSIPIPKPKELSVPFLPVTTPRGNF FT QVQKPPVPLPRRQPFNNHYNPNQIGNTNTNRSHQMRRDLPVPMEVDESIRS FT RYTSANPIAYNNHSTHTYSNAIHNPNTSNNFVQYSDSNSTNEFYQPTNTIP FT HQYVPNPTQHTNAHYNQFTNTYPNTLYNQYTNSSLSPSPSNFNQQIHPLER FT EARITHTNEMTYLPNYQAPGKDINTQEQPNHFLGMNNVW" FT CDS 1638..4790 FT /product="GYPSY44_AG2p" FT /translation="YTRATEPFFRNEQCMVEPVLPYIVIKNRKGEFPFLID FT TGANVSFINPEIAESCETAKPFEIPLKNIASASGKFQLKSAIDLNFFYPKI FT NCICRFMLHSFHHFFVGIIGTDILYNLNASIDLEKRVLKLHNFSQTLQIPL FT NFYTSSSQSSNSSFRLNHLNNFEQTKLKKVLDSNMSVFYEPKLKLTCTTRV FT ECCINTTDDIPIHQKVYPYPAAYTEEVNKQITELLDNGIIRPSHSAWTAPV FT WVVPKKSDASGEKKFRMVIDYRKLNQKTIADRYPMPEINYVIDQLKGHTLF FT TTLDLASGFYQIKMRPSDIEKTAFAINNGKYEFVRMPFGLKNGPSIFQRVI FT
DDVLREEIGKTCYVYMDDVIIFGKNLTDHLINLDKILKLLNEANLKVQLDK FT SEFLHSEIEFLGYVIGCNGIKPNIKKIEVISNYPIPKTLKELRGFLGMMGY FT YRRFVKDFAKIAKPLTNLLRGEESPSSNKKIELEKNEIECFHKMKKILSSE FT DILIYPDYKKPFVLATDASDFAIGAVLSQDEDGKDLPIHFASRTLSKAEEK FT YSVPEKEMLAIFWALKTFRNYLYGATFKIITDHQPLTFALSSKNSNAKLKR FT WKSYLEEHDYELIYRPGKNNVVADALSRIAFSMTGTQHSADTSDDFFIIAT FT EAPINVFKQQIILKMGPDNIETKILFNSYKRITISISNINEASILKILQEN FT FDHSKINGLFTTEAIMGQIQEVYRTHFGNQRLLKIRFSQKILEDIEDEEEQ FT YNLIKKEHFRAHRCAEENKSQILRRFYFPQIRKKIIEFIKECEICHINKYD FT RKPVSYPLQKTPIPHKPFEITHIDIFFIENHHFITYIDKFSKFAQIKLTPT FT RAAIDVVPIIKELITKYHPPKILVLDGEKTFSSRELKTFFQSLEISTYVTP FT VGHSEINGIIERFHSTILEIYRISKMENPTLSPPDIIQISVQKYNNTIHSS FT TKYTPQEIILPSIDSSKIIETVYQNLIQKQSQDLKYHNKSRKVIQIAENKP FT IYEKTKARLKHKPRFKKKTVVKVNNSTVNLEENRKIHKNDLKIVR" XX SQ Sequence 5554 BP; 2044 A; 997 C; 897 G; 1616 T; 0 other; ggcgcccgaa tcaggagagc cctagacggg aagagataag ctcagagcag aaaaaaaaat 60 cttcacaaat ctcagagagc aatactctct gaaacagtgg tccaaagagc cactactgca 120 aaagagcgaa gcgaagcttg ggacactgtg gttatcaccc gatcgctgta cgagacccct 180 agtaaccgga ccggtaaaaa ggattcccct ccaccagctg cctaaccttc ccccgaacgg 240 agtatcacca acgagttaca acggagcatc accgacggag cacgacgaaa atcagcgttg 300 gtaagcaata atagtgtctg tgtgagtatt attttgtaag tattcatagt gtgtggtgca 360 aagtgacagt gtcgaagtga aaaaatgttc aacggcagcc agagttttgt tcaaccggct 420 ccaaatattc tggagctcgg taaggtgccc gattttgtga gagatcttcg caatttcgat 480 ggacaaccaa acgagttgaa taactggata gatgatgtag aaagcatcat gaatctttgc 540 gaacggtgtc gagtaggaga aacttcttta atgtatgagt tgatccaaaa aacaatccga 600 cgtaaaattt gcggccaagc tgccgacgtc ttaaattcta acaatattac ttctaaatgg 660 tcagagatta aagaaacgtt gttattatat tattgcgata aacgagactt aaaaacatta 720 gattttgagc tgacaaccgt taaaagagga aaaagtgaac cattaaattc atactatggc 780 agggttaacg aactattagc atctttggta acacaagtgc agacacaaga tagttataaa 840 aataacgcaa gtgttcatat tgattttttt agagaaaaag cattggatgc tttcattcga 900 ggcttggaca agcctttgtc aattttgctt aaaactgcta gccctacgtc attagcaaaa 960 gcttatcagt tttgcttaga gtatgaaaat atggattgta gatcatacat 
tagtggtaat 1020 agaatgaaca atacttctat tccgattccc aaaccaaaag agctatcggt accatttttg 1080 ccggttacca ctccacgtgg taattttcaa gttcaaaaac ccccggtacc gttacctcgc 1140 cggcaaccat ttaacaatca ctataatcca aaccagatcg gtaacacaaa tactaacaga 1200 agccatcaaa tgcgacgaga tctacccgta cccatggagg tagatgaaag tattagaagc 1260 cgatatactt cagctaaccc tattgcatac aataatcatt ccacacatac ttattcaaat 1320 gcaatacata acccaaatac aagcaataat tttgtacaat acagcgattc gaattctact 1380 aacgaattct accaaccaac taacacgatc ccacaccagt acgtccctaa tcctacacag 1440 cacactaatg ctcattataa ccagtttact aacacatatc cgaatacatt atataatcaa 1500 tacactaata gctcattatc accatcacca tcaaatttta atcagcaaat tcatccgtta 1560 gaaagagaag ctcgcatcac tcataccaac gagatgacat atttacctaa ttaccaagca 1620 cctggtaaag atattaatac acaagagcaa ccgaaccatt ttttaggaat gaacaatgta 1680 tggtagaacc tgttctccca tacattgtta ttaaaaatag aaaaggagaa tttccattcc 1740 ttatcgatac aggtgcaaac gttagtttta ttaatcctga aattgcagaa tcttgcgaaa 1800 cagctaaacc atttgaaata cctttaaaaa atatagccag tgctagtggt aaatttcaat 1860 tgaaatcagc tatcgattta aattttttct atcctaaaat taattgcatt tgtagattta 1920 tgttacattc ttttcatcat ttttttgtag ggataattgg aacagatatt ttgtataact 1980 taaatgcaag tatcgattta gaaaaacggg tacttaaatt acataatttt tctcaaacat 2040 tacaaatacc attaaatttt tatacctctt catcacagtc atccaattct tcatttagat 2100 taaatcacct aaataatttt gaacaaacta aacttaaaaa agtgttagat tcaaatatgt 2160 ctgttttcta tgagcccaaa ttaaaactta catgcactac tagagttgag tgttgtataa 2220 ataccactga cgatatacca atacaccaaa aggtatatcc ttatccagca gcgtataccg 2280 aagaggtcaa taaacagata acagaattgc tggataatgg tattatcaga ccttcgcact 2340 ctgcctggac ggcacctgta tgggtcgttc caaaaaaatc agacgcctct ggagaaaaaa 2400 aatttcgcat ggtgatcgac tatcgcaagc taaatcaaaa aacaatagct gatcgatatc 2460 ccatgcctga aatcaattat gttattgatc aactgaaagg acacactctt tttacgacgc 2520 tagatttagc ctctggcttt tatcaaataa aaatgagacc gagtgatata gaaaaaacgg 2580 catttgctat caacaacggc aaatatgaat ttgttagaat gccatttggg cttaaaaacg 2640 gtccttccat atttcaaagg gtaattgatg acgtccttcg agaagaaatt ggtaaaactt 
2700 gttatgttta tatggatgac gttattatct ttggaaaaaa cttaactgat catttaatta 2760 atttagataa aattttgaaa ctattaaatg aagcaaactt aaaagttcaa ctcgacaaat 2820 cagagttttt gcactctgaa attgaatttt taggctacgt aattggatgt aatggtataa 2880 aaccaaacat taaaaaaata gaagtaataa gcaattaccc tattccaaaa accttaaaag 2940 aactaagagg atttttgggg atgatgggtt attacaggcg atttgtaaaa gattttgcaa 3000 aaattgcaaa acctcttaca aacctcttaa ggggagagga gagtccatca tccaacaaaa 3060 aaatagaatt ggaaaaaaat gaaatagaat gtttccataa gatgaaaaaa atactttcat 3120 cagaggatat acttatttat cctgattata aaaaaccttt tgttttggcc acagacgcat 3180 ccgactttgc tattggcgct gtgttatcac aggatgaaga tggaaaagat ctaccaattc 3240 attttgcctc acgtacctta tcaaaggcag aagaaaaata ctctgtacca gaaaaagaaa 3300 tgttggcgat tttctgggca cttaaaactt tcagaaacta tttgtatgga gccaccttta 3360 aaatcattac cgatcaccaa cctttaactt ttgcactgtc atcgaaaaat tctaatgcaa 3420 agctaaaacg atggaaaagt tatttggaag agcacgacta tgaactaatt tataggcccg 3480 gtaaaaataa tgtagttgcg gacgctctaa gtagaattgc attttcgatg acaggtacgc 3540 aacactcagc agatacttct gacgattttt ttattattgc tacagaagct ccaattaacg 3600 tatttaaaca acaaattatt ttaaaaatgg gccctgataa tattgaaact aaaattttgt 3660 ttaatagtta taaaagaatt actatttcca tcagtaatat taacgaagct tctattctta 3720 aaattttgca agaaaacttc gatcactcca aaataaatgg cctatttact actgaagcta 3780 ttatgggtca aattcaagaa gtctatagga cccattttgg gaatcaaaga cttttaaaaa 3840 ttcgtttttc tcaaaaaata ttagaagata tagaagatga ggaggagcag tacaatttaa 3900 taaaaaaaga gcattttaga gctcatagat gtgccgaaga aaataaaagc cagattcttc 3960 gaagattcta ttttcctcaa ataagaaaaa aaattattga atttataaaa gaatgtgaaa 4020 tttgtcacat aaataaatac gatagaaagc ctgtatctta ccctttacaa aaaactccta 4080 ttcctcataa accatttgaa attacacata tagatatctt ctttatagaa aatcatcatt 4140 tcataactta catagataag ttttctaagt ttgctcaaat taaattgact ccaacacggg 4200 ctgctataga tgtagttcct attattaaag aattgattac aaaataccat cctccaaaaa 4260 tactagtctt agacggtgaa aagaccttca gcagtcgtga attgaaaact tttttccaat 4320 ctcttgaaat ttctacctac gtcacacctg ttggtcacag tgaaatcaac ggtattatag 4380 
aacggtttca ttcaaccatt ttagagattt atcgaatctc aaagatggaa aacccgacgt 4440 tatcaccacc tgatatcata cagatatcag ttcaaaaata caataatacc attcactcct 4500 ccacaaaata cacacctcaa gagattatct taccttccat agactcatct aaaataatag 4560 aaacagttta tcaaaatctt attcaaaaac aatcacagga cttaaagtat cacaataaat 4620 cacgaaaggt gatacaaata gcagaaaaca aacctattta tgaaaaaact aaagcgagat 4680 taaagcacaa acccagattt aaaaagaaaa cagttgttaa ggtaaataat agtacagtaa 4740 accttgaaga aaatagaaaa attcacaaaa atgatttgaa aattgtgaga tagtttagag 4800 ttgagttgat caatgaaatg atgcatgtga aatcttattt ttgtctcttt tgaagaaaag 4860 agttctgaaa ttagcaattt tgaaatttaa ataattatgc accattgttc atatacattt 4920 tcatgtattg cacaggctac tagcagttag atagaatatc taactaataa ttagtaatgt 4980 ggaagaattt gttataaaaa aaagaaatta aataactcaa accagctgaa tatgtgaaac 5040 agaaatttaa aactatctta cagtttcacg ttaaactccc atgctctcgt cccatataag 5100 gctttatatt ttgagtgagc attgctgaca tataacactt gtgaattcag atttgagtga 5160 gaattgaaaa ttatgaaaaa ggccactatc gttcgaacac aaaatatccg tgaaactgta 5220 atgtttctat ggtggataat acaatatggt tataatgcca atcacctgaa aattttgcaa 5280 cgtatgtaca aaaagcatta tttcgtgctt gatatcggat gcaggtagaa catctaagaa 5340 agattgcaaa aaagggatgg taaaccataa gaacccaact ccaaacagga gtttttcatc 5400 gagattatgg acgaacaccc tggcatttca tacatcggct ctatgcataa aatgtttctc 5460 ccatctctgt ttttgtttgg ttatatgcct gtaatgttat agtttcatca tatgattcct 5520 tacgatgtgg gcatcgttct tttaagctgg gagg 5554 // ID Mariner-N18_AG repbase; DNA; ANG; 710 BP. XX AC . XX DT 28-FEB-2009 (Rel. 14.02, Created) DT 01-MAR-2009 (Rel. 14.02, Last updated, Version 1) XX DE Putative nonautonomous Mariner DNA transposon - a consensus DE sequence. XX KW Mariner/Tc1; DNA transposon; Transposable Element; Nonautonomous; KW Mariner-N18_AG. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. 
XX RN [1] RP 1-710 RA Jurka J.; RT "Putative mariner/Tc1-like DNA transposons from African malaria RT mosquito."; RL Repbase Reports 9(2), 643-643 (2009). XX DR [1] (Consensus) XX CC TA TSD. XX SQ Sequence 710 BP; 223 A; 125 C; 127 G; 226 T; 9 other; tacagggttt acctagccaa tcgggcgtcg tcaaagcgtt atcggtcgcc gtaaatactt 60 tttcgggcga cgtaaaccct caattgggca ctgtcatttc tattcgggcc ttgtgaagta 120 tgaatcgggc aaagtacaga atgaatgcgg cctactttat attttgttgc tttctagctg 180 taaaacaata cttttttgcg agtagtttaa ctgatatgta agaaaattca ctatttttaa 240 gtttaattac ataaaaaatt attcaaattc cttttttaaa atttttaatt caaacccgta 300 caatgcaatt tggttttttt ttaaatatat attataaaat agtgcatttc taaactatgt 360 ccgattcata cttyacaagg yytgaataga aatgacagtg cyyaattggt gatttaaktg 420 aaaaagtatt tatggtgaty aacattttaa cgaattgact acataacntg taagttaaac 480 aacatgcaaa atacgtattt ttaaacaact tgatggatca aacacaccgr agcgaccgat 540 ttgacacctt caagagtagg acgctttcat tctgtacttt gcccgattca tacttcacaa 600 ggcccgaata gaaatgacag tgcccaattg agggtttacg tcgcccgaaa aagtatttac 660 ggcgaccgat aacgctttga cgatgccgga ttggctaggt aaaccctgta 710 // ID TransibN3_AG repbase; DNA; ANG; 959 BP. XX AC . XX DT 21-MAR-2005 (Rel. 10.03, Created) DT 01-JUL-2005 (Rel. 10.08, Last updated, Version 2) XX DE TransibN3_AG is a family of nonautonomous DNA transposons - a DE consensus sequence. XX KW Transib; DNA transposon; Transposable Element; Nonautonomous; KW Interspersed repeat; TransibN3_AG. XX NM TransibN3_AG. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-959 RA Kapitonov V.V. and Jurka J.; RT "RAG1 core and V(D)J recombination signal sequences were derived RT from Transib transposons."; RL PLoS Biol 3(6), (2005). XX DR [1] (Consensus) XX CC TransibN3_AG is a family of nonautonomous DNA transposons that CC belongs to the Transib superfamily. 
TransibN3_AG elements are CC characterized by 13-bp terminal inverted repeats (1 mismatch) and CC 5-bp target site duplications. XX SQ Sequence 959 BP; 336 A; 154 C; 145 G; 324 T; 0 other; cacagtgggc agctgccggg atgaaagtcc aaaaattaat tatcatattt ctttaatccg 60 aggaaagcat tgtaatctac agtatcaaac taagaaaaac acgagatttc agtcccctag 120 tcgtattagt tattacgata gagactccta aagtcttagg tcttgaaaaa actcatcata 180 acaactatat ttttttacct tctttaaacc attgtaactt tcacaaaaat caatcaaatt 240 ttaaaaatga atatgttttg gaaaggtttt aatctattct aaacaaaaat atttagttta 300 tgaaaatatg tcaattactt aaaaagttaa agagtatagt gtgcactact tcctctaaat 360 tgtatcaatc atgtatcttt gacatcatgc caaatttgaa caccgtaaaa caccgtggta 420 tcatgtatac cacgtcaaag ctttacatct aaacaaaaga tctgtaaaga aattagccgc 480 tactttcgcg caactttggc agatattggc attttaatga cagtataaaa tactttcaat 540 aaaacaaagc gttttctggt gaagtatgac taatatccgt atatttggga tgtttagggc 600 agtttttgat gaaatttgat catcgaatgt gatgatcaat gatgacgtga tgattgatca 660 tcgatatttc ggaatgttgc tagcaaacca aatgtaccat gtaagcttta ttttattaca 720 gtcagggtac gtagtatatt aagtataacc ccgtatattc gttacgcagc caaagaagaa 780 catgtatggg atagtaagga tattttcaaa aataattata atctttgcta cataatcata 840 aaacattcat attttgttta atttagattt tcaacattta cactataatg tacctaagaa 900 atcatgtatt caatgacaat tacttttatc ccgctttttg ttctacggtg cccactgtg 959 // ID RETRO932_AG_LTR repbase; DNA; ANG; 199 BP. XX AC . XX DT 06-FEB-2003 (Rel. 8.01, Created) DT 06-FEB-2003 (Rel. 8.01, Last updated, Version 1) XX DE Anopheles gambiae long terminal repeat from RETRO932_AG DE retrotransposon - a consensus. XX KW BEL; LTR Retrotransposon; Transposable Element; AGM1; KW Long terminal repeat; RETRO932_AG_I; RETRO932_AG_LTR; ROO; KW retrotransposon. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-199 RA Jurka J. 
and Drazkiewicz A.; RT "RETRO932_AG_LTR: LTR retrotransposon from Anopheles gambiae."; RL Repbase Reports 2(12), 19-19 (2002). XX DR [1] (Consensus) XX CC Related to AGM1 from Anopheles gambiae, ROO and BEL from CC Drosophila melanogaster. 5 bp target site duplication. XX SQ Sequence 199 BP; 76 A; 27 C; 44 G; 52 T; 0 other; tgttgacgcc ttacggtagg caatgtgcat ggcacaatta gaattaggac aactagggta 60 attagtacat aaaggattag atcgtaatta taaataagaa gaattagaag agcagtgaat 120 ttgacaggga ataaacatta gcacttgaaa cctcggcgag ataacaacat tattttttac 180 gattgagaaa gcgtcatca 199 // ID AgaP12MITE205B repbase; DNA; ANG; 206 BP. XX AC DQ301487; XX DT 22-AUG-2006 (Rel. 13.07, Created) DT 31-JUL-2008 (Rel. 13.07, Last updated, Version 1) XX DE Anopheles gambiae str. PEST clone AgaP12MITE205B P MITE, complete DE sequence. XX KW P; DNA transposon; Transposable Element; Nonautonomous; KW AgaP12MITE205B. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-206 RA Quesneville H., Nouaud D. and Anxolabehere D.; RT "P elements and MITE relatives in the whole genome sequence of RT Anopheles gambiae."; RL BMC Genomics 7(1), 214-214 (2006). XX RN [2] RP 1-206 RA Quesneville H., Nouaud D. and Anxolabehere D.; RT "Direct Submission."; RL Direct Submission to Genbank (30-NOV-2004)Dynamique du Genome et RL Evolution, Institut Jacques Monod - CNRS - Universites Paris 6 RL Paris 7, 2 place Jussieu, Paris 75252, France. XX DR EMBL/GenBank/DDBJ; DQ301487; Positions 1 206. XX SQ Sequence 206 BP; 67 A; 38 C; 51 G; 50 T; 0 other; caaggtgtat ggatattagg aggtgaagtg cttcagagca aattgacatt tcacgacttg 60 acataacaat actggaggct ccggtattat aatcggggag ggctagaggg gggtatagaa 120 aattgtatgc gatttgacat aacaatactc aatgatggcg cccggaaaac gcgctaacaa 180 ttcacctcct taataaacac accttg 206 // ID Copia-7_AG-I repbase; DNA; ANG; 1896 BP. XX AC . XX DT 01-SEP-2010 (Rel. 
15.09, Created) DT 01-SEP-2010 (Rel. 15.09, Last updated, Version 1) XX DE Copia-7_AG-I. XX KW Copia; LTR Retrotransposon; Transposable Element; Copia-7_AG-I. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-1896 RA Fernandez Medina R.D., Struchiner C.J. and Ribeiro J.M.C.; RT "Transposable elements from Anopheles gambiae."; RL Repbase Reports 10(9), 1458-1458 (2010). XX DR [1] (Consensus) XX CC Consensus sequence out of four sequences with a mean p-distance CC of 0,0083 and sd=0,0014, considering the whole alignment. The CC LTRs are 168 nucleotides long and are identical in all the CC sequences suggesting a recent transposition event. CC The consensus sequence has an ORF of 615 aa that contains no CC conserved regions. XX FH Key Location/Qualifiers FT CDS 408..1847 FT /product="Copia-7_AG-I_1p" FT /translation="KMQVRKCKRTIWYRNFFLTLPDSYDPLVTALENIQDK FT DLSLEMVKHRLLGEESKRVDRVDYYVEENSTAFIGGSNQMKKFKGRCYRCG FT KLGHMQKDCRSKMENRNANSVVAGKTVSFMVKPQCEVEEKEQAIHSFVIDS FT GCSDHFINNIKYLQNIRKLKEPFIVDVAKDGVTLVGEYEGTVRGKTKEGVI FT LEMKNVIYLPELRSNLVSVKKMTHAGIDVLFTREDGFEKALMKLEKDVIGV FT AHMKQNLYELELQLETRRSANMCMTAVSTLKPRNEYFRFQATGMVMQNCMV FT DGRDRKSTHFCDACREPYKKNQELAFISLDNDTCGSENRDVKFPFHEQRDN FT DRFQQEGEMVGKDNAANGLQVHQKLVDIEENNEEYFEILSDKGIREKENGE FT REEPITSKLSANVSHYNCEIRKKAIKGESLTSRPDFKVAFLHGKLPEAPPG FT LLRQFLGLLQDVGLKVMTLIRTLPKLHTNKALEAG" XX SQ Sequence 1896 BP; 626 A; 314 C; 484 G; 472 T; 0 other; ggttatgggc tcctctgtaa aggaaggtgc tggcattccc cagctgaatg gaacgaacta 60 cggaaaatgg cgttttcgag ttcgtttgtt cttggaagct tcggaagttt gggaggcgct 120 cgaagaagac gtgcctgagg cggtcggaga accgcggaac aagtttttgc gaatggaccg 180 gaaggcaaaa tccttgctgg ttggatttgt tggagacgat tgtctagcca tagtcgaaga 240 aaagggaaca gccaaggaaa tgtggaaagc ccttgaagac acttttgcga agaagtcggg 300 ggcaagccag acgatcttgc gcaaacgact ggccacgtta cgcatgaagg aagggtgttc 360 tatgcgaagt cattttgccg 
aattcgacga gctagtgcga cagttgaaaa atgcaggtgc 420 gaaaatgcaa gagaacgatt tggtatcgca actttttttt aacgttgccg gatagctacg 480 atcctcttgt gacagccctt gaaaatatcc aagataaaga cctctcatta gaaatggtga 540 aacatcggtt actaggagaa gagtctaagc gagtcgacag ggtggactac tatgtcgaag 600 aaaattcaac cgcttttatc ggtggaagta atcagatgaa gaaattcaaa ggaagatgtt 660 accggtgtgg aaaattaggc cacatgcaaa aggattgccg atccaagatg gaaaacagaa 720 atgctaactc tgttgtggca ggcaaaactg tgagtttcat ggtgaaacct cagtgtgaag 780 tggaagaaaa agagcaagca atacattcat ttgtaatcga ttccggatgc agtgaccact 840 tcatcaacaa catcaaatac cttcaaaaca ttcgaaaatt gaaagaaccg tttatcgttg 900 atgttgccaa agacggtgta actttagtcg gagaatacga aggtaccgtg cgtggaaaaa 960 cgaaagaagg cgttatttta gaaatgaaga atgttattta tttacctgag ttgagaagta 1020 acttagtttc agtaaagaaa atgacgcatg ctggaatcga tgtgcttttc actcgtgaag 1080 atggctttga aaaagcccta atgaagctag agaaggatgt aattggcgtt gctcatatga 1140 aacaaaatct ttatgagcta gagttgcagt tagaaacgag acgatctgca aatatgtgta 1200 tgacagcggt gagtacacta aagcctcgta atgaatattt tcgatttcaa gctacaggca 1260 tggtgatgca aaactgtatg gtagacggtc gcgatcgaaa atcaacacat ttttgcgacg 1320 catgccgaga accatataaa aaaaatcaag aattggcctt tatctcattg gataacgaca 1380 cttgtggttc tgaaaatcga gatgtaaaat tcccttttca tgaacagcgg gacaatgacc 1440 gttttcagca agagggggaa atggtcggaa aagataatgc tgccaacggt ttgcaagtgc 1500 atcaaaagtt agtcgatatt gaagagaata atgaggaata ttttgaaata ttaagcgata 1560 aaggaatacg tgaaaaagaa aatggcgaaa gggaagaacc gataacttca aaactcagcg 1620 caaacgtaag tcattataat tgtgaaatta ggaagaaagc tatcaaaggt gaatcgttaa 1680 catcaaggcc tgattttaaa gtagcatttc ttcatggaaa actgcctgaa gctcctcctg 1740 gtttgttgag gcaattcttg ggcttgttac aggatgttgg attgaaagta atgacattga 1800 ttcgaaccct tccgaagcta cataccaata aggcattaga agctggttga ttgaaagtct 1860 tgaggaagaa acttgggatg aacgatcgag aggggg 1896 // ID GYPSY47-LTR_AG repbase; DNA; ANG; 345 BP. XX AC . XX DT 16-APR-2004 (Rel. 9.03, Created) DT 16-APR-2004 (Rel. 
9.03, Last updated, Version 1) XX DE GYPSY47-LTR_AG is an LTR of retrotransposon GYPSY47_AG - a DE consensus sequence. XX KW Gypsy; LTR Retrotransposon; Transposable Element; KW 4-bp TSD GYPSY47_AG; GYPSY lineage; GYPSY47-I_AG; GYPSY47-LTR_AG; KW Gypsy clade. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-345 RA Tubio M.J., Costas C.J. and Naveira F.H.; RT "GYPSY47_AG, a member of the Gypsy lineage of the Ty3/gypsy group RT of LTR retrotransposons in Anopheles gambiae."; RL Repbase Reports 4(3), 87-87 (2004). XX DR [1] (Consensus) XX CC GYPSY47-LTR is a long terminal repeat of GYPSY47_AG (its internal CC portion is deposited as GYPSY47-I_AG). XX SQ Sequence 345 BP; 150 A; 76 C; 45 G; 74 T; 0 other; agttaacacg acatcacagc accgaaacca aaacaccagc accgaaacaa tgcaacacca 60 ttcatagcga agtatgctat gaccaacaaa gtaatcatcg gaaccagaac cagcatcaac 120 atcgtatcaa catctggcca aatcaaacct ctaataaata accgtagcat aaacatagta 180 accataacat aaacaatgaa aagtagggaa ccgaacaata tataacatga aatcaggaaa 240 tattagccag ttctaaccaa catcttgaac taagaatatg tgcgcgacga attgaaaaaa 300 taaaattatt tttttaatac aattctaacg catgtgttct taact 345 // ID GYPSY14-LTR_AG repbase; DNA; ANG; 404 BP. XX AC . XX DT 30-SEP-2003 (Rel. 8.08, Created) DT 30-SEP-2003 (Rel. 8.08, Last updated, Version 1) XX DE Long terminal repeat from GYPSY14 LTR retrotransposon - a DE consensus. XX KW Gypsy; LTR Retrotransposon; Transposable Element; KW Gypsy superfamily; GYPSY14; GYPSY14-I_AG; GYPSY14-LTR_AG; KW Long terminal repeat; RETRO19_AG_LTR; retrotransposon. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-404 RA Jurka J. 
and Drazkiewicz A.; RT "RETRO19_AG_LTR: LTR retrotransposon from Anopheles gambiae."; RL Repbase Reports 2(12), 4-4 (2002). XX DR [1] (Consensus) XX CC Related to MDG1, TABOR, DM412, STALKER2, and BLOOD CC retrotransposons CC from Drosophila melanogaster. 4 bp target site duplication. CC The original name RETRO19_AG_LTR has been changed to CC GYPSY14-LTR_AG. XX SQ Sequence 404 BP; 166 A; 40 C; 88 G; 110 T; 0 other; tgtggcgcat acacataggc gtaatgaata taattatagt aagctatata gcagaaatta 60 tagtaagcta tctttatagg aatagttgtg attggtagaa gcgtgttagt ataggttaga 120 ataggatagc taatacgaaa agataaggat agaataggta gcataggaaa ttaggcataa 180 gacggaatta aggatacata ggaagtaaat gaattaagct aggattagat agggaaattt 240 tagtataaat agcgggatag gatttaggat agtcagataa aagttagact tcataagggg 300 aaactctgtc caattataga gaaaattata gagagtataa ataaaagtta cagttccaac 360 aaaacacccc tttttcaatt gtacactaaa agtgatatac caca 404 // ID GYPSY60-LTR_AG repbase; DNA; ANG; 220 BP. XX AC . XX DT 09-JUL-2004 (Rel. 9.06, Created) DT 09-JUL-2004 (Rel. 9.06, Last updated, Version 1) XX DE GYPSY60-LTR_AG is an LTR of retrotransposon GYPSY60_AG - a DE consensus sequence. XX KW Gypsy; LTR Retrotransposon; Transposable Element; KW 5-bp TSD GYPSY60_AG; GYPSY60-I_AG; GYPSY60-LTR_AG; Gypsy clade; KW mag lineage. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-220 RA Tubio M.J., Costas C.J. and Naveira F.H.; RT "GYPSY60_AG, a member of the mag lineage of the Ty3/gypsy group RT of LTR retrotransposons in Anopheles gambiae."; RL Repbase Reports 4(6), 160-160 (2004). XX DR [1] (Consensus) XX CC GYPSY60-LTR is a long terminal repeat of GYPSY60_AG (its CC internal portion is deposited as GYPSY60-I_AG). 
XX SQ Sequence 220 BP; 58 A; 44 C; 62 G; 56 T; 0 other; tgttgtggtc acgaagtgga aggctccctt tattcggcgt taagaacgta tgcagccact 60 gagtgactga tgagcggtta gcagacgaga gagcgccaga cgctcgtggg acaccgtcgg 120 gaacacaagg gctgatcgtg taataaagta ttgtgtattg tttattacgt gttcggtgta 180 aatataaact gtgtatactc tcggccatcc gaacacaaca 220 // ID BEL8-LTR_AG repbase; DNA; ANG; 228 BP. XX AC . XX DT 03-APR-2003 (Rel. 8.03, Created) DT 03-APR-2003 (Rel. 8.03, Last updated, Version 1) XX DE BEL8-LTR_AG is a long terminal repeat of the BEL8_AG LTR DE retrotransposon - a consensus sequence. XX KW BEL; LTR Retrotransposon; Transposable Element; 5-bp TSD; KW BEL8-I_AG; BEL8-LTR_AG; BEL8_AG; Bel clade; KW reverse transcriptase. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-228 RA Kapitonov V.V., Pavlicek A. and Jurka J.; RT "BEL8_AG, a nonautonomous family of Bel/Pao-like LTR RT retrotransposons from African malaria mosquito."; RL Repbase Reports 3(3), 46-46 (2003). XX DR [1] (Consensus) XX CC BEL8-LTR_AG flank an internal portion of BEL8_AG (deposited as CC BEL8-I_AG). XX SQ Sequence 228 BP; 76 A; 53 C; 43 G; 56 T; 0 other; tgttcgagca gctacgcgaa tcgtgcagat ttgaactgcg gatcggatca catccgcgcc 60 cgttgatgta cacatctcta tcgatcataa attaataaat agagggcgaa aaggcttttt 120 ccctctcttc atcaccagca atttcttaca acggcttcgg taattattag cagcaaaaac 180 caaagaaaga aaaggaaact ttatccgaaa actattcgcg acgaaaca 228 // ID GYPSY29-I_AG repbase; DNA; ANG; 4631 BP. XX AC . XX DT 16-APR-2004 (Rel. 9.03, Created) DT 16-APR-2004 (Rel. 9.03, Last updated, Version 1) XX DE GYPSY29-I_AG is an internal portion of retrotransposon GYPSY29_AG DE - a consensus sequence. XX KW Gypsy; LTR Retrotransposon; Transposable Element; 4-bp TSD gag; KW AP protease; GYPSY29-I_AG; GYPSY29-LTR_AG; Gypsy clade; KW MDG3 lineage; RNase-H; reverse transcriptase; KW integrase GYPSY29_AG. 
XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-4631 RA Tubio M.J., Costas C.J. and Naveira F.H.; RT "GYPSY29_AG, a member of the MDG3 lineage of the Ty3/gypsy group RT of LTR retrotransposons in Anopheles gambiae."; RL Repbase Reports 4(3), 50-50 (2004). XX DR [1] (Consensus) XX CC GYPSY29_AG is a family of Gypsy-like LTR retrotransposons CC that, according to the amino acid sequence of its Reverse CC Transcriptase, is CC phylogenetically grouped with representatives of the MDG3 CC lineage of other organisms. CC GYPSY30_AG, GYPSY31_AG, GYPSY32_AG, GYPSY33_AG, GYPSY34_AG, CC GYPSY35_AG, CC GYPSY36_AG, GYPSY37_AG and GYPSY38_AG are other CC members of this same lineage in Anopheles gambiae. CC The GYPSY29-I_AG consensus was reconstructed after multiple CC alignment of 3 copies. CC The consensus encodes the 1521-aa GYPSY29_AGp gag-pol like CC polyprotein CC (pos. 28-4590). CC The sequence of the LTRs flanking GYPSY29-I_AG is deposited as CC GYPSY29-LTR_AG. 
XX FH Key Location/Qualifiers FT CDS 28..4590 FT /product="GYPSY29_AGp" FT /translation="MSVDDQRVTKDEMLCALQSAEITVPCTATLMQIRALY FT KEAYPSSQQNGDSNDTMCNAAIAESKEESVTQIMKTDVIKENDEAAAEIVL FT LKQKCEILELRNKLAALESQSREYSNTARLLHPEEVKQIIPMFGEGASFLQ FT WIKTVKHSAEVYGWTDQMTLMYASSQLTGAAKEWYSGFRHSVTSFKEFAEG FT MGKAFPDTHNEAAIHKKLLHTFKRNDESYAAYIFRVHALGTTGNVSNAAII FT TYIIRGLSRDPMYDNLVAKEYRDVYDLIDHVNRCEMHYQMREPPSSPTRPQ FT PSSRFTSMLKTNTKPANGREEMVRCYNCSSFGHFSNQCRKPRQSAQLCFLC FT GSPDHKKQDCPNVAQRMPAAIFAPCDQQSFQHQCPASRLPQSGGNDYSHLN FT LSMAAAAENNYGDDNGNQVTGSGARIDPIQEVSVALLNDNELSATQSRVFC FT LFDSGSPKSFISEQLVPNIKHTPQFSGFCGLGNQQLTSLGHVNIRIKFRNI FT NVSHSFMILPKQQTAWPLIAGRDLLKKMNIHLHYLCYRYSQDNLMNLLKEN FT KSTLATHVKARLNSLGIFKTNSKDEADIWTQSIFQSDQKRYDSVDEHNQTN FT DSGTEAESILRTFSELCAIDISKEANTLDIGNELSEDQMCLIASCVENNYL FT QPLNETILESQHSMKISVTNDTPIFCKPRRLSFAERNQVRDIVKNLLEKQI FT IRPSNSPYASAIVLVRKKNGEVRMCVDYRPLNKITIRDNYPIPLIDTCLEH FT LSGKRYFTLLDLKSGFHQIKMHEDSISYTAFVTPDGQYEYLKMPFGLKNAP FT SGFQRFINSVLREFIDEAKLVVYLDDIIIASKTFQDHLETLGAVLKTLRKN FT GLELRIDKCKFGCSNLDYLGYYVNSKGIQPSNYHIKAIQNYPVPKTSKEVQ FT RCLGLFSYFRRFVPSFSNIAKPLSNLLKEKVSFQFDEACMNAFNELKTKLI FT NAPVLAIYDPSRETELHCDASTVGFGSVLLQKQDDGKFHPVAYFSKTASSS FT ESNLHSYELETLSVIYALKRFHAYVHGIPIKIVTDCNSLVETLKNRNCSAK FT IARWSLFLENYEYTMQYRPGTAMGHADALSRSKMAGAVDELDLDIQLQIAQ FT GRDPTLVHLRTELETKPIPGYTLLDGVIYRQSPESKLQLIVPKELIKNVIR FT STHESIGHLGVDKCCSQIAKHYWFSGMKNQVQNFIQNCLKCILFSAPPRKN FT KRNLYNIPKSSVPFDTLHIDHFGPLPSIKSKKKYILVVIDSFTKFTKLYPT FT TTTNTREVCSVLGQYFNYYSRPKRIVSDRATCFSSQEFKTFLSDRNIIHVQ FT NAVCSPQANGQVERVNRVIKPMLSKITDSVDHADWCSKLSEVEYALNNTTH FT SSTYFAPSVLLFGVEQRGTIINEFKEYLDNKNEPSRNLETIRSEASENIKR FT SQEINLIQFGKRHNPAVEFAEGDLVVIRNVDNSANSNKKFIAKFKGPYIVH FT KQLPNDRYVIRDIDGFQHTQIPYDGILESDKLRHWIAPDADLAPELCDATL FT HDENV" XX SQ Sequence 4631 BP; 1483 A; 958 C; 936 G; 1254 T; 0 other; tctcagaagt gggatagtga gcccaaaatg tccgtagatg accagagagt gactaaggat 60 gagatgctgt gcgcactcca gtcggccgaa ataaccgtgc catgtacggc aacactgatg 120 cagatccgtg ctctctataa ggaagcgtac cctagctcgc aacaaaatgg cgattccaac 
180 gacaccatgt gcaacgcagc cattgccgaa agtaaagaag agagcgtcac ccagatcatg 240 aaaactgatg tgatcaaaga aaacgacgaa gctgcagcag aaattgtgct cctaaagcaa 300 aaatgtgaga tacttgaact gcggaacaaa ttagcagcgc ttgagagcca aagtcgtgaa 360 tactccaata cagctagatt gcttcatccg gaagaggtta agcagattat tccgatgttt 420 ggagagggtg ccagtttctt gcagtggata aaaacggtaa aacacagtgc tgaagtttat 480 ggttggaccg accaaatgac attgatgtac gctagcagtc agctaactgg cgctgcaaag 540 gagtggtaca gtggtttccg gcattctgta acatcgttta aagaatttgc cgaaggaatg 600 ggaaaagctt tcccagacac gcataatgaa gctgctattc acaagaagct gttgcataca 660 ttcaaaagga atgatgagtc ctacgccgcc tacatctttc gtgtacatgc ccttggaaca 720 actgggaacg tgagtaacgc agcgattata acatacatca tccgaggact atcgcgcgat 780 cccatgtatg acaacttagt ggcgaaagaa tatcgtgatg tgtatgatct gattgatcat 840 gtaaaccgat gcgagatgca ttaccaaatg cgtgaaccac cgtcatcccc cactcgtcca 900 caaccttcaa gccgttttac atcgatgctg aaaaccaaca ccaaaccagc caatgggcga 960 gaagaaatgg tgcgatgcta caattgttcg agtttcggcc atttttccaa ccagtgccga 1020 aaaccacgtc agtctgccca actatgtttc ttgtgtggaa gccctgatca caagaaacaa 1080 gactgcccga acgttgccca gaggatgcct gccgctatat ttgcaccgtg cgatcagcag 1140 agtttccaac atcagtgtcc ggcatcacga ttgcctcaat ccggtggcaa tgattactca 1200 catctgaacc tgtcgatggc tgctgcagca gaaaacaatt acggagatga taatggaaat 1260 caagtgactg gttcaggagc taggatcgat ccaattcagg aggtaagtgt cgcattgctg 1320 aatgataatg aattaagcgc aacgcaatca cgtgtgtttt gtttgtttga ttctggcagc 1380 cctaaaagtt tcatcagtga acaacttgtg ccaaacatca aacatactcc tcaattttcg 1440 ggattctgtg gattaggcaa ccagcagctt acatctttgg gccatgtgaa cattagaata 1500 aaatttcgaa atatcaacgt ttcccattcc tttatgatac tgccaaaaca acaaacagca 1560 tggcccttga ttgcaggaag ggacttatta aaaaagatga acatccattt acattacctt 1620 tgttatcgtt attctcaaga caatctgatg aacttactaa aagaaaataa atctactctt 1680 gcaacacacg tgaaagcgcg ccttaattca ttggggattt ttaaaactaa ttcgaaggat 1740 gaggcagaca tttggacgca atctattttc caatcagacc agaaaagata cgatagtgtt 1800 gatgaacata atcagaccaa cgatagtggt acggaggctg agagtatttt gaggactttt 1860 tcagagttat 
gtgctattga tattagtaag gaagcgaaca cattggatat aggaaacgaa 1920 ttgagtgaag accagatgtg tctgatagct tcatgtgtgg aaaataacta tcttcagccc 1980 ttaaatgaaa ctattcttga atcacaacat tcaatgaaaa taagtgttac aaatgatact 2040 ccaatattct gtaaacctag gcggttgtct tttgccgaac gcaatcaggt tcgtgatata 2100 gttaaaaatt tattagaaaa acaaattatt cgacccagca attcacctta cgcatctgct 2160 attgtgttag tacggaagaa aaatggtgag gtacgaatgt gtgttgacta ccgtccgtta 2220 aataagataa caattcgtga taactatcca atacctttga ttgacacttg cctcgaacac 2280 ctaagcggta aacggtattt cactctctta gatctcaaaa gcggctttca tcaaattaag 2340 atgcacgaag attctattag ttacactgcc tttgtaacac ctgacggaca gtatgaatat 2400 ctgaaaatgc catttgggct aaaaaacgct ccatcaggat ttcaaagatt cattaactca 2460 gtactgcgag aattcatcga cgaagccaaa ctagtcgtct atcttgacga cataatcata 2520 gcgtcaaaga cgtttcaaga tcatttagaa actctaggtg ctgttttaaa aactcttagg 2580 aagaatggat tggaactgcg tattgacaaa tgcaaatttg ggtgcagtaa cttggattac 2640 cttggttact atgtaaactc aaagggaatt caaccaagca actaccacat taaggctatc 2700 caaaattacc cagtgcctaa aacctcaaag gaagttcagc gttgtcttgg tttgttttcc 2760 tattttagac gattcgtccc gtctttctcg aacattgcaa aacctttgag caatctactt 2820 aaagaaaaag tgtctttcca atttgatgaa gcatgtatga acgcatttaa tgaattgaaa 2880 accaaattaa ttaatgctcc tgtgttggca atttatgatc cttctcgaga aacagaacta 2940 cattgcgatg ccagcacagt tggatttggc tcggttttac ttcaaaagca agatgatggg 3000 aaattccatc cagtggcata tttttcgaaa actgcttctt ctagtgagtc caatcttcac 3060 agctacgagt tagaaacttt gtcagtgatt tacgccttga agcgttttca tgcatatgtg 3120 catggtatcc ctataaagat agtaaccgat tgtaattccc tagtggaaac tttgaagaac 3180 aggaactgct ccgccaaaat tgcaaggtgg tccctgttct tagaaaacta tgaatacact 3240 atgcagtatc gccctggaac agcaatgggt cacgctgatg cacttagtcg ctccaaaatg 3300 gcaggtgctg ttgacgagtt ggatctcgac attcagttgc agatagctca gggccgagat 3360 cctacattag ttcatctaag aaccgaacta gaaacaaaac caattccagg gtacacacta 3420 ctggatggtg tgatatatcg tcaatcccct gaaagcaaat tgcaattgat cgttccaaaa 3480 gaactgataa aaaatgtgat tagaagcact catgaaagca taggccattt aggtgttgac 3540 aagtgctgtt cacaaatcgc 
caaacactat tggttttcag gtatgaaaaa tcaggtacaa 3600 aatttcattc agaattgcct taaatgtata cttttctctg cccccccaag aaaaaacaag 3660 cgcaatttgt acaacatccc caaaagttca gtacctttcg atacccttca catcgaccat 3720 tttggcccat taccatccat aaaatctaaa aagaaataca ttttagtggt gatcgattct 3780 tttactaaat tcacaaaact ttaccccacc actacgacca acacaaggga ggtttgttct 3840 gtgttgggac aatattttaa ctattatagc cgtcccaaac gaatagtgag tgatcgtgca 3900 acttgttttt catctcagga gtttaaaacg tttctcagtg atcgtaacat aatacatgtt 3960 caaaatgccg tttgttcccc tcaagccaac ggtcaggttg agagagttaa cagagtaatc 4020 aaacccatgt taagcaaaat tactgattcc gtggatcatg cagattggtg ttccaaatta 4080 tctgaagtcg aatatgcgct gaacaacaca acacactcgt caacgtattt tgctccttcg 4140 gttttgttat tcggtgtcga acaacgtggt accattataa acgaattcaa agagtatcta 4200 gataacaaaa acgaaccttc tagaaattta gaaactattc gttctgaagc ttctgaaaat 4260 ataaaaagat cacaggaaat caacctgatt cagtttggca aaagacacaa cccagcggta 4320 gaatttgcgg aaggggatct tgttgttatt cgaaacgtcg ataattcagc aaattctaac 4380 aagaaattta ttgccaagtt caaaggtcct tacatagttc acaaacaact tccgaacgac 4440 cgttatgtaa ttcgagatat cgatggtttc caacatactc aaatcccata tgatggtata 4500 cttgagtctg ataagcttag acattggatt gcgccagatg ctgatcttgc cccagaatta 4560 tgtgatgcga cactacatga tgagaatgtt taacgaattg aggacaattt atttgtcagg 4620 ataggccgag t 4631 // ID GYPSY33-LTR_AG repbase; DNA; ANG; 222 BP. XX AC . XX DT 16-APR-2004 (Rel. 9.03, Created) DT 16-APR-2004 (Rel. 9.03, Last updated, Version 1) XX DE GYPSY33-LTR_AG is an LTR of retrotransposon GYPSY33_AG - a DE consensus sequence. XX KW Gypsy; LTR Retrotransposon; Transposable Element; KW 4-bp TSD GYPSY33_AG; GYPSY33-I_AG; GYPSY33-LTR_AG; Gypsy clade; KW MDG3 lineage. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-222 RA Tubio M.J., Costas C.J. 
and Naveira F.H.; RT "GYPSY33_AG, a member of the MDG3 lineage of the Ty3/gypsy group RT of LTR retrotransposons in Anopheles gambiae."; RL Repbase Reports 4(3), 59-59 (2004). XX DR [1] (Consensus) XX CC GYPSY33-LTR is a long terminal repeat of GYPSY33_AG (its internal CC portion is deposited as GYPSY33-I_AG). XX SQ Sequence 222 BP; 73 A; 42 C; 45 G; 62 T; 0 other; tgtaggcgaa tgtggatcat aaacgtcaac ctcatacatc aactgtcaac cgaataatgt 60 ccaaaccttg tacaaatgta aaatcttggc aacccggccg agctgtcaga aaaagagaat 120 aaaaagggga attcacgttg tgtacgcagt tggaaaagca cgtgtaattc tgtgtcttat 180 gtccttcagt ccttaagaat attttatatg acgttgttta ca 222 // ID T1 repbase; DNA; ANG; 4634 BP. XX AC M93689; XX DT 03-DEC-2002 (Rel. 7.11, Created) DT 20-MAY-2005 (Rel. 10.06, Last updated, Version 2) XX DE T1 is a non-LTR retrotransposon. XX KW CR1; Non-LTR Retrotransposon; Transposable Element; AGT1; KW CR1 clade; ORF1; ORF2; T1; T1_AG; reverse transcriptase. XX NM T1. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RA Besansky J.N., Bedell A.J. and Mukabayire O.; RT "Evolution of the T1 retroposon family in the Anopheles gambiae RT complex."; RL Mol. Biol. Evol 7(3), 229-246 (1990). XX RN [2] RP 1-4634 RA Besansky J.N., Bedell A.J. and Mukabayire O.; RT "A retrotransposable element from the mosquito Anopheles RT gambiae."; RL Mol. Cell. Biol 10(3), 863-871 (1990). XX DR Genbank; M93689; Positions 1 4634. XX CC T1 is a CR1-like non-LTR retrotransposon. 
XX FH Key Location/Qualifiers FT CDS 1487..4414 FT /product="T1-ORF2p" FT /note="reverse transcriptase" FT /translation="LTINRPFVIYYQNVRGLRTKYNELRLSANESGFEMLA FT LTETWLNESIPSNMVLDSDSYNIYRCDRSRLNNERSRGGGVLLACSSRYPS FT VALNMNQPTLEALCIRVSFPKFRLYVGIVYVPPYLSSDRNYFESLSAFIXD FT AYMHMKPNDHLILLGDFNQPALGWSPAAAVRSDSSLPMRHYVPHISLNSSS FT SCFLDVLNLHELYQLNGVHNHSNHYLDLVLSNSAAAACSSVYPASSLLLPQ FT DAHHPALEIALPSSLFRASRVRNELPSAPNSLSVRYNFRLTDYRKLNSILS FT RADWSFFYQCTSVDEAVQSFNALLTSALLSCTPIFRSPPNPPWSNRTLRNL FT KKDRMKYLRRYRLNRSAFNFRLFKYAASAHRLYNRARFEAYSSRLQSRFRS FT DPASFWQFVRIRRGCNTLPNEMVLDSRTASTPVEICELFSAHFSQMFEPPV FT SDPNLIEGGLLYTPENLINLSDISVSSETVVQVLFGLKRSFTPGPDGIPAS FT VLINCKDVLAPHLAKIFNLSLSLGVFPALWKSCWLFPVHKKGCRSIVSNYR FT GITQTCATAKTFELCIFPTILHSCSSAISPKQHGFMPGRSTSTNLMSFVTN FT IFRSFEAGTQLDAIYTDFHAAFDSLPHSLLLAKLSKLGFGDGIISWLSSYL FT SNRSCRVKTGSYLSEEFFCTSGVPQGCVLSPLLFSLFINDVCNVLPPDGHL FT LYADDIKIFLPVSSSSDCMSLQHYLNAFVHWCSSNLLRLCPDKCSVISFSH FT SLSPISFNYTLSNSSLSRVLSIRDLGIILDSRLNFKLQLDEVLLKANRTLG FT FILRFTSIFRDQSFLRNLYYALVRPLLEYASIIWNPPTIDGCSRIESIQRL FT FTRVAFRRLFGAASLPPYETRLQLFNLHSLSFRRQVSQACFIGGLLLSDTD FT APDLLSSISLYVPSRSLRPRDPLSIETRHTLYTFNDPILSCFRLFNHFYYL FT FDFDSSLNSFRNRIFSSNSL*" FT CDS 169..1497 FT /product="T1-ORF1p" FT /note="DNA/RNA-binding protein" FT /translation="ILAPSLLLFRQFCRDIVWLRSCSCHSSVCAVSFVMQC FT STCNAPTDSANSVSCAGVCGSKHHTHCTGLSRDSTRELGRNNQLLWLCKNC FT NEFRNGTNSLLTSEIAALLELVKAEILTTIDSSLSSLRSAIKSDLLAEILA FT LADKLTPVLAKPSVSQPSRTHTSTNASSLNATNTRTTKTASTRRTFTNSME FT LTADIQQAANDTNTVEASDSCNHYTHRTKVTSDISAGPCRTNTKSSSDPVL FT NHDTTNTGIAEKVWLYFTNIKSHVSADDMRVWLKAVLPTDNIDVYRLTKKG FT ANLDLMSFISFKVSIPKSLKDLALQSTIWPVSLTVREFVDRGLPKQRIHER FT ARFDPSALTSHRSSSANCSSAAPKSTAHPDHFLDHRSPSPQRGNQSLSQMT FT EILEAIQPEFPPTPPQLSPGVGLQSQNNLSNTNRSPQISPFAKRIAHN*" XX SQ Sequence 4634 BP; 1059 A; 1101 C; 880 G; 1593 T; 1 other; agagagagat cgactgtcaa accgagtggt tgtgccttcc ttgtgctcct gatttttgat 60 gctgttttgc cgtgttactg cctgattttc aacatttgca acattggtgc tgctgagttg 120 cttgtactgg ctagtgcgtg ccttcgattg ttatcaagtt 
gcacttgaat tcttgcacct 180 agtctcctgt tatttcgtca gttttgccga gatattgttt ggcttcgttc ctgctcctgt 240 cactcgtcag tttgtgctgt gtcattcgtt atgcagtgct caacctgcaa tgcacccacc 300 gatagtgcaa attcggtgtc ctgcgccggt gtgtgtggct ccaagcatca tacccattgc 360 acgggtttgt cccgtgattc tactcgagag cttgggcgga ataatcaatt gttgtggttg 420 tgcaaaaatt gtaacgagtt tcgcaatggc acaaactcac ttctcacaag tgagatagca 480 gccctactcg agttggtgaa agcagaaatt ctcaccacga ttgactcatc tctctcttct 540 cttagatcgg ctatcaagag cgatttgctt gctgagatcc tcgctctcgc tgataagcta 600 acacccgtat tagctaagcc gtctgtttct cagccatcgc gaacgcacac gtccactaat 660 gcatcgtcac tcaacgccac taataccaga acgactaaaa cagcatccac tcgccgtaca 720 tttaccaact caatggagct cactgcagat atccaacaag cagcgaacga taccaacact 780 gtggaagctt ctgatagctg caaccactac actcatcgta ctaaggtgac tagtgatatt 840 agtgctgggc catgccgaac aaatacaaaa tcatcttctg atcctgtttt gaaccatgat 900 accacgaaca ctggcatagc agaaaaagta tggttatact tcacgaacat caaatcgcat 960 gtctccgctg atgatatgcg tgtgtggctt aaagctgtgc tgccaaccga caacatagat 1020 gtttaccgtc tcacgaaaaa gggtgcaaac cttgatttga tgtcctttat atcgtttaaa 1080 gtgagtattc ctaaatcact taaggatctg gcgcttcaat ctactatttg gccagtttca 1140 cttactgttc gtgagtttgt tgatcgtggc ctaccaaagc aacgtataca tgaaagggcc 1200 cgatttgacc cttctgcgct tacttcgcat cgttcaagca gtgcaaattg ctcttcagct 1260 gcgccaaaaa gcaccgctca tccggatcat tttttggatc atcgatcgcc atccccacag 1320 cgcgggaatc aatcactatc ccagatgacc gagatcctag aggctatcca accggagttt 1380 cctcccacac cccctcagtt atcaccgggg gtggggcttc aatcacagaa caatctcagc 1440 aacacgaatc gctcaccaca gatcagcccg tttgccaaac ggatagctca caattaatag 1500 acccttcgtg atctattacc aaaatgttcg aggccttcgc accaaatata atgaattgcg 1560 cctttctgcg aatgaatcag ggtttgaaat gcttgccctt actgaaacct ggttaaatga 1620 atcgattcca tccaatatgg tcctggatag tgattcttac aatatatacc gttgcgatcg 1680 cagcaggtta aacaatgaac gatcgcgtgg gggtggtgtg ctgcttgcat gttccagtcg 1740 ttatccgtct gtggcactta acatgaatca acctacgctt gaagctttat gtattcgtgt 1800 ttcttttcct aagtttcgtc tttatgtggg gattgtttat gtgccaccgt atttgagcag 
1860 cgaccgcaac tatttcgaat ccctttctgc tttcatcagn gatgcataca tgcatatgaa 1920 accgaatgat catcttatcc ttcttggcga cttcaatcaa ccggcgttag ggtggtcgcc 1980 tgcagccgca gtaaggtcag attcatcttt acctatgaga cattatgtgc cacatatctc 2040 tttgaattca tccagttcct gctttttgga tgtgttaaat ttgcatgaac tctatcagct 2100 gaacggggtg cataaccatt caaatcatta tctggacctg gtgctctcta actctgctgc 2160 tgctgcttgt tcttctgtgt atcctgcttc gtcactgctc ctgccccagg atgcccatca 2220 tcctgctctg gaaattgcgt taccgtcttc tttatttagg gctagtaggg ttaggaatga 2280 attgccttct gctcctaatt cattgagtgt tcgttataat tttcgtctta cagactatcg 2340 taaacttaat tctattctat ctcgtgccga ctggtctttt ttttatcaat gtacatcggt 2400 cgacgaggct gtccaatcgt ttaatgcttt gttaacctct gcactccttt catgtacacc 2460 tatttttcgt tcccctccta atcctccctg gtccaatcgt actcttcgca acctgaaaaa 2520 ggatagaatg aaatatctta ggaggtatcg tctgaaccga tctgctttca actttcgttt 2580 atttaagtac gctgcctctg cgcatcgact atacaacagg gctcgttttg aggcctattc 2640 gagtagactg caatcgcgtt tccgttctga tccagcatcc ttctggcaat ttgttaggat 2700 tcgaagaggg tgcaatacgt tacctaatga aatggtactt gattctcgaa ctgcctctac 2760 gcctgttgag atctgtgagc tattctctgc acatttttcc caaatgtttg agccaccggt 2820 tagtgaccct aaccttattg agggtgggct actctacacg ccagagaact taattaatct 2880 ctccgatatt tcggttagct ctgaaacagt tgtacaggtg ttatttgggt tgaaacgttc 2940 ttttactcct ggtccagatg gcattcctgc ctcagtttta ataaactgta aggacgtgct 3000 tgctccacac cttgctaaaa ttttcaacct ttcactttct ctcggggtct ttcctgctct 3060 ttggaaatcc tgttggcttt ttccggtaca caaaaaggga tgccgtagca ttgtctctaa 3120 ttatcgtggg ataactcaaa catgtgccac agccaaaact tttgagctat gtatctttcc 3180 aaccatactt catagttgta gttccgctat tagccctaaa cagcatgggt ttatgcctgg 3240 taggtctact tctactaatc tcatgtcttt tgttaccaat attttcagat cttttgaggc 3300 aggtacccaa cttgatgcaa tatacactga ctttcatgct gcatttgata gtttgcccca 3360 ctctttacta ttagctaaac tatctaaact tggttttggt gatggcatta ttagctggct 3420 gtcctcatac ttaagtaatc gatcttgcag ggttaaaacc gggtcgtact tatctgagga 3480 gtttttttgt acgtcaggtg tccctcaggg ttgtgtgcta agtccacttc tgttttcttt 3540 
gttcatcaat gatgtctgta atgttttacc tcctgatggt catctccttt atgcggatga 3600 tatcaaaatc tttttacctg tgtcctcttc ttctgattgt atgagtcttc agcattacct 3660 taatgcattt gttcattggt gttcatccaa cttacttcgc ttgtgccctg ataaatgttc 3720 tgttatttct ttctctcact ctctttctcc tatttcattt aactatactc tctctaactc 3780 gtctctctct cgtgttttgt ccatccgtga ccttggtatt atactcgaca gtcgtcttaa 3840 ctttaaactg cagcttgatg aggttctact aaaagctaat cgaactcttg ggtttatttt 3900 acgttttacc tctattttta gagatcaaag cttcttaaga aacctttatt atgctctggt 3960 aaggcctctt cttgaatatg ctagcatcat ctggaatcct cctactattg atggctgttc 4020 gagaattgaa agcattcagc gcctttttac cagggttgct tttcgtcgtt tgttcggtgc 4080 tgcctcacta cctccctatg aaacgcgatt gcagttattc aatcttcact ctttaagctt 4140 ccgccgccaa gtgtctcagg catgttttat tggtggctta ttactttctg atactgatgc 4200 tcctgattta ctctcgtcca tctcgttgta tgttccctct cgttcccttc gtcctcgtga 4260 tcctctgtca attgaaacac gtcatactct ttatactttc aatgatccta ttctatcctg 4320 tttcaggttg tttaaccact tttactatct ctttgatttc gactcctctc tcaactcttt 4380 ccgtaaccgt attttttctt ctaattctct ttaattattc ttctaagttt cattaagttt 4440 tgatagtctc tacgctctac ctatgttttt ttttctttaa tttttttgct aggtctagac 4500 tagtttagtt aggcttagtt tttcatgaat tattttgttt atttgttagg gtttatttag 4560 ttttaagtct gccttattta gccttgatgg cggatattgt attaataaat gaaatgaaat 4620 gaaatgaaat gaaa 4634 // ID HidaAg1 repbase; DNA; ANG; 5523 BP. XX AC AB090822; XX DT 09-JUL-2004 (Rel. 9.06, Created) DT 24-SEP-2010 (Rel. 15.1, Last updated, Version 2) XX DE Anopheles gambiae retrotransposon HidaAg1 DNA, complete sequence. XX KW R1; Non-LTR Retrotransposon; Transposable Element; KW reverse transcriptase; gag-like; HidaAg1. XX NM HidaAg1. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RA Kojima K.K. and Fujiwara H.; RT "Evolution of target specificity in R1 clade non-LTR RT retrotransposons."; RL Mol. Biol. Evol 20(3), 351-361 (2003). 
XX DR Genbank; AB090822; Positions 1 5523. XX FH Key Location/Qualifiers FT CDS 274..1695 FT /product="HidaAg1_1p" FT /translation="MSFSSQMQNALGLTRSMSADTSKTSVGKQLPASGIPT FT LRAPMAAGNAGSVVSKTVEDLQRSLAAEKEEKMKLTVLLQELQAQISIMMK FT KSRETKEEARRDKEKAIRHREEYRRDMALIREENTKLLAQLMAMKVVTTTA FT GSIPSASLSQRQQSSPQPSMASVVANGDTASTSHRVTLTQSQYRRAPISNF FT VESDGIWREVTRRKSRRSDNRRNERESTQYQQSVHQPQQSSRDQQHGAQHR FT PQTTRPNRQDIIEVTSFTGKMWYQVYKQIREAPEMEKMNEKLHIGRRTAKL FT NLRMKVARSIDSSEAMARIQGVLRDEGSVRVLTQMTEVIITNVDPLANDGD FT IRSAIGNVTGSASSIATIQLWQLSDGTQRARVCLPLAHAKLIIRLRLKVLY FT TMCAVKEAPHTPIEKLRCYRCLERGHVSRDCHSPVNHSNVCIRCGTSGHLA FT ATCEAEVRCASCAGPHRMGSAQCVQSNSQ" FT CDS 1695..5213 FT /product="HidaAg1_2p" FT /note="endonuclease and reverse transcriptase." FT /translation="MMITLQINISNCSTSQNLMLQAAKEQHADVILVSELY FT RHPPNNGNWAVDSSGRVAVVAAGSRPIQRMWGSAVPGLVAADIGGITFISC FT YASPRMTVAEFEEFLNAVEIEVSAHPNVVLGGDFNAWHEAWGSARTKRKGE FT ELLNTVEQLGLIVLNRGNTSTFTGRGIAGESVIDVTFASPSIVRYNTWEVL FT KSYWYSDHRYVRFSVDSSSVLGNGIQLHRHQHQLQPQQRRFHRQSPAHRRK FT PRWRRAGRRWKVGQFLPESFCFALEAVHFAEIARTPETLQVALSRACDVAM FT ERVSSSTPYYQTKPQVYWWTPERAQLRELCKEASDNAHSRVDPDERAAASE FT IHQERRSELKRAITTGKGQLFQQQIDEVNANVYGSGYQVVTSHLRGSRTPP FT ELDREVLERIVTDLFPDHNSFDWPMPTETSSEPYHIRPVTDLELERIADDM FT CSRKAPGLDGIPNIALKTAIKKHTAVFRSIYQGCFDRGEFPAHWKIQRLVL FT LQKPGKPPGESSSFRPLGMLNGLGKVLERLILNRLNEFLENGETSHLSPNQ FT YGFRRGKSTVQGILRVVQAGRTAKSFNRTNGRDMRCLMVVSLDVRNAFNTA FT SWKSIAMALRSKEVPASLQKLLQHWMTDRQLVFDTDDGPVTRNLSTGVPQG FT SILGPTLWNVMYDSVLDVQLPEGAEIIAFADDLLLLDPGITPEAASQRAEE FT AVSAVNLWMENHCLELAPAKTELVTISSKRQGNINVPVVINGVERRTTRSI FT RYLGVVIDNQLSWKSHVEYCTTKALRTAKALGCLMRNHSGPKCAKRRLLAS FT VVDSILRYAAPVWHEATKNQECRRMLQRVQKHCAIKVSSAFPTVRYQTAVV FT LASMIPICLLVQEDARCYQRQQEAGGALSAGILRAEERINTMQSWQEEWDA FT DASQADASRFVRWTHRVIPDIAAWHFRRHGEVNFHLSQVLSGHGFFRDDLC FT RMGFTPSPDCIRCTGVPETAEHAMFECPRFAEIRQKLLGEANTDAITPETL FT QFHLLQSQEKWSRIAEAAKQITSALQRDWNEERARLAVSSTLSPSHPVGPS FT DRNQVIAARRERRNARRRERRAREREAQQQNPSLVFSASSEATEGRESAHP FT 
ERREQVRPQRRIRQHMPQQKEVVELSDVTQYATAISEDVYSSNPIQGGLTA FT AESAVATETEISSR" XX SQ Sequence 5523 BP; 1524 A; 1292 C; 1438 G; 1269 T; 0 other; gaagaggttc ccgaaggggt gacgtgtcaa cagtccaggc agaggagctc cgtgtttttc 60 ttccagattt tgttgtaaaa agacgcaaga agataggcat tttataagat aacagcagtt 120 aatcgcgttt acacttgaac cagtgtcata cgacagatta gggtgctgct acactccgcg 180 aattttatgc acacacatcg aaatgacagc cacacagcag ctggaggcag ttaatgtttt 240 ctccacaaag gggagcaagt ttcggtggta aagatgagtt tttcatctca aatgcagaat 300 gcgctcggat tgactcggag catgtcagcg gatacatcaa aaacaagcgt ggggaagcag 360 ttgcctgctt ctggaatacc aacactacgc gctccaatgg ccgctgggaa cgctggctct 420 gtagtctcca agacagtgga agacttacag cggtcgctag cagcagagaa agaggagaag 480 atgaagctca ccgtgctttt acaggaacta caagctcaaa ttagcattat gatgaaaaaa 540 tctcgagaga cgaaggaaga ggcacgtcgt gataaagaaa aagccatcag acatcgtgag 600 gagtaccggc gcgacatggc actaatccgt gaggagaata cgaaactcct ggcccaatta 660 atggcaatga aggtagttac aacaactgca gggagcattc catccgcttc cttatcgcag 720 aggcagcaat cttcaccgca gccctctatg gcatcggtag tggcaaatgg ggacacagcg 780 tctacatcac accgggtgac gctgactcaa agccagtata ggagggcacc aatttctaat 840 ttcgtggaat cagatggtat ttggcgcgag gtgactcgac gcaagtcacg tcggtctgac 900 aatcgccgga acgagaggga atctacgcag tatcaacaaa gcgtgcatca acctcaacaa 960 agttcccggg accagcagca tggtgcgcaa catcgacctc agacgacacg tccgaaccgg 1020 caagacataa ttgaggtaac atcatttaca ggtaaaatgt ggtatcaggt gtacaaacaa 1080 atacgcgaag cacctgaaat ggagaagatg aacgaaaaat tgcatattgg tcgtcgtaca 1140 gccaagctca atctacgcat gaaagtggct cgtagcattg acagttctga ggcgatggca 1200 cgcatacaag gagttctacg agatgaaggc tcagtccgtg tactcaccca gatgactgag 1260 gtcatcataa cgaatgtaga tcccttagca aatgacggag acattcgaag cgccatcggg 1320 aacgtgacag ggtctgcttc gagcatcgcg actatacaac tatggcagct gtcagatggt 1380 actcagaggg cacgagtttg cttgcccctg gctcatgcaa aactaataat tagactgcgc 1440 ctaaaggtgt tatacaccat gtgtgcagtg aaagaagctc ctcacactcc catagagaaa 1500 ctgcgttgct atcggtgttt ggaaaggggc catgtgtcgc gagactgtca cagcccagtc 1560 aatcattcga acgtatgtat ccgctgcggt 
actagtggtc acttagcggc cacctgcgag 1620 gcagaagtac gttgcgcttc ttgtgctggc ccgcatcgta tgggcagtgc tcaatgtgtt 1680 cagtctaatt ctcaatgatg attacgctgc aaattaatat ttcgaactgt agtacttcgc 1740 agaatcttat gcttcaagca gcgaaggaac aacatgctga cgtgatactg gtatcggagc 1800 tataccgaca cccaccaaat aatggtaatt gggcggtgga ctcatcggga agagtagcgg 1860 tggtagctgc tggatctcga ccaatccagc ggatgtgggg cagcgctgta ccgggtctag 1920 ttgctgctga cattggtggt ataaccttca tcagctgcta tgcttctcct cgaatgactg 1980 ttgctgagtt tgaggaattc cttaacgcag ttgaaatcga agtaagtgcg caccctaacg 2040 tagtgctagg aggggatttc aacgcctggc acgaggcttg gggaagcgct aggacgaagc 2100 gaaaaggcga ggagctgctc aataccgtcg aacagctcgg gttaatagtg ctaaaccgtg 2160 gtaatacctc aactttcacc ggacgaggaa tcgcaggaga aagcgtgata gacgtgactt 2220 ttgcaagccc atcgattgtg cgctacaata catgggaagt gcttaaaagt tattggtata 2280 gcgatcatcg ttatgtccga ttttctgttg acagctcatc tgttttaggt aatggtatac 2340 aacttcatcg tcatcaacac caacttcaac cacagcagcg tcgttttcat cgacaaagtc 2400 ctgcacaccg acgaaaacct cgctggcgcc gcgctggccg acgatggaag gtggggcaat 2460 ttctaccgga atctttttgt ttcgctctcg aagcagtcca cttcgcggag attgcaagga 2520 ctcccgagac acttcaggtg gctctctcta gggcgtgtga tgtagcaatg gaacgcgtca 2580 gttcatcgac accctactat caaacaaaac ctcaggtata ttggtggacg ccagagagag 2640 cacaattacg tgagctctgc aaagaggcta gcgacaatgc ccattcacgt gtggaccctg 2700 atgagagagc agcggcatcg gaaattcatc aagagaggcg aagtgagctg aaacgtgcca 2760 taactacggg gaaggggcag ttatttcaac agcaaataga cgaggttaat gcgaacgtgt 2820 atggttcggg ttatcaggtc gtcacctccc acttgcgcgg tagtcgcact cctcccgaat 2880 tggatcgaga agtgttggag cgcatagtca ccgacctgtt tcctgaccac aattcattcg 2940 attggcccat gcctacagaa acctcatccg aaccttatca catccgacct gtgacggatt 3000 tagagctaga acgtatagct gacgatatgt gctcgagaaa ggcacctggg ctggacggca 3060 ttcctaatat cgctctcaag actgccatca agaaacatac agcagtgttt cgttctatct 3120 atcaaggctg tttcgatcgc ggcgaattcc cggctcattg gaaaattcag cgtttggtgc 3180 tactgcagaa acccgggaag ccaccgggag aatcttcatc cttcaggcct cttggaatgt 3240 tgaatgggct tggaaaggtg ctagagcgcc ttattctgaa 
ccgtctaaat gagtttcttg 3300 agaacggtga gacttcccat ctttccccaa accagtacgg cttccgtcgg gggaaatcga 3360 cggtgcaagg tatcctgaga gtagttcaag ctggaagaac cgcgaaatct tttaatagaa 3420 caaacggacg tgacatgcgc tgtttgatgg tggtttcctt ggatgtgcgg aatgcattta 3480 atacggcgag ctggaaatcg attgcaatgg cgctaagatc caaagaagtc cctgcttctc 3540 tacaaaagtt actgcagcat tggatgactg atcggcaact tgtttttgac accgatgatg 3600 gccccgtaac gcgaaatttg tctacaggcg ttccacaggg gtccatattg ggccccacat 3660 tgtggaatgt tatgtatgat agtgtgctgg acgtacagtt gcctgaagga gcggaaatca 3720 tagcctttgc cgatgatctg ttgttattag atccgggaat tactcctgaa gcggcttcgc 3780 agcgtgcaga agaagcggta tccgcggtta atctttggat ggaaaatcat tgcctggagc 3840 tggcgccggc caagaccgaa ttggttacaa tctcgagcaa aaggcagggc aatataaacg 3900 ttccggtggt catcaatgga gtggagcgga gaacgacccg aagtattcgc taccttgggg 3960 tcgtgatcga caaccaattg tcttggaagt cgcatgtaga gtattgcacg acgaaagcgc 4020 ttcgaacagc aaaggcgctt gggtgtctta tgcgcaatca cagtggcccc aagtgtgcaa 4080 aacgacgtct tctggcatcc gtagtagact ccatccttcg ttatgccgcg cccgtttggc 4140 atgaagctac caagaatcag gaatgccgga ggatgctgca aagggtgcaa aaacattgtg 4200 caataaaagt gtcaagcgca tttccgacgg tacgttatca aacagctgtt gtgctagcca 4260 gcatgatacc tatctgtctg ttggtacaag aagacgctcg atgttaccaa cgacaacaag 4320 aggcgggagg agccctttcg gcagggatac ttcgagcaga ggagcgcatc aacaccatgc 4380 aaagctggca ggaagaatgg gacgcggacg caagtcaagc tgatgccagc agattcgtgc 4440 gatggacaca tcgtgtcatc cctgacatcg cggcatggca tttccgaaga cacggagaag 4500 ttaacttcca tttgtctcag gttttgtccg gtcatggctt tttccgtgac gacttgtgtc 4560 ggatggggtt cacaccgtcg ccagattgca tcaggtgtac gggtgttcca gagactgccg 4620 agcacgcaat gttcgagtgc cccagatttg cggaaatcag acaaaagctg cttggtgaag 4680 ccaacactga cgcgattacg cccgaaacac tgcagttcca tctcctacaa agtcaagaaa 4740 aatggagcag gatcgctgaa gctgcgaagc agatcacctc cgcattgcag cgggactgga 4800 acgaagaacg agcgcgtctg gcagtttcca gcacattatc accctcgcat cctgttggac 4860 caagcgatag aaaccaggtc attgctgcaa gacgtgaacg ccgcaatgcc aggcgccgtg 4920 aacgaagggc ccgtgagagg gaggcacagc aacaaaaccc atctcttgtc 
ttttcagcgt 4980 cttcagaagc aacagaaggg cgcgaatcag cccacccaga acgcagagag caggtacgac 5040 cacagcgcag gatcaggcaa catatgccgc aacaaaaaga ggtagtcgag ctttcagacg 5100 tcacccaata tgcaacagct atcagtgagg atgtctactc ctccaaccct attcagggag 5160 gacttacggc ggcagagtca gcagtggcta cagaaaccga aatatcctct cgctaatcgg 5220 tgacacaaca acaaatcgga agctccttca gactcgtcag tgtcctgttg gaattgggtt 5280 ggatagattt gtatttttct ttatttttgt cttaaaaatc attttgcttt attttgtttc 5340 gaactatcaa atagttatta ttaatgtaat acacacacaa gagaactgct atgacggcat 5400 tacaaaatcg cacttactaa ccctcgcggg aattgcatgt tgcgaaaggc gagggagggt 5460 tattctactt tgtttgtata aaataatacg aaataaatct cccgacattt atattaaaaa 5520 aaa 5523 // ID GYPSY61-I_AG repbase; DNA; ANG; 4551 BP. XX AC . XX DT 09-JUL-2004 (Rel. 9.06, Created) DT 09-JUL-2004 (Rel. 9.06, Last updated, Version 1) XX DE GYPSY61-I_AG is an internal portion of retrotransposon GYPSY61_AG DE - a consensus sequence. XX KW Gypsy; LTR Retrotransposon; Transposable Element; 5-bp TSD gag; KW AP protease; GYPSY61-I_AG; GYPSY61-LTR_AG; Gypsy clade; RNase-H; KW integrase GYPSY61_AG; mag lineage; reverse transcriptase. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-4551 RA Tubio M.J., Costas C.J. and Naveira F.H.; RT "GYPSY61_AG, a member of the mag lineage of the Ty3/gypsy group RT of LTR retrotransposons in Anopheles gambiae."; RL Repbase Reports 4(6), 161-161 (2004). XX DR [1] (Consensus) XX CC GYPSY61_AG is a family of Gypsy-like LTR retrotransposons that, CC according to the amino acid sequence of its reverse CC transcriptase, is phylogenetically grouped with CC representatives of the mag lineage of other organisms. 
CC GYPSY19_AG, GYPSY20_AG, GYPSY21_AG, GYPSY22_AG, GYPSY23_AG, CC GYPSY24_AG, GYPSY25_AG, GYPSY26_AG, GYPSY27_AG, GYPSY28_AG, CC GYPSY55_AG, GYPSY56_AG, GYPSY57_AG, GYPSY58_AG, GYPSY59_AG, CC GYPSY60_AG, GYPSY62_AG, GYPSY63_AG, GYPSY64_AG, GYPSY65_AG, CC GYPSY66_AG, GYPSY67_AG, GYPSY68_AG and GYPSY69_AG, are other CC members of this same lineage in Anopheles gambiae. The CC GYPSY61-I_AG consensus was reconstructed after multiple CC alignment of 3 copies. The consensus encodes the 1490-aa CC GYPSY61_AGP gag-pol like polyprotein (pos. 68-4537). The CC sequence of the LTRs flanking GYPSY61-I is deposited as CC GYPSY61-LTR_AG. CC GYPSY61_AGP: CC MLNSDDPNSDQQPSRSSHESHRQSSVPDPSTAWIHEMFKQQQSILHQQQEAFMKQQESFLTRMM CC SSMNVREPNGPEFLVESLAKQVTEFRYDPEENVTFAAWYRRFEGLFEKDAAKLPDDTKVRLLLR CC KLGVAEHERYNSYILPQKASDFCFEETVVKLRALFGSKESCVSKRYKCYQMQKTSSEDYVTFSC CC RINKATEEAELAGLSIETQKCLIFVCGLKDEQDADVRMKLLSRMEEKKDITLAQLAELCEKLTN CC LKRDTTLLAGESKVQMIGQEKCRSTKGKWQGSRKLHQRPEKQSRVQCFLCGEGHWARKCTFKHH CC KCSKCFKFGHKEGFCNAASRWTNRMRQNRNVRSVTVNTVRSGRGCVRVVINGTPLEMMLDTGAD CC ITIISRRLWHHVGKPVLKPSMVKARTASGDLLHILGEFAGSMIVAEQCMSCVIRVTTADLALFG CC KDAMDIFNLWDVPLTSVCNRVGLDEPCSEILQKEFPKLFSGKLGCCTKAKIQLELKEGATPVFR CC PKRPVAYAMFQAVDKELERLENDGIISKVDYSEWATPIVVVRKSNGTIRVCGDYSTGLNDMLQP CC HQYPLPLPQDIFASLATCTIFSQIDLSDAFLQVEVEELCRKLLTVNTHRGLYAYNRLPPGVKTA CC PGAFQQLMEVMLAGLGGVAVYLDDIVVGGPNEEAHMKNLRAVLKRIEEYGFTIRLDKCSFQKKQ CC IKYLGHLLDSKGLRPDPARIEAILNLQVPTDVTGVRSFLGAVNYYGKFVRNISMLRHPLDSLLK CC EGASFTWTKQCQDAFDQFKAILSSDLLLTHYDPRQEIIVAADASSFALGATISHRFKDGSIKVV CC QHASRTLTKTEQKYSQPDREGLAIIFAVTKFHRLVYGRHFRLQTDHKPLLRIFGSKTGIPVYTA CC NRLQRYALTLLLYDFQLEYVPTDKFGNADVLSRLIAKHEKPEDDYVIASIEIEEDLRSVVNSVS CC KALPLKFSDVERETKNDAQLRKVYEFTRNGWPHAAVRDRTLSNFHSRRESLSTFGDGLLFGERL CC IIPSSQRYKCLQQLHLGHPGIERMKALARSYVYWPGLDSEIESLVKSCRQCAMAAKSPVSSLPI CC CWMKANAPWQRVHVDYAGPIEGDYFLLAIDSFSKWPEIIPTKTISSTATIRILRNIFARFGMPT CC VLVSDNGTQFTSADFAEFCNTNGVEHIRTAPFHPQSNGQAERFVDTFKRALKKIREGGSSVSEA CC 
LDTFLLTYRSTPSKLLEQKSPAELMFSRKIRTCLELLRPPQRLDPVSEGNPRKFKMNDLVYAKQ CC YRRNSWRWVPGMISGRIGRVMYEVTIEQNRKIRSHVNQLQKRVDRNNTSKVAAQLDNTSNTSLP CC LNVLLDAWNMVEEVHYPDAYQTTSQHVAETAKDCTSSSSSSCPTSSSSSSSSSSPSSSSSSSPS CC SSCPASPSSSSSPAPSSSSPAPSSSSPTSSSSSLQRSAKAFLSPTSPGFETADSGTPSPVQPLR CC RSSRVRRSPLWMAAYKRI. XX SQ Sequence 4551 BP; 1310 A; 951 C; 1162 G; 1128 T; 0 other; gtggcgacga ggagaaaaaa aaacgcctga caacgtagac acacaaaagc acgttagaag 60 atacacgatg ctgaactcag atgaccccaa cagtgaccaa cagccatcgc ggagcagcca 120 tgaaagccac cggcaaagca gcgtccctga tccgagtacc gcttggatac atgaaatgtt 180 caagcagcag cagagcatcc tacaccaaca gcaggaagca tttatgaagc agcaggaaag 240 ttttttaacc cggatgatgt cgtcgatgaa cgtgcgtgag ccgaacgggc cggagttttt 300 agtcgaatca ttggcgaaac aagtgacgga gtttcggtac gatcctgagg aaaatgttac 360 ctttgctgcg tggtataggc ggttcgaggg gctttttgag aaggatgctg ctaaattgcc 420 agatgatacc aaggtgcgat tgttgttgcg gaaacttgga gtggcagaac atgagcggta 480 caacagttat atattgccgc agaaagcaag tgatttttgt ttcgaggaaa cagtagtgaa 540 actgcgcgcg ctgtttggat caaaagaatc gtgcgtcagc aagcgatata agtgctatca 600 aatgcaaaaa acgagctctg aggattacgt tacgttttcg tgccggataa ataaggctac 660 cgaagaagcg gaactagccg ggctcagcat agagacacag aagtgtttga ttttcgtgtg 720 tggcctgaaa gatgagcagg atgccgatgt gcgcatgaaa ttgctgagcc gtatggaaga 780 aaagaaggat attactctgg cacaactagc ggagctgtgt gaaaaactaa cgaacctcaa 840 aagagataca accttgcttg caggtgaaag taaagtccaa atgattggcc aagaaaagtg 900 tcgaagcact aagggaaagt ggcaaggtag tcggaagcta caccaacgtc cggaaaagca 960 gagtagagtg caatgttttt tatgcggcga aggccactgg gctagaaaat gcacatttaa 1020 gcatcataaa tgcagcaagt gttttaagtt tggacataag gaaggatttt gcaatgcagc 1080 gtctcggtgg acgaacagaa tgaggcaaaa tcgcaatgta agaagcgtga cggttaatac 1140 ggtcaggagt ggacgtggat gcgtcagagt tgtcatcaat ggcacaccgt tggaaatgat 1200 gcttgataca ggcgctgata taacgattat ttcgcgtaga ctgtggcatc atgttggaaa 1260 acctgtattg aaaccttcaa tggttaaagc gagaactgca tctggtgatt tgttgcacat 1320 cttaggcgaa tttgctggtt ccatgatcgt agcggaacaa tgcatgagtt gcgtgattcg 1380 agtaactacg gcggatttgg 
cactttttgg gaaagatgcc atggatatat tcaacctttg 1440 ggatgtgccg ctgacgtcag tttgtaatcg cgttggtttg gatgaaccgt gcagtgaaat 1500 attacagaag gagttcccaa aacttttttc cggcaagcta ggttgctgca ctaaggcgaa 1560 aattcagttg gagctgaaag aaggtgcaac gcctgttttt cgtccgaagc ggcctgtagc 1620 ctatgcgatg tttcaagcgg tggacaagga gcttgaacgg ctggaaaacg atggaataat 1680 ttccaaagtg gattattcag aatgggcgac gccaattgtt gttgtgcgta agtcaaacgg 1740 aactatacga gtgtgcggag actactcgac gggcttgaac gacatgctgc aacctcacca 1800 atacccgctt cctcttcctc aagacatatt tgcaagtctg gccacctgta cgatttttag 1860 ccagatagac ctttcggatg ctttccttca agtagaggtc gaggaacttt gccggaaact 1920 gctaactgtt aatacacata gaggtttgta cgcgtacaac agattacccc caggggtgaa 1980 aacagcaccg ggtgcttttc agcagctgat ggaggttatg ctggcaggtc tgggtggcgt 2040 tgctgtttat ttggacgaca ttgtggttgg tggaccaaat gaagaagccc atatgaaaaa 2100 tcttcgagct gtgcttaaaa gaattgagga atatggcttc accatacgat tggataaatg 2160 ttcgttccaa aagaagcaga taaagtactt gggccatttg ttagattcga agggacttcg 2220 accagaccct gctaggattg aagctatcct caatttgcaa gttcctactg atgtaacagg 2280 agtgagatcg ttccttggag cagtgaacta ttacggcaag tttgtaagaa acatcagtat 2340 gctacgtcat ccgctggata gtcttcttaa ggaaggagcg agttttacgt ggacaaaaca 2400 atgtcaagat gcttttgatc agttcaaggc aattctttca tcagacctgc tcctgactca 2460 ctacgatccg cgtcaagaga taatagttgc ggcggatgcc tcttccttcg cattaggagc 2520 gactattagt cataggttca aagacgggtc gatcaaggtt gtgcagcatg catcacggac 2580 gttgaccaaa acagaacaaa agtatagtca accagatcga gagggtttag ctattatatt 2640 cgctgtgacc aaatttcaca ggctggttta tggaaggcat tttcgactac aaacggatca 2700 taagccgctg ctgcgaatat ttggatccaa aactggtatt ccagtgtata ctgccaaccg 2760 gttacagcgt tatgcattaa cgctgcttct gtatgatttt caattggagt acgtgcctac 2820 agacaagttt ggtaatgccg atgtgttgtc gagactaata gctaaacatg aaaaacctga 2880 agatgactac gtgatcgcaa gcattgaaat agaagaagac ttacgatctg ttgtgaacag 2940 tgtgagtaaa gccctaccgc tgaagttcag tgatgttgaa cgcgaaacga aaaatgatgc 3000 tcaattgcgg aaggtgtatg aattcactag aaatggatgg ccacatgcag cagtgagaga 3060 caggactttg agcaatttcc acagtaggcg 
tgagtcctta tctacatttg gtgatggtct 3120 cctttttgga gaaaggttga taataccatc gtcgcagcga tacaaatgtt tgcagcagct 3180 acacctaggc catccaggga tagaacggat gaaggccctg gctcgaagct atgtgtactg 3240 gcctggcctt gattcggaga ttgagagcct tgttaaatcg tgtcgacagt gtgcaatggc 3300 agccaaatca cctgtatcca gtttaccaat ctgttggatg aaagcgaatg cgccgtggca 3360 gcgagtgcat gtggactatg ctggtcctat cgagggtgat tattttctgc tagcgatcga 3420 ctctttttca aagtggccgg aaatcatccc tactaagaca atatcatcaa cagccaccat 3480 tcgcattctt cgcaatatat ttgcacgttt cggtatgcca actgtcctcg tgagcgataa 3540 tggaactcaa ttcacaagcg cagactttgc agagttctgt aacacaaatg gtgttgagca 3600 tatccgtacg gcaccctttc atccgcagtc gaacggacag gcggaaaggt tcgtcgacac 3660 cttcaaaagg gcgcttaaga agataagaga agggggaagt agcgtatcag aagcattaga 3720 tacctttcta ttgacataca ggagtacgcc gagcaaactg ttggagcaga aatcgcctgc 3780 tgaattgatg ttcagccgaa agataagaac ttgcttggaa ctgctacgtc caccacaaag 3840 attagatcct gtatcagaag gtaatccgag aaaatttaag atgaacgacc tggtgtatgc 3900 aaagcaatat cgtcggaata gctggagatg ggttccaggt atgatcagcg gacgcatcgg 3960 tagagtgatg tatgaagtaa ctattgagca aaacaggaag atacgttcac acgttaacca 4020 gttgcagaaa cgagttgata gaaataacac cagtaaggtc gcagcccagc tggataacac 4080 ttcaaatacg tcgttgcctt tgaatgtgtt gttagatgca tggaacatgg ttgaggaagt 4140 ccactatcca gacgcttatc aaactactag tcaacatgtg gcagaaactg caaaagattg 4200 tacgtcttct tcatcgtcat cgtgcccaac gtcatcatca tcatcatcat catcatcgtc 4260 cccatcttca tcgtcatcgt catcaccatc gtcatcgtgc ccagcttcac cgtcatcgtc 4320 atcgtctcca gccccatcgt catcgtctcc agccccatcg tcatcgtccc caacttcatc 4380 gtcatcatca ttgcagcggt ctgcaaaagc gttcttgtca ccgacctcgc cagggtttga 4440 aacggctgac agtggcacgc cttcgcctgt acaacctcta cgtcgatctt cgcgggttcg 4500 aagatcacct ctgtggatgg cagcgtacaa gcgaatctaa agaagggggg a 4551 // ID GYPSY13-LTR_AG repbase; DNA; ANG; 424 BP. XX AC . XX DT 03-OCT-2003 (Rel. 8.09, Created) DT 03-OCT-2003 (Rel. 8.09, Last updated, Version 1) XX DE GYPSY13-LTR_AG is an LTR of retrotransposon GYPSY13_AG - a DE consensus sequence. 
XX KW Gypsy; LTR Retrotransposon; Transposable Element; KW 4-bp TSD GYPSY13_AG; GYPSY13-I_AG; GYPSY13-LTR_AG; Gypsy clade; KW mdg1 lineage. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-424 RA Tubio M.J., Costas C.J. and Naveira F.H.; RT "GYPSY13_AG, a member of the mdg1 lineage of the Ty3/gypsy group RT of LTR retrotransposons in Anopheles gambiae."; RL Repbase Reports 3(9), 169-169 (2003). XX DR [1] (Consensus) XX CC GYPSY13-LTR_AG is a long terminal repeat of GYPSY13_AG (its CC internal CC portion is deposited as GYPSY13-I_AG). XX SQ Sequence 424 BP; 171 A; 62 C; 79 G; 112 T; 0 other; tgtagcgcct gcaagatagg caaaacgcgc acagaataat tatagctagc taggtttaaa 60 ttatacacat atacacttac acacatacac atagttagat tttaattatg catatatata 120 tttacacaca tacatctaca attaaggtta cacgtaggat agaatagaat agcataagat 180 aggaaatagg aaatagtaaa ataggcgata attaggatta gccaatgagg aataggatta 240 aggaattttg gcgcgaacga gtgagtataa attaaggata agttgaggtt taggatcatt 300 aacaaaagtc cgatctaaga ggaaacgctg cccgagttaa catagaagaa ataaactcca 360 agtaactaca caaattcaag ccacattatt cttttgggta gttagaagat ttcatatcac 420 taca 424 // ID TRANSIB2_AG repbase; DNA; ANG; 2542 BP. XX AC . XX DT 29-JAN-2002 (Rel. 7, Created) DT 21-JUL-2005 (Rel. 10.08, Last updated, Version 2) XX DE TRANSIB2_AG is a TRANSIB-like DNA transposon - a partial DE consensus. XX KW Transib; DNA transposon; Transposable Element; KW TRANSIB superfamily; TRANSIB2_AG; transposase. XX NM TRANSIB2_AG. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-2542 RA Kapitonov V.V. and Jurka J.; RT "TRANSIB2_AG."; RL Direct Submission to Repbase Update (27-DEC-2002). XX DR [1] (Consensus) XX CC TRANSIB2_AG is a young Transib-like DNA transposon. 
CC Its copies are only ~1% divergent from the consensus sequence. CC The consensus is incomplete at its termini and encodes a 662-aa CC TRANSIB2_AGp transposase. XX FH Key Location/Qualifiers FT CDS join(315..1889,1893..2300) FT /product="TRANSIB2_AGp" FT /note="transposase" FT /translation="TFYYKDDFLQIQRYRKTIRIQKPTDYNVNKVLQYISS FT VININEIEKQRLVGNLKYLYYRFRTLFAKSSRNMLRFENKNRPWLALELKV FT SNVNLSDSSRGRPKKPFAELSIRNKRRFVANEQKSSVDIEQELYRVRLLAY FT REKNCNLMAVIDKLLSHPENVFESIKNCGKSVSLEESLALFIDNRWSKAQY FT INMYQKTKNMFPSYTALSNFKKTCSPCEDFINVSETKASVGLQAVLNHTAS FT RIINMKKDKIIQNFDSENVSFKNINLLCSWGIDGSTGHSNYQQKFDGVNES FT MVTDSELLVTAFSPIRLAQSENDGNIFWLNLLPQSTRFCRPLAIEYVKESK FT EKVLESINFIKTEISNLIPFKIDLSETKYVIITYSFYMSMIDGKVLAYVTN FT TSSMQCCCICGAAPNEMNSKDNLENGFLAREESLHYGISPLHCWMRFFECL FT LHISYRLEFKQWKVTKNFKDIFTQRKKSIQQKIYEEFGLRVDEPRPMGANS FT TTGNVCRRAFSDVTKLSRILEIDEQLISRKNILIAINCSQPIKPHALSLYC FT KDTYSIYLNNYSWFKMPSTVHRVLAHIGEVILRAPAPIGALGEEAAEGRHK FT LYRQDREIHARKNSRINNLKDIFMQALYSSDPYISSISLDKRLQKTSKNQY FT PDEVKQFFPEFL" XX SQ Sequence 2542 BP; 926 A; 397 C; 463 G; 756 T; 0 other; gaatgtcaac aaatgataat tagtaaatat tttaattatg aatgaacatg aacttcatta 60 aatttatgaa ttatatacgt ttaaaatggt tatctaaaca ccaaaacaaa tatttgctgc 120 aaatttattg attcattaac tgccgattta aactccataa tgattggaat cgggactgac 180 attatgacag atcaaatgtc agactggtgt atcccacgag accccccccc cctccgaaga 240 ggggaacaga aaccgccatg tttacagttt ggtatgtgca agcttatcta gtagtttgtt 300 ggtgcataat ataaactttt tattataaag atgacttcct tcaaatacaa cgatatcgta 360 aaacaattcg aatacaaaaa cctactgatt ataacgtcaa taaagtgcta cagtatatat 420 caagtgtgat taacattaat gaaattgaaa aacaacgact agtgggcaat ttaaaatatc 480 tttactatcg gtttcgtact ctttttgcca aaagctctag aaatatgtta cgttttgaga 540 ataagaatag gccgtggctc gctttagaac ttaaggtatc caatgtaaat ttaagtgata 600 gttctcgagg tagaccaaaa aaaccattcg ctgaactttc catacgaaat aagcgacgtt 660 ttgttgctaa tgagcaaaaa agcagtgtag atatagaaca agagctgtat cgtgtccgtc 720 ttttagcata tagggaaaag aattgcaatt tgatggccgt tattgataag ttactcagtc 780 atccagaaaa tgtttttgag 
agcattaaga attgtggcaa aagtgtatct ctagaagaaa 840 gtttagcgtt attcatagat aatagatggt caaaggcaca atacataaat atgtaccaaa 900 aaacaaaaaa tatgttccct tcttacacgg ctttaagcaa ttttaagaaa acttgttcac 960 catgtgaaga ctttattaat gttagtgaaa ccaaagccag cgtaggactg caagcagttt 1020 tgaaccatac ggcatctaga attatcaaca tgaaaaaaga taaaattatt caaaattttg 1080 atagtgaaaa cgtaagcttc aaaaacataa atttattatg ttcttgggga atcgacgggt 1140 caactggtca cagtaattat cagcagaaat ttgacggggt gaacgaaagc atggtaacgg 1200 acagtgaact gctagtaact gctttcagtc caataagact agcacaaagt gaaaatgatg 1260 gaaatatttt ttggttaaat ttgctgccac agagcactag attttgcagg ccgttagcga 1320 tagagtatgt aaaggaatct aaagaaaaag tattagaaag tattaacttc ataaaaaccg 1380 aaatttcgaa tttgattcca ttcaaaattg atttgagtga aacaaaatat gtaattatta 1440 catattcttt ttatatgagc atgatagatg gaaaagtatt ggcctatgta acaaatacaa 1500 gttcgatgca atgttgttgt atttgtggag ctgctcctaa cgagatgaat agtaaggata 1560 acttagaaaa cggattttta gctagggaag aatcgcttca ttacggaata tcacctttgc 1620 attgttggat gcgctttttt gaatgcctat tacacatttc gtacagactt gaatttaagc 1680 aatggaaagt aacaaaaaat ttcaaagata tatttactca gcgtaagaaa agcatccaac 1740 agaagatata tgaagagttt ggtttgcgag tggatgaacc aaggcctatg ggtgctaaca 1800 gcacaactgg aaatgtatgc cgtcgtgcat tttctgacgt gactaaactt agtcgtatat 1860 tagaaattga tgaacaactt atcagtagat aaaaaaatat tttgattgcc atcaattgtt 1920 cgcagccaat aaaaccgcac gccttaagtt tatactgcaa agacacatat tcaatttatt 1980 taaacaatta cagttggttt aaaatgcctt caacagtgca ccgcgtactt gcacatattg 2040 gagaagttat attacgagcc ccagcaccaa taggcgctct aggagaagaa gctgctgagg 2100 gtcgacataa actgtataga caagatcgtg aaattcacgc gagaaaaaac tcaagaatca 2160 ataatctaaa agatattttt atgcaagccc tttattcttc agatccctac attagttcca 2220 tttccttaga taaacgcttg caaaaaacat caaaaaatca atatccagat gaagtaaaac 2280 agttttttcc agagttttta taacactagt tggtctagtt ggcacatatt gagtatgcaa 2340 atgatgtttt agagtgatat caatgaagga gaagaagaga atcaagaatt tcaaaatact 2400 agcactacta aatactactg cacgacaatg ttgaagaatt atgaaagctt ttaaactatg 2460 cagaactatt actattaatt taaagaccct 
gaagtacttg ttgaattgtt gaattgagaa 2520 agtctaccga tgttgtgcat tg 2542 // ID AGAM2 repbase; DNA; ANG; 1751 BP. XX AC . XX DT 14-SEP-2004 (Rel. 9.08, Created) DT 28-FEB-2009 (Rel. 14.03, Last updated, Version 2) XX DE Anopheles gambiae Agam 2 non-LTR retrotransposon - a consensus. XX KW CR1; Non-LTR Retrotransposon; Transposable Element; KW reverse transcriptase; endonuclease; pol-like domain; AGAM2. XX NM AGAM2. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-1751 RA Cook M.J., Martin J., Lewin A., Sinden E.R. and Tristem M.; RT "Systematic screening of Anopheles mosquito genomes yields RT evidence for a major clade of Pao-like retrotransposons."; RL Insect Mol. Biol 9(1), 109-117 (2000). XX RN [2] RP 1-1751 RA Gentles A. and Jurka J.; RT "Anopheles gambiae Agam 2 non-LTR retrotransposon - a RT consensus."; RL Direct Submission to Repbase Update (30-JUN-2004). XX DR [2] (Consensus) XX CC 99% average similarity to consensus. 
XX SQ Sequence 1751 BP; 447 A; 428 C; 353 G; 523 T; 0 other; ccaactttat ccgtctagct gaactgcttc aatctactga ctggtctttc ctggaaggta 60 atccggatgt taactcagca ttggaactat ttaacacaac actactggcg cttatatctg 120 attgctgtcc tctaatgcca ccgcgtcgca gtccaccatg gtctgatgca cgtttacgcc 180 gtctaaagca aaacaaagcg tcatgctttc gcttttacag tacaaatggt acacaacact 240 ccaagttaag gttcattgct gcccataatg tatatagaag ctataacaga caacttcata 300 ctcgttatct cattcgagtg aaattctcgc tcatcagaca cccaacacga ttctggaggt 360 atgttgaatc aaaacgcggt aacagctcgc tccccgatgt tctcacatat aacaacgtct 420 ccacaagcag taaggaaggc atgtgcaact tatttgctga tcgtttcaag gactgcttta 480 ctacggatga tgcaagctta tcattagaag cagctttaaa taatgtgccc cgtgatgttg 540 ttgacattga tgtacgcgat attattatct cggtagatac tgtactacgt gctctgcagc 600 aggtcaaaac ctcatacaac cctggacccg atggtattcc tacggcaatc cttgccaaat 660 gccgcgaatt tttagcagag cctctgtctc aaatctacca actctctttt gcacaaagta 720 ctgtacccac ggcctggaaa tcctctgtga tgtttccggt gtacaaaaaa ggagataaaa 780 actctgctga gaactaccgc ggtataacca ccttgccttc ttgtgccaag gtgtttgaga 840 tcgtcataca aaactcgcta atgtatcact gtcgttctta tatttctaca cgccagcatg 900 gtttctttcc tcgacgcagt gttaccacaa acctggtgga attcgtctcc aactgccatg 960 cagcctttac ttccggagct cagatggatg cagtatacac tgatcttaag gctgcgtttg 1020 atcgtgtgaa ccatcgcttg ctgttggcta agctcgcccg gatcggtctc tctactccgc 1080 tggtgaattg gttcaggtcc tatatctctg aacgtagcta ctacgtacaa atcgatggtg 1140 tctcctctaa cgttttcgag agttcatctg gcgtccctca gggcagcaac ttgggaccac 1200 tgctgttctc gctttttatt aacgacgtca cactggccat tacggaagca gattgtctgc 1260 tttatgcgga tgatgttagg ctgtttcgta tcgtacggaa cacttccgac tctctttctc 1320 tgcaaagatc gattgatgtt ttctctgact ggtgtatcaa caacgacctg ctaatttctg 1380 ttgataagtg tacgtcaatg tctttcttta gaatagctag tccgataagg tatatctata 1440 gcatatcagg gacacaacta ccgcggtgca atacggtcag ggatttagga gtcaccctgg 1500 atcgcaaact agatttccga caacattact gtgatatttt agacaaagct aacaaaatgc 1560 taggatttat tcgtcgacat tcgagagaac ttaatgaccc acactgcctg ttaactctgt 1620 ataagtccta tgttcgttcc atcctcgagt 
ttagttctac agtttggtgt ccgttctcta 1680 gtgtttggtc caatagaata gaagctgtcc aaaagagagt tactcgtatt gtcctacact 1740 tcactccgtg g 1751 // ID GYPSY54-I_AG repbase; DNA; ANG; 5874 BP. XX AC . XX DT 16-APR-2004 (Rel. 9.03, Created) DT 16-APR-2004 (Rel. 9.03, Last updated, Version 1) XX DE GYPSY54-I_AG is an internal portion of retrotransposon GYPSY54_AG DE - a consensus sequence. XX KW Gypsy; LTR Retrotransposon; Transposable Element; 4-bp TSD gag; KW AP protease; GYPSY54-I_AG; GYPSY54-LTR_AG; Gypsy clade; KW mdg1 lineage; RNase-H; reverse transcriptase; KW integrase GYPSY54_AG. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-5874 RA Tubio M.J., Costas C.J. and Naveira F.H.; RT "GYPSY54_AG, a member of the Mdg1 lineage of the Ty3/gypsy group RT of LTR retrotransposons in Anopheles gambiae."; RL Repbase Reports 4(3), 100-100 (2004). XX DR [1] (Consensus) XX CC GYPSY54_AG is a family of Gypsy-like LTR retrotransposons CC that, according to the amino acid sequence of its Reverse CC Transcriptase, CC RNase and Integrase is CC phylogenetically grouped with representatives of the Mdg1 CC lineage of other organisms. CC GYPSY8_AG, GYPSY9_AG, GYPSY10_AG, GYPSY11_AG, GYPSY12_AG, CC GYPSY13_AG, GYPSY14_AG, GYPSY15_AG, GYPSY16_AG and GYPSY17_AG CC are other members of this same lineage in Anopheles gambiae. CC The GYPSY54-I_AG consensus was reconstructed after multiple CC alignment of 4-7 copies. CC The consensus encodes the 440-aa GYPSY54_AG1p gag-like CC polyprotein (pos. 305-1624) and the 1183-aa GYPSY54_AG2p CC pol-like polyprotein (pos. 1558-5106). CC The sequence of the LTRs flanking GYPSY54-I_AG is deposited as CC GYPSY54-LTR_AG. 
XX FH Key Location/Qualifiers FT CDS 1558..5106 FT /product="GYPSY54_AG2p" FT /translation="IFTRAFSIYLPKASISKHAYKKLIQGTLVDFNTKTSV FT PLKQNNMFSVGLQTETCKETKQTLHSTPIDECESLNIECTRDGRLMAKILS FT QNVNKKMVNFLIDSGACFNALSLDIAKELGLDFMNHNDKVTLSSFDGSLSN FT TEGSMFLKLHIGNFVYPIKFYILEKLSVPAIIGADFLKKYTMCIGSNFQII FT YLKKPLNTEPLNTADENIKKNMDELEQIDGTLFENSDNTQKENFELLPHIE FT KQLADCDVVKTSKDRELVGSERYKELMDKINLNHLNNATKINTKKIIADFS FT DIFYLPGDKLTYTQAATHEIETSSNIPIFKRQYRFPTALNMIMEEQIEEML FT KQHIIRPSKSPWNAPVMCIPKKSENDQKKYRIVVDFRALNIITKPFIYPIP FT RIDDIMDNIGNSKIFSTIDLKSGFFQIPIAPKDAEKTAFSTSKGHYEFLRM FT PMGLKNSPATFQKLMNTVLYTIQPIKAFVYLDDIVVFGDTVEEHNNTLTKV FT LEALKCHNLKIEPAKCKLLHTEITYLGHTINEKGIKPTQLNVKAILNMQTP FT KNLKEVRSFLGTINFYGKFIPNVSDIRKPLHELLKKNVKFIWNERCDEAFT FT KLKSLLISEPLLVRPDFGDVFVITTDASDYALGAVLTNEKTIERPIAYASR FT TLVGAEKRYHPIEKELLAIVWAVDHFKHYIYGQKFIIYSDHRPLVSIWRLK FT ENSPVLSRLRLRLQGLECEIRYKKGKENIVADFLSRLPEVKENEHDEEVNN FT TVAVVTRGQTKKKVVLELNNDNGEERIEASKSTTLQDDHWENLMKIDLEKE FT TSEKEAKRKEQLEKVKQTVVTKFQNYNPFSDKIEIERSEINTVLHEFHDAP FT LGGHVGVRRMKKRIKELFFWKGMDRDIEGHVKGCKSCQLNKIGRFNKIPMQ FT ITTTSKQAFEKMFMDIVVLPESERGNKYGLVIQDDLSRYLIVTALENQEAT FT TVAKAFVEEVICRYGAPVEIVTDQGTNFMSQIMKETCKILKIKKINTSAYH FT PQANLVERANRELKIYLRQFVGKNYQQWDSVLPYFAMEYNTSINSSTGFTP FT YELVFGRKARLPTSIYKEKNRKKLYSDFCEELITKLTEIHSIARNNLINSK FT EKRKEKYDRNAIDWQPMWGEMVLVRNNQTGVGQKLQGIWKGPYEVVGIPSE FT QTCEIRNGRKIEKVHNNRLKKFNE" FT CDS 305..1624 FT /product="GYPSY54_AG1p" FT /translation="MNPRVLALRNRSVQVHSKSVNNINPSLPQLNSTANSE FT SCLNTQYNYQKRLQILKGELLNQSLEIDRALNASFSKMAYEFPTEKAMLCI FT PEYHGAAKELDGFLFQVEYFANQIPRGESEEDLIRTVMFKLKGNAAGFFSR FT ILDDTWANVKLNLIKLFGEKMSVEAIFQQVETLQQGVNEPFTQYKERVARL FT KQNIMNIDTKNSEDSYAQKNLKIHFLAGLRNPELQTLARTNKHLEFDDLLE FT YLEDEVVQIEHIRKIEARLNRSQTPVPNNEYDETYNASQNNSFNNRSMGHN FT TTYGGNQPHTLMNNSSASRCDHNNTNNLDTCDFGQPQHNSTTHNQNNTNFN FT YDGQPQHNSTTHNQNNTYFNSSLAQPNRYFASYKNNYTQHPRKHYYQNRNQ FT YTQPPQHHYENTCPPSSQNYYAHKFSQGPSQYTYQKQAYQNMHTKN" XX SQ Sequence 5874 BP; 2297 A; 949 C; 1097 G; 1531 T; 0 other; tctggtgaca gcggtaaagc 
gcatacttag tagcgcaaat aacgctaatt ctggctggtg 60 tatgaatcag gtatagtggt gaaggaaaaa aaaatttttt ttgggtaagt ggtgaaggga 120 aaatcgtagc ttgttagtga ttaagtgata attagtgaaa taagtgtaaa ttttcatttt 180 tttttctcaa caagtttcct taaaagtatc aacttagtaa taataagcag taagaggtag 240 ttggtacagt agtttaagta tcatttttat taatccttca ttctttacgc ttacgtagag 300 ggtaatgaac ccaagggtgt tggcacttag gaacagatct gttcaggttc attcaaaaag 360 tgtaaataat attaatccat cattaccgca actaaactcc acggcaaata gtgaaagttg 420 tctaaacaca caatataatt accaaaaacg cctccaaatt ctcaaaggag aattattaaa 480 tcaaagcctt gaaatcgata gagcgttgaa cgcgtcattt tcaaaaatgg cttacgaatt 540 cccaacggaa aaggcaatgc tatgcattcc tgaatatcac ggagctgcaa aggaattaga 600 tggattttta tttcaagtgg aatattttgc gaatcagata ccaagaggcg aatcggaaga 660 ggatttaatt agaacggtaa tgttcaaatt aaaaggaaat gctgctggat ttttcagtcg 720 catattagat gatacgtggg cgaatgttaa attaaattta ataaagcttt ttggagaaaa 780 aatgagtgta gaagccattt tccaacaagt ggaaactttg cagcaaggag tcaatgaacc 840 attcactcaa tataaagaaa gggtggcaag gttaaaacaa aatataatga acatcgatac 900 gaaaaacagt gaagattctt acgcacaaaa aaatttaaaa attcactttt tagccggatt 960 aagaaaccca gaactgcaaa ctttagccag aaccaataaa cacttggaat ttgatgactt 1020 gttggaatat ctggaagatg aagttgttca aatcgaacat attcggaaaa ttgaggctag 1080 actaaatcgc tcgcaaacac cggtaccaaa taatgaatat gatgaaacct acaatgcgag 1140 tcaaaacaat agttttaata atcgttcaat gggccacaac acaacgtatg gaggaaacca 1200 gccccataca ctaatgaata atagtagtgc ctcacgctgc gatcataaca acactaacaa 1260 tttagatacc tgtgattttg gtcaaccaca acataatagc actacacata accaaaataa 1320 cactaacttc aattatgatg gtcaaccaca acataatagc actacacata accaaaataa 1380 cacttacttc aacagtagtt tagctcagcc aaatagatat tttgcatctt ataaaaataa 1440 ttatacacaa catccacgaa aacactatta ccaaaataga aatcagtaca cacaaccacc 1500 acaacaccat tatgaaaaca cctgtccacc atcatcacaa aactattatg cacataaatt 1560 ttcacaaggg ccttctcaat atacttacca aaagcaagca tatcaaaaca tgcatacaaa 1620 aaactgatac aaggaacact agttgatttt aacactaaaa ctagtgttcc cttaaaacaa 1680 aataacatgt tttctgttgg tttacaaaca gagacgtgta 
aggagacgaa gcaaacacta 1740 cacagtactc ccatagatga atgtgaatcg cttaatattg aatgcactag agacggtagg 1800 ttaatggcta aaattttgtc gcaaaatgta aataagaaaa tggtgaattt tttaattgat 1860 tcgggagcct gtttcaacgc cttgagtttg gatatagcaa aagagctagg tttagatttt 1920 atgaatcaca atgataaagt aactttatca agtttcgatg gatccttatc caatactgaa 1980 gggtccatgt ttttaaaact acatattggc aattttgtat accctataaa attttatata 2040 ctagaaaaac ttagcgttcc agctataatt ggagcagatt ttttaaaaaa atatacaatg 2100 tgcatcggat cgaattttca aataatttat ctgaagaagc cattaaatac tgaaccatta 2160 aataccgcag acgagaatat aaaaaaaaat atggatgaat tagaacaaat agacggtacc 2220 ctttttgaaa acagtgataa cactcaaaaa gaaaatttcg aattgctccc acacatagag 2280 aagcaactcg cagattgcga cgtagttaaa acgagcaaag atagggaatt agtaggttca 2340 gaaaggtata aggaattgat ggataaaatc aatttaaatc atttaaataa tgcaacaaaa 2400 atcaacacaa aaaaaattat tgctgacttc agtgatatat tctacctccc gggagataaa 2460 ttaacataca ctcaagcagc gacacatgaa atcgaaacta gttccaatat cccgatattt 2520 aaaaggcaat acaggttccc aacagcatta aacatgatca tggaagaaca aatagaagaa 2580 atgttaaagc agcatataat caggcctagc aaaagcccgt ggaatgcgcc ggtaatgtgc 2640 ataccaaaaa aatccgaaaa cgatcaaaaa aaatatcgga ttgtcgttga tttccgtgcg 2700 ctaaatatta tcactaaacc attcatatat ccgattccta gaattgatga tataatggac 2760 aatattggta acagcaaaat attttcaaca attgacttaa agtcagggtt ttttcaaata 2820 cctatagcgc caaaagatgc tgaaaaaacc gccttttcaa catctaaggg gcattatgaa 2880 ttcttaagga tgccaatggg gctcaaaaat agcccggcta cgtttcaaaa acttatgaac 2940 acggtgctct acacaattca acccattaaa gcctttgtct atttagatga catcgttgta 3000 tttggggata cagtagagga gcataacaat actttaacta aagtattaga agcattgaaa 3060 tgccataatt taaaaattga accggcaaaa tgtaaactgc ttcacacaga gattacttat 3120 ttaggtcaca caatcaatga aaaaggcatc aaaccaacac aattaaatgt aaaggcaatt 3180 ttaaatatgc aaacgccaaa aaatcttaaa gaagtacgtt catttttagg aactattaat 3240 ttttatggaa aattcatccc gaacgtctct gatatacgta aaccattaca tgaattattg 3300 aagaaaaatg tgaaatttat atggaacgaa agatgtgatg aggcatttac aaaattaaaa 3360 agtttgctta tttctgagcc tttgttagta cgtcctgatt ttggagatgt 
ttttgttata 3420 actactgatg ccagtgatta tgctctagga gcagtattaa caaatgaaaa aacaatagaa 3480 cgcccaattg catatgccag tcgtactctg gtaggagcag aaaagcggta tcacccgatt 3540 gaaaaggagc ttctagctat tgtttgggca gttgaccatt ttaaacatta tatttatggt 3600 caaaaattta ttatttattc cgaccacagg ccattagtat ccatttggcg tctcaaggaa 3660 aattctccag ttttgtctag actgagattg agactgcaag ggttagagtg tgaaattcgg 3720 tataagaaag gtaaagagaa tatagttgcc gatttcctat cacggcttcc cgaggttaaa 3780 gagaacgaac atgatgagga agtaaataat acggttgcag tggtaacacg cgggcaaacg 3840 aaaaaaaaag tagttctcga acttaacaat gataacgggg aagagagaat agaagcaagt 3900 aagagtacta ctttacaaga tgatcactgg gaaaacttaa tgaaaattga tttggaaaag 3960 gaaacaagcg aaaaagaagc taagagaaaa gagcaattag aaaaagttaa acaaactgtc 4020 gtaaccaaat tccagaacta caacccattt tctgataaga ttgaaattga gagaagcgaa 4080 ataaatactg tgcttcatga atttcatgat gccccattgg gaggacatgt gggagtgaga 4140 cggatgaaga agaggatcaa ggagttgttt ttctggaagg gaatggacag agacattgag 4200 ggacacgtta aaggttgcaa gtcgtgtcaa ttgaataaaa ttggccggtt caataaaatt 4260 cccatgcaga taacaacaac gtcaaaacag gcatttgaaa aaatgttcat ggatattgtt 4320 gttctgccgg aatcagagag agggaacaag tatggattag tcattcaaga cgatcttagt 4380 agatatttaa tagtaacagc cctggaaaac caagaagcaa caacggtggc aaaagcgttt 4440 gttgaggagg ttatttgccg ttatggagct ccagtagaga tagtaacgga ccaaggcaca 4500 aatttcatga gtcagattat gaaagaaaca tgcaagatct taaaaataaa aaaaataaat 4560 acatcagcat accacccaca ggcaaattta gtagaaaggg ccaacaggga actcaaaata 4620 tatttgagac aatttgtggg caaaaattat caacagtggg acagtgtgtt gccatatttt 4680 gcaatggaat acaacacaag tataaattcg tctacgggat ttactcctta tgaattagta 4740 ttcgggagaa aagctagact tccaacttca atttataaag aaaagaatag gaaaaaatta 4800 tatagtgatt tttgcgaaga gcttattaca aaactaacag agattcattc aatagcaaga 4860 aataatttga taaattcaaa agagaaacgc aaggagaaat atgataggaa tgcgattgat 4920 tggcaaccca tgtggggaga aatggtatta gttaggaata atcagacggg ggttggacaa 4980 aaattgcagg gtatctggaa agggccttat gaagtggtag gaataccgag cgagcaaact 5040 tgcgaaataa ggaacggaag aaaaatagaa aaagttcata ataataggtt aaaaaagttt 
5100 aacgagtaac ataaggcatt caaaaaaaaa ataataataa taaaaaccta ggttaagtta 5160 gtctaacaaa caacaagtag atttgtaaat agatatatat gaacacctta taattagcca 5220 aattttcaat cgtagtatca taaattcgcg aataagttta gttatagagt tttcatatat 5280 atctatacat agaaattttt gtaaatagaa ataaggattt aaaaaaaaaa agataaaggc 5340 gactatagct aacattcccg tgagcaatcg atacacattc cctagcgata cacactataa 5400 gcgcaaacac atatcgacac tgaaggcacc aacgaaaagc acacggcata aaaggcatgt 5460 taaaacaaca cctaaactaa acaaggcacg cgaaatgaaa ggttgagctg gcgaggttga 5520 gcggatgagg ttgaatgatt gatagaaacc gaattgctcc cccagcgttt gttataagcg 5580 ataagtcccc aagccagaac agaacatctg aaatacgaaa taggtcaaat tagtaaaatt 5640 tagggtacga taatattaca ttcaaacccc aaacgacact atttacgtat attgcgagca 5700 ccagcaactt cagatgatta atagcgagac tgcaccacga gggtcgttca ggcggaaatg 5760 atgcgaaaac cagagatatc aattgataag gcgatatcga tttgcatgaa aaaaacggca 5820 ctataaaaaa ttttcttgca gcactttcgt taaaaaaaaa aaagtgtaat gata 5874 // ID BEL5-I_AG repbase; DNA; ANG; 5282 BP. XX AC . XX DT 08-MAY-2003 (Rel. 8.04, Created) DT 21-JUL-2005 (Rel. 10.08, Last updated, Version 2) XX DE BEL5-I_AG is an internal portion of the BEL5_AG LTR DE retrotransposon - a consensus sequence. XX KW BEL; LTR Retrotransposon; Transposable Element; 5-bp TSD; KW BEL5-I_AG; BEL5-LTR_AG; BEL5_AG; Bel clade; RING Zn-finger; KW integrase; peptidase; reverse transcriptase. XX NM BEL5-I_AG. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-5282 RA Kapitonov V.V., Pavlicek A., Drazkiewicz A. and Jurka J.; RT "BEL5_AG, a family of Bel/Pao-like LTR retrotransposons from RT African malaria mosquito."; RL Repbase Reports 3(4), 69-69 (2003). XX DR [1] (Consensus) XX CC BEL5_AG is a young family of Bel/Pao-like LTR retrotransposons. CC BEL5-I_AG, an internal portion of BEL5_AG, is flanked by CC BEL5-LTR_AG CC LTRs. 
The BEL5-I_AG consensus sequence was reconstructed based on CC multiple alignment of 20 copies; they are ~1% divergent from CC the consensus sequence. CC The consensus sequence encodes one 1723-aa BEL5_AGp Bel-like CC protein CC (pos. 93-5261). CC BEL5_AGp is composed of the peptidase (pos. 118-200), RING CC Zn-finger (pos. 300-375), reverse transcriptase, and integrase CC (pos. 1414-1600) domains. XX FH Key Location/Qualifiers FT CDS 93..5261 FT /product="BEL5_AGp" FT /note="Bel-like protein: composed of the peptidase FT (pos. 118-200), RING Zn-finger (pos. 300-375), FT reverse transcriptase, and integrase (pos. FT 1414-1600) domains" FT /translation="MAEARKVKSLRIQQTATQLSIQAIHTFSKSYNKETQR FT AEAIIRKQNLQKQYAKFIDLQDELLPLDEGNEAENLALRQTVESAYYDAEA FT NLASAVEANDKKPCVPASNIKLPDVKLPVFDGKPQNWSSFHSIFVAMIDSA FT ELYSGVQKLYYLRTSLSGPALQLIQSVPISEENYSVAWNLLLNHYNHPKRL FT KQLHVEALFEDAALKKECANELRKLIENFEANVNALTQLGEPTAQWDTLLI FT QMLSRQLDPSTLRSWKEHSAEKKIDSYSDLVAFLYRRVGVLEVLPSTSSGK FT PPKQRVFATTTMPNNNTKGCACCNRDHPVYTCDEFKKLSLNAKQKVITQHK FT LCYNCLRPGHHLRDCKSASTCKSCHKRHHTQLCSLPQPSPTVPSSEEDQRD FT PPTTLASTSVVESITCASAGQHKTVLLATATVIIVDDEGHKHNARVLLDSG FT SESCFITENLAQQMKSTRERSNLCISGISSTNTTAKQSIRATLRSRVGRYF FT ANLQFFILPRVTGNLPSSSIDTTGWNLPDNIFLADPHFDCIGRIDVLIGAE FT VFFDIMRPAGRILLGKDQPVLVNSELGWIVSGPAVSTLTPTSQASSITVNH FT ASTTVQDVHKLMERFWXIEEGEIVNAQSNEHAACEEHFRRTVSRNSSGRYV FT VRLPLKEHLLTKLVDNRKAAVRRFHFLQSRLSSNNEFRNSYSSFIDEYAEL FT GHMKRISEEEYNNTNQHHYYLPHHAVTRQESLTTKLRVVFDASSKTSSGIS FT LNDVLMVGPTIQDDIRSIIMRARKHSIMIVADIKMMYRQVLVDXRDTSLQL FT IVWKPTPSESLQTYKLCTVTYGTASAPYLATRVLSQLADDEGSSYPIAAKV FT LKKDFYVDDLLTGTSTAAEASEVITQLTALVSKGGFTLRKWATNDEQVRQT FT ISKDKLSEDESFCFDRDQIIKTLGLHWHPLNDSMTYRIKPFEEKLITKRST FT LSGIARLFDPIGLIGPVVTKAKIFMQSLWTLKASDGSIWNWDTELPEQLQK FT QWLSFKNELNLLNTIQIQRCVLLNEATSVQLHIFADASQTAYGACAYLRST FT NKAGQIKTSLVASRSRVAPLKSQSIPRLELCSALVASELYKSIQQAMQLDA FT DIYFWLDSTVALYWIQASPSKWNTFVGNRVSKIQQATSSCTWSHINGQENP FT ADHISRGLTASELVNCDLWWNGPPWLQLDQEQWPKPQLSSQLPSEVSIEGR FT 
SITTTAAATKSPPCDAILLVNELLSKFSDYHKLLRVIAYCFRMRTKRDSSQ FT ETVVISTDELWNAEIRILQVVQRDIFEKEWTQLRQNSPVSNKSRLKWFHPF FT LCDDNLIRIGGRLSKSNQPFESKHQILLSAGHPLAEMLIRHLHKKHHHAAP FT QLLITILRQKYWVIGARSLAKRICHECVPCCRARPRLLEQFMAELPTSRIT FT PSRPFSIVGVDYWGPIGLSPIHRRASSGKAFVAVFICFATKAVHLELVANL FT TTAKFIQAFRRFVARRGLCSDIHSDNGRNFLGASRELRALVTSKQHRIEII FT QECTSQGMRWHFNPPKASHFGGLWESAIHSAQKHFFRVLGKTTLPQDDMET FT LLCQIECCLNSRPLVPLSDDPSDLEPLTPGHFLVGSNLKAVPDNKLEDIPS FT NRLKHYQLVQKLLQQIWTRWSAEYLATLQPRSKWLKPPVKIDVGQLVLVKD FT ESTTPLHWPLGRIIKTHPGDDGVARVVTLKTASGEYTRPIAKLCLLPVTSM FT VQN" XX SQ Sequence 5282 BP; 1604 A; 1294 C; 1086 G; 1296 T; 2 other; ttttggtcct tcgaatcgcg gatcgacgat tgatcatcat ttagttcttc gagcaacgaa 60 ttgtggtcat tcgaagagtg taaacaaacg aaatggcaga ggcacgtaag gtaaagtcgc 120 tgcgtatcca gcaaacggcc acgcaactct ctatacaggc aattcataca ttttccaagt 180 cttataacaa agaaacgcaa agagcagagg caattattcg caaacaaaac ttacaaaaac 240 aatatgcaaa attcattgac ctgcaagatg aactgttgcc cctagatgaa ggcaatgaag 300 cggagaatct tgcgcttcga caaaccgtgg aatcagcgta ctatgatgct gaagctaatt 360 tagctagtgc tgttgaagca aatgataaaa aaccatgtgt accagcatct aacatcaagc 420 taccggacgt gaagcttccc gtcttcgatg gcaagccaca aaattggtct agctttcatt 480 cgatctttgt cgcgatgatc gatagtgcgg agctgtattc aggcgtacaa aagttgtatt 540 acctgcgtac atcgctatcc ggcccagcac tccagctaat acaaagcgtt ccaatcagcg 600 aagaaaacta ttccgtggca tggaatctgt tgctcaatca ctacaaccac ccaaagagat 660 tgaagcagtt gcatgtggaa gcattatttg aagatgctgc gctgaaaaag gaatgtgcaa 720 atgaactacg caaactgata gaaaatttcg aagccaatgt aaatgcatta acccaattag 780 gcgaaccaac tgctcaatgg gatacgcttc taatacaaat gcttagccgt cagctcgatc 840 cgtcaacact acgaagctgg aaagaacatt cggcagaaaa gaaaatcgat tcgtatagtg 900 atttggttgc gtttctgtac cgtcgagtag gagtgttaga agtgttgcca tcaacgtcat 960 caggtaaacc acccaagcaa cgtgtatttg caacaaccac aatgccgaac aacaacacca 1020 agggttgtgc ttgttgcaac agagaccatc ctgtgtacac gtgcgatgag tttaaaaaac 1080 tatccttaaa tgcaaagcaa aaggtcataa cacagcacaa attatgttat aattgtcttc 1140 gtcctggtca tcatctacgt gactgcaaat ctgccagcac 
ctgtaagagt tgtcacaagc 1200 gccatcatac acaattgtgt tctttaccac aaccctcgcc tactgtaccg tcatcagaag 1260 aagatcaacg agatcctccg accacgttag catcaacatc ggtcgtcgag tcgatcacat 1320 gtgcttcagc aggtcaacat aagacagtcc tcctggccac tgccactgtc ataatcgttg 1380 acgatgaagg ccacaaacac aacgcacgag tgctgctaga ttcaggaagt gagagttgtt 1440 ttatcactga aaatctagcc cagcaaatga aatcaacaag ggagagaagc aatctatgca 1500 tctctggaat cagttccacc aacacaaccg caaagcagag catccgagcg acacttcgct 1560 cgcgggttgg gcgatacttt gccaacctgc agttcttcat actaccaaga gtcacgggga 1620 atcttccatc gtcatcgatc gacaccacgg gatggaacct gcctgacaac atctttcttg 1680 cggaccctca cttcgattgc atcggccgaa ttgatgtctt gatcggcgcg gaggtctttt 1740 tcgacattat gagaccagct ggacgaatac ttctcgggaa ggatcaacca gtccttgtca 1800 actcggagct cggatggatc gtatcggggc cagccgtaag tacactcaca cctacttcac 1860 aagcttcttc tattacagtc aaccatgcat ctacaacggt tcaagatgtt cataaactta 1920 tggaacggtt ctgggmaata gaggaaggtg aaatagtcaa cgcacaatct aatgaacatg 1980 cagcgtgcga agaacacttt cgtcgcaccg tttcacgaaa ttcttccgga cgctacgtcg 2040 tgcgtcttcc actgaaagaa catcttctca caaaactagt tgacaatcgc aaggcagcag 2100 ttcgtagatt tcattttttg caatcccggc tcagttctaa caacgaattt agaaacagct 2160 acagctcatt tatcgatgaa tacgctgaac tggggcacat gaagcgcatt tcggaggaag 2220 aatataacaa cacaaaccaa catcattatt accttccaca ccatgcggtg acgcgtcaag 2280 aatcactaac aaccaagttg cgcgttgtct ttgacgcctc cagcaagaca tctagcggca 2340 tatcgttgaa cgatgttttg atggtagggc caacaattca ggacgatatc cgatctatca 2400 tcatgagagc acgtaaacat tcgatcatga tagtcgctga tattaaaatg atgtaccgtc 2460 aagtgctcgt agatgmtcgt gatacatcgt tacaactcat cgtatggaag ccaacaccgt 2520 ccgaatcact gcaaacatac aaactgtgta ccgtcacata cggtacagct agtgcaccct 2580 atttagcaac acgagtattg tcacaacttg ccgatgacga aggcagtagc tatcctattg 2640 cagccaaggt attgaaaaaa gatttttacg tcgatgattt acttaccggc acttccaccg 2700 cagccgaagc ttccgaagta attacacaac taactgcgct tgtttccaaa ggtggtttta 2760 ctctacgtaa atgggccacg aatgatgaac aagttcgcca aaccatctca aaagacaaac 2820 tttcagaaga cgagtcattt tgtttcgatc gcgaccagat tatcaaaact 
cttggtttgc 2880 attggcatcc attgaatgac tccatgacgt atcgcatcaa accatttgaa gaaaaactga 2940 tcacaaaacg ctcaacgtta tcgggaattg cacgattatt cgatccaatc ggccttatcg 3000 gaccagtagt cacgaaggca aaaatattca tgcaatctct ctggacactt aaggccagcg 3060 atggctcgat atggaactgg gatactgagc ttccagagca gctccagaaa caatggctat 3120 catttaaaaa tgagctcaac ttacttaaca caatacaaat acaacgatgc gttctactaa 3180 atgaagctac tagcgtccaa ttacacattt ttgccgacgc atcccaaaca gcatacggcg 3240 cttgtgccta cttgcgctca actaacaagg caggccaaat caaaacatca ctagtagcat 3300 ctcgatcgcg ggtcgcgcct cttaaatcac aaagcattcc cagactagaa ctatgcagcg 3360 ccctcgtagc aagtgagcta tacaaatcta tccagcaagc tatgcagcta gacgccgata 3420 tctatttttg gcttgacagt acagtcgctc tttactggat tcaagcatca ccgtcgaaat 3480 ggaacacctt tgtcggcaat cgtgtgtcca aaatacaaca agccactagc agttgtacat 3540 ggagccacat aaacggccaa gaaaatcctg cagatcatat ttcacgggga ttaactgcaa 3600 gcgaactcgt caactgtgac ctttggtgga atggcccgcc atggctccaa ctagatcaag 3660 agcaatggcc caaaccccaa ctgtcatctc aactaccatc agaggtttca atagaaggtc 3720 ggtcgatcac tactacagct gctgccacca aaagtccccc atgcgatgca atactgttgg 3780 taaacgagtt gctatccaag ttttctgact accacaaatt gctacgagta attgcatatt 3840 gctttcgaat gagaactaaa cgtgattctt cgcaagaaac tgtcgttatt agcacagacg 3900 aattatggaa tgctgaaatc agaatcttgc aagtggttca aagagatatc tttgaaaaag 3960 aatggactca gctccgtcaa aacagtcctg tttccaacaa atccagacta aaatggttcc 4020 atccgtttct ttgcgatgat aatcttatcc gcattggtgg acgtttatca aaatctaatc 4080 aaccatttga aagcaaacat caaatattgt tgtcagcagg acatcctctg gcagaaatgc 4140 tgattagaca cttgcataaa aaacatcatc atgccgcacc gcaactattg atcaccatcc 4200 ttcgtcaaaa gtattgggtc ataggggcca gatccctagc taaacgcatt tgccatgaat 4260 gcgtaccttg ttgtcgtgct cgtcctcggc tgctagaaca atttatggca gaactaccaa 4320 cttcacgcat aacaccaagc cgaccattct caatagtggg agtggattat tggggtccca 4380 tcggtctatc acccattcat cgccgtgcat catctggtaa agcatttgta gctgtcttta 4440 tctgtttcgc tacaaaggct gttcatctcg agctcgtcgc aaaccttacc acagccaaat 4500 tcatccaagc atttcgtcgt ttcgttgctc gtcgtgggtt atgcagtgac attcacagcg 
4560 acaacggacg gaacttccta ggcgcctcca gagaactgcg agcattggtg acgagcaaac 4620 aacatcgaat tgaaatcatc caagaatgca cgtcacaagg aatgcgctgg catttcaatc 4680 caccgaaagc gtcacacttt ggtgggctat gggagtcagc aattcattct gctcagaagc 4740 attttttcag ggtccttggc aaaacaactc tgcctcaaga tgacatggaa acgttgcttt 4800 gtcaaattga atgctgcctc aactcacgtc cactggttcc tctcagtgac gatccgtctg 4860 acttagagcc actaacacca ggccattttc tggtcggaag taatctaaag gcagtccctg 4920 ataacaaatt agaggatata ccatccaatc gtcttaaaca ttaccaactg gtacaaaagc 4980 ttctacagca gatttggaca agatggagcg cagaatattt ggcaactctt cagcctagga 5040 gcaaatggct caaaccacca gtaaaaatag acgttggcca acttgttttg gtcaaggacg 5100 agtcaactac cccattgcat tggccgctag gacgcatcat caaaacacac ccaggcgatg 5160 atggagtagc acgagttgtg acattgaaga cagcttctgg cgaatacact cggccaattg 5220 cgaagttgtg tcttcttcca gtaacttcaa tggttcagaa ctaacgtctg aaggggccag 5280 ta 5282 // ID BEL9-LTR_AG repbase; DNA; ANG; 263 BP. XX AC . XX DT 03-APR-2003 (Rel. 8.03, Created) DT 03-APR-2003 (Rel. 8.03, Last updated, Version 1) XX DE BEL9-LTR_AG is a long terminal repeat of the BEL9_AG LTR DE retrotransposon - a consensus sequence. XX KW BEL; LTR Retrotransposon; Transposable Element; 5-bp TSD; KW BEL9-I_AG; BEL9-LTR_AG; BEL9_AG; Bel clade; KW reverse transcriptase. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-263 RA Kapitonov V.V., Pavlicek A. and Jurka J.; RT "BEL9_AG, a nonautonomous family of Bel/Pao-like LTR RT retrotransposons from African malaria mosquito."; RL Repbase Reports 3(3), 48-48 (2003). XX DR [1] (Consensus) XX CC BEL9-LTR_AG flanks an internal portion of BEL9_AG (deposited as CC BEL9-I_AG). 
XX SQ Sequence 263 BP; 99 A; 48 C; 39 G; 77 T; 0 other; tgttggaatt tggataatta ttttcgaact taaatttgaa ttgtacataa aacaaagcgt 60 taggaaacta agattaggtt ataaactact cgacgttaga aactgctcga tctacacgaa 120 ttatgaactg catgaaatat tatatacaaa aaaggacaaa cgaatataca gttgcaaacc 180 gaccagcgat aaaggtgtac actttcctat tcaaatttcc tccaaatatc acagccagcc 240 ccttttcgtg aatatatttc aca 263 // ID GYPSY26-I_AG repbase; DNA; ANG; 3921 BP. XX AC . XX DT 05-FEB-2004 (Rel. 9.01, Created) DT 05-FEB-2004 (Rel. 9.01, Last updated, Version 1) XX DE GYPSY26-I_AG is an internal portion of retrotransposon GYPSY26_AG DE - a consensus sequence. XX KW Gypsy; LTR Retrotransposon; Transposable Element; 5-bp TSD; KW AP protease; GYPSY26-I_AG; GYPSY26-LTR_AG; GYPSY26_AG; KW Gypsy clade; RNase-H; gag; integrase; mag lineage; KW reverse transcriptase. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-3921 RA Tubio M.J., Costas C.J. and Naveira F.H.; RT "GYPSY26_AG, a member of the mag lineage of the Ty3/gypsy group RT of LTR retrotransposons in Anopheles gambiae."; RL Repbase Reports 4(1), 17-17 (2004). XX DR [1] (Consensus) XX CC GYPSY26_AG is a family of Gypsy-like LTR retrotransposons CC that, according to the amino acid sequence of its reverse CC transcriptase, is CC phylogenetically grouped with representatives of the mag CC lineage of other organisms. CC GYPSY18_AG, GYPSY19_AG, GYPSY20_AG, GYPSY21_AG, GYPSY22_AG, CC GYPSY23_AG, CC GYPSY24_AG, GYPSY25_AG, GYPSY27_AG and GYPSY28_AG are other CC members of this same lineage in Anopheles gambiae. CC The GYPSY26-I_AG consensus was reconstructed after multiple CC alignment of 6 copies. CC The consensus encodes the 1278-aa GYPSY26_AGp gag-pol like CC protein CC (pos. 36-3869). CC The sequence of the LTRs flanking GYPSY26-I is deposited as CC GYPSY26-LTR_AG. 
XX FH Key Location/Qualifiers FT CDS 36..3869 FT /product="GYPSY26_AGp" FT /translation="MANDEEPRPTTSAANAANATNNNNTAEAPRTITHANF FT AFDAFDKAKCKWSRWVERIETAFEIYAIKDVRVKRNLLLHHMGGEAYDVLC FT DKIAPKRPRECEYKEVIATLEEYFHPEPLEISENFRFKCRRQGDKDAHSAE FT ESVDEYLVALRKIATTCNFGQYLSTALRNQLVFGLKRNDIRNRLLEKRNLT FT LEEARDIAVGMELSKKSSAEIENGGPSQDVHAVQKRNEKHNHKEAKANKSL FT PKNVTCYRCGDAKHVANKCKHINTICNFCRKKGHLAKVCMKKQSQNEVCTV FT ISLNNIVPKWWIDLTVDNITMRFEVDTGSPLTIIGKQCYEKYFKNKPLQKC FT NVELVSYTNNAIEVLGTIKVKVDNEMLPLYVVNLMKRPLLGREWLNAVPDW FT NNRLQVNEVNEMAINTNSLNTILQKYADVFDPNLGKISGVQAHLTLKENAQ FT PIFLKARRVPFNLIDAVDQELDKLVAEDVLEEVPTSKWATPIVPVRKAEGK FT VRICGDYKQTVNQKLQVDQHPLPTVEELFASLAGGKRFSKIDLVQAYLQME FT VAPEDREMLTLNTHRGLFRPKRLMYGIASAPAIWQRQMEGILHGISGVSVF FT LDDIKITGPNDQIHLQRLEEVLKRLNDRNIRLNKEKCLFFAKQIDYCGYTI FT DEHGIHKMRDKIEAIANMRRPSNKDEVRSFVGLVNYYGRFMRDLSTILYPL FT NNLLKNDTPFKWNKEQEKSFQKVKEHIQSNECLVHYSPELPLLLATDASPY FT GVGAVLSHVYPDGTERPIQFASQTLNKVQQKYMHVDKEAYAIMFAVKKFFQ FT YLYGRQFTLVTDNQAIAKILGEHKGIPVMSALRMQHYATYLQAFDYKIRFR FT KAADNANADTLSRLPLSCQDTTTFFEETDGIEINQIETLPLTVKELSQATA FT EDLTVKTLIQGITHGKQVPNENRFGIEQQEFTVQQGCLLRGVRVYVPAKLR FT ARVLEELHSTHFGVTRTKSLARGYCWWPGIDIAIETMIKNCAECQSTRPEP FT SKVPLHCWEKPTAPFERVHVDFAGPFMDTYFFIMVDAYSKWPEIRICKSTT FT AEKTVQMCREIFSSHGLPSVLVSDHGRQFTSDVFQRFLKMNGIVHKMGAPY FT HPATNGQAERYVQTFKQKLKALKCSKSELHLSLCNILITYRKMIHPSTGKT FT PSQLIYGRQIRSRIDLMLPSNEVHHGGNLHTKQFSDGDRVRVRDYLSSDKW FT KFGRVMEKLGKLRYSVRLDDGRVWERHTNQMMGVGEDLPDSNNNLTDAQPT FT SSQDAVNLRRSSRLQGR" XX SQ Sequence 3921 BP; 1343 A; 766 C; 876 G; 936 T; 0 other; aagtggcgac gagggacgga gtaaaaaccc gcagtatggc gaacgacgag gaacctcgac 60 cgactacaag tgctgcaaat gctgcaaatg ctacaaataa caacaatact gctgaggccc 120 cccgcaccat cactcatgca aatttcgcat tcgacgcctt tgataaggca aaatgcaaat 180 ggagtcgatg ggtggaaaga atcgaaacag cgtttgagat atacgctata aaggatgtaa 240 gggtcaagcg aaatcttctt cttcaccaca tgggaggtga agcatacgat gtgctgtgcg 300 acaaaatagc tccgaaacga ccacgggaat gcgagtacaa agaggtgatt gctacgctgg 360 aagaatattt ccatccagaa ccattggaaa ttagcgaaaa 
tttcaggttc aagtgccgac 420 gtcaaggtga caaagatgcc cattcagcag aagaaagtgt ggacgaatat ttagtggcat 480 tgcgaaaaat tgccacaaca tgcaatttcg ggcagtattt gtcaaccgca ttgcgaaacc 540 aattagtatt tggtttgaag agaaacgata tccgtaatcg tttgcttgaa aagcggaacc 600 ttacgttaga ggaagcgaga gacatagctg ttggcatgga gctatcaaaa aaaagcagtg 660 ctgaaataga aaatggtggt ccatcacaag atgtacatgc agttcagaaa cgtaatgaaa 720 aacataacca caaagaagca aaagcaaaca aaagtttgcc aaagaacgta acctgttatc 780 gatgcggtga tgcaaaacat gtggcaaaca aatgcaagca tatcaatacg atttgcaact 840 tctgtcggaa gaaagggcat ttagcaaaag tttgcatgaa aaagcaaagc caaaatgaag 900 tgtgcacggt aatttcgtta aacaacatcg ttccgaaatg gtggatagat cttaccgtag 960 ataacattac aatgcgtttc gaagtggata caggatcacc gctaaccatt attggaaaac 1020 aatgctacga gaaatatttc aaaaacaaac ctcttcaaaa atgcaatgtc gaactcgtga 1080 gctatacaaa caatgcaatc gaagtgcttg gcacaataaa agtaaaagtg gataatgaaa 1140 tgcttccgct ctatgtggtg aatctcatga aacgaccatt acttggtaga gaatggttaa 1200 atgcggtgcc cgattggaat aatcggttgc aggtaaacga agtaaatgaa atggcaataa 1260 atacaaacag ccttaatact attttgcaga agtacgccga cgttttcgat ccaaatttag 1320 gcaaaatatc gggagtacaa gcacacctga ccctcaaaga aaacgcacag ccaatcttct 1380 tgaaagcacg acgtgtgccg ttcaatttga tcgatgctgt agatcaagag cttgacaagc 1440 ttgttgcaga ggacgtgcta gaagaagtac caaccagcaa atgggcaacg ccgattgtac 1500 cggtgcggaa ggcagaaggt aaagtacgaa tttgcggaga ttacaagcaa actgttaatc 1560 aaaagttgca ggtagatcaa cacccattac caactgtaga agagttgttt gcgtccctag 1620 ctggtggtaa acggttttca aaaatcgatt tagtacaagc ttatctgcaa atggaggtag 1680 cacccgaaga tcgtgagatg ctgactctta atactcatcg aggactgttt cgtccaaagc 1740 gtttgatgta tggtattgca tcagccccgg cgatttggca acggcaaatg gaaggaatcc 1800 tgcatggaat ttcaggggta agcgtttttc tcgatgatat taaaatcact ggtcctaatg 1860 atcaaattca tttgcaaagg ctagaagaag tcttaaaacg actaaatgat agaaacatac 1920 gccttaacaa agaaaaatgc ctgttttttg ctaaacaaat agattattgt ggttacacca 1980 tagatgaaca tggtatacac aaaatgcgtg acaagatcga agcaattgct aatatgcgta 2040 ggccaagtaa caaagacgaa gtacgttcgt ttgtgggtct ggtcaattac tatggaaggt 
2100 ttatgcgaga tctgagcaca attctctatc ctttaaataa cttgctaaag aatgacactc 2160 catttaaatg gaacaaagaa caagagaaat catttcaaaa agtgaaagag catatccaat 2220 ctaatgaatg cttagtgcac tattcaccag aactaccact tttgttggca acagatgctt 2280 ctccatacgg agttggtgca gtattgagcc atgtatatcc cgacggcaca gaacgaccta 2340 ttcagttcgc gtcacaaact ctgaataagg tacaacaaaa gtatatgcat gtagataaag 2400 aggcatacgc tatcatgttt gcggtgaaga aatttttcca atatttatac ggacgtcaat 2460 ttacactggt taccgataat caagcgattg ctaaaattct aggagagcac aaaggtattc 2520 cagtaatgtc agctttacga atgcaacact acgcaactta cttacaagca tttgattaca 2580 aaatacgatt cagaaaagcc gccgataatg ctaatgcaga taccctttca cgactgccat 2640 taagttgcca agataccact acattctttg aagaaactga tggcattgaa attaaccaga 2700 ttgaaacact tccacttaca gtaaaagaac taagccaagc aacagccgaa gatctaacag 2760 taaaaacact catacaaggt atcacacatg ggaagcaagt accaaatgaa aacagattcg 2820 gaattgaaca acaagagttt acggtacaac aaggttgctt gcttcgggga gtaagagttt 2880 acgttcctgc taaattgaga gcgcgagtac tagaggaact gcattcgaca cattttggcg 2940 taacaagaac aaaatctcta gcaagaggct attgctggtg gcccggtatc gacatcgcta 3000 ttgagacaat gataaagaac tgcgctgagt gccaatctac aagaccggaa ccatcaaaag 3060 ttcccttaca ctgttgggag aaaccaacgg ctccgttcga aagagttcat gtagactttg 3120 cgggaccttt catggatacc tatttcttta tcatggtcga cgcctacagt aaatggccgg 3180 aaatacgaat ttgcaaatcg actacagcag aaaaaacggt acaaatgtgt cgcgagatat 3240 ttagtagtca tggattacca tcggtattgg tcagcgatca cggaagacag ttcacttccg 3300 atgtgtttca gcgatttttg aagatgaacg gaattgtcca taagatggga gcaccatacc 3360 atccagctac taacgggcaa gcggaacgtt atgtgcaaac gttcaagcaa aagctgaagg 3420 ctttgaaatg ttcaaagtct gagttgcatc ttagtctatg caacatacta atcacttacc 3480 gcaaaatgat tcatccttcc acaggtaaaa caccatcaca actgatttac ggtcgtcaaa 3540 taaggtcacg tattgatctt atgctacctt cgaatgaagt acatcatgga ggaaatttgc 3600 atacgaagca attttcggat ggcgaccgtg taagagtgcg agactaccta tcgtcggaca 3660 aatggaaatt tggacgagtg atggaaaagt taggaaaact gcggtattcg gtacgtctcg 3720 acgatggcag agtatgggaa cggcacacta atcaaatgat gggtgtgggg gaggacctac 3780 
cagattcgaa caacaatctg acagatgcac aacctacaag tagtcaggat gcagtgaacc 3840 tgagacgttc gagcagattg caaggaagat gatttcaatg caatcatccg gatccttgtt 3900 ttcagtcgtc aaggggagag a 3921 // ID MARINERN3a_AG repbase; DNA; ANG; 357 BP. XX AC . XX DT 11-FEB-2003 (Rel. 8.01, Created) DT 11-FEB-2003 (Rel. 8.01, Last updated, Version 1) XX DE MARINERN3a_AG is a nonautonomous DNA transposon - a consensus DE sequence. XX KW Mariner/Tc1; DNA transposon; Transposable Element; Nonautonomous; KW MARINERN3_AG; MARINERN3a_AG; nonautonomous DNA transposon; KW mariner/Tc1 superfamily. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-357 RA Kapitonov V.V. and Jurka J.; RT "MARINERN3_AG: a family of nonautonomous mariner/Tc1-like DNA RT transposons from African malaria mosquito."; RL Repbase Reports 3(1), 7-7 (2003). XX DR [1] (Consensus) XX CC MARINERN3a_AG is a subfamily of MARINERN3_AG. CC The MARINERN3a_AG and MARINERN3_AG consensus sequences are 85% CC identical to each other. CC MARINERN3a_AG copies are ~95% identical to the MARINERN3a_AG CC consensus sequence. They are flanked by the TA target site CC duplications. CC This element has 23-bp terminal inverted repeats (1 mismatch). CC Classification: a nonautonomous Mariner/Tc1-like CC DNA transposon. XX SQ Sequence 357 BP; 106 A; 70 C; 69 G; 112 T; 0 other; cagtggagcg ccgattatcc gggctcttcg ggactcgacc tcgcacggat agtcgaataa 60 cacggataat gagtcaaagt atttatttta tcacaaaatc ctggtttgtt tttgaaatta 120 atattatgtt ttggtaaaac ctagccagtt ttataaactt catgtttttc cgatgagaat 180 atttgaaatg tgccatgaat tatagcgtat tttgttatga caaaatgctc tggtaaccgt 240 tatttcaaat atcctcctca gcacgcaatg tcatctatgt attgaactgt catttttcaa 300 cagcacggat aaacggacaa ccggataaaa ggtacccgga taaacggcgc tccactg 357 // ID Ag-R1-6 repbase; DNA; ANG; 5931 BP. XX AC . XX DT 29-OCT-2010 (Rel. 15.1, Created) DT 29-OCT-2010 (Rel. 
15.1, Last updated, Version 2) XX DE An R1 clade non-LTR retrotransposon family from Anopheles DE gambiae. XX KW R1; Non-LTR Retrotransposon; Transposable Element; Ag-R1-6. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-5931 RA Biedler J. and Tu Z.; RT "Non-LTR Retrotransposons in the African Malaria Mosquito, RT Anopheles gambiae: Unprecedented Diversity and Evidence of Recent RT Activity."; RL Molecular Biology and Evolution 20(11), 1811-1825 (2003). XX RN [2] RP 1-5931 RA Kojima K.K. and Jurka J.; RT "Non-sequence-specific R1 clade non-LTR retrotransposons from RT Anopheles gambiae."; RL Direct Submission to Repbase Update (24-SEP-2010). XX DR [2] (Consensus) XX CC [2] Consensus update. This consensus is generated from 10 CC sequences with >93% identity. It is the most similar to R7 CC elements but no sequence specificity is found. XX FH Key Location/Qualifiers FT CDS 533..1441 FT /product="Ag-R1-6_1p" FT /translation="MELRSRPGPVECSLGARSKKPAMPTVACKESRQPLSF FT RANTSSTARVTEASELRLQLDETRREKAELQALILQLQTQLESMRQQLVES FT SERVDRQAKEAREDSLRRENFLREEVDRRAKEMREEARKDRELIAKMMQRQ FT REQPNTQERQTRKQQDQQCQQQSSQPSSQQQKPHRQVVSNEVNEGASTSAG FT IVEDPFTEVVKRGSRRWQEIRAERRQQQQHQQQLQTTSIAGGRLSVDDHQS FT LHQKNLQQQRREQLNKEQHRPARKQPDKIFVAPAAGVTYFTLYQKVRLNPN FT LPSAKAVLHVS" FT CDS join(1450..3633,3542..5665) FT /product="Ag-R1-6_2p" FT /note="apurinic-like endonuclease and reverse FT transcriptase."
FT /translation="MGEVTIPNIDMLATEADIRKALQIALEKEAILASINV FT WEQRDGSLRARVKLPRSDANYLIDKRLIIDFSSCKVREAPKPSAESRRCFR FT CLERGHMVRECQGTDRSSLCIRYGAADHKAANCTNDVKCLLCGGPHRIAAA FT SCAVTSRPCLMDILQININKCRIAQDLALNTMRVEKADVLLLSELYAVPQN FT NGNWVVDRDRSVAIVTSGVRYPIQRIRSVTVPGIVVADVNGITIVCCYVKP FT NVSVRQFEEIMERIDILIRGHPRVLLAGDLNAWHTAWGSERTKPKGIALLQ FT LVNDLGLEVLNIGTSHTFRGCGSARPSKIDVFFASPSICRPDLAANPATCW FT RILSRYSYSDHFYIRYTVGEQPPREQQSGTARRQRTAVRLAGTRWNTRHFD FT PTLFESALQSTRFEDRATHAKSLIESLTRACDETMLRVFPSQDHTGRPAYW FT WTPAIEELVNECRIAEQQRLASPTDPDISALDRQARHALKTAIKASKKQFF FT DRMLQALHDDETGQCIRKVLYRLQPSRTAQERDPVVLESVVSTLFPDHPPA FT EWPTEDEHEVGNRVPLREVTDLXLQIIASGMQXRKAPGLDGVPNAALASAI FT KNYPGSFRQVYQECLDRSCFPEQWKKQRLVLLLKPGKPPGDPSSFRPICLL FT DNAGKTFERLLLDRLNEHLEDSDNPQLSEYQYGFRRGRSTLLAIQQVVNAG FT RRAMSFGRTNNRDRRCLWGCRARRAQRVMRADEPCRSGAQTTVIGAASGVV FT ALDVRNAFNTASWQCIAEALRDKGVPLQLRNILQDYFTDRELTYDTTDGPV FT TRRVSAGCPQGSILGPTLWNVGYDGVLRLELPEGAQAVGXADXLAILAAGT FT TPEHAAAIAEEAVERVHAWMRQHHLQLAPEKTECVMISSLRRGHPEIPIRV FT CGIEIRSKQAIRYLGVMIYDHLLWRPHVTMAVEKANRVVKVVTNVMRNHSG FT PRVAKRRLLAGVSEAIIRYGAPIWAEATDRQWCRRMLASVQRPLAQRVVCG FT FRSMSYSVAVLMARLIPHHLLIKEDARCHQRYFADPEASRAVIRREERAVT FT LEVWQREWDANASNPGASRYARWAHRLIPEVHSWMAQKRGEVGFHLAQILS FT GHGFFREFLHVCGFAPSPECPECTGSVESVVHVLFHCPRFAEVRRDLLEMG FT VDGSITEENLGQMLLKSSDSWIRIQEVACRITTVLQVRWREDEAALNTLAN FT YTSAEATVNAEQMQDDPAAARRAARNRQARARRAQERVEQLGDVEXASAFR FT QLYGPAASDATSSIPAAVPLLLEATATVDPTPVAAAVAVAEATTATSPAAE FT AAPRVAAEVAPFSATTGMAPPSSSSSPNLSPRRRRQAQRQARAHMPPGRQQ FT NGRAVPSAEELERRRQRQREMERARRQRQRRARDSQAVTIPFSPLPSDEMQ FT ERRSTLTADEEAAISMXATSGR" XX SQ Sequence 5931 BP; 1611 A; 1430 C; 1689 G; 1191 T; 10 other; gagtgggggt tcgacgggta gctaaaccaa ccaagtgtaa gcgggaatac gctccagaga 60 tatttcggat aaacattatg aaacgtgtga aagggcgcgt tggaagagtg tttgttgcta 120 aataagatgg aacactgtaa caaataagtg gagtttcttg ggaatacgct ggagaagtgt 180 gagaaaagac gtataaaata agcgagcaat actgcgagtg gacgataata atgacaagtg 240 acagcagggc cgtcccgttg aacattttat aatgcatgaa gcacgattta aatctaccct 300 cgaatggaag 
tgttttccga ccaaaggcta acctcgtcca agcggtacca acgatctttc 360 tgtgcatcgg gataaataag tgtggggcag tactgagttg aagtgaatta aaaaaaaaat 420 aataataata atataaaaaa aggaaaaaga ataataatta ttattatatt acataaataa 480 taataaaaat aaaaaaagct catcaggttc aatccaggtt accactaaca tcatggaact 540 ccgatcccgt ccaggcccgg tagaatgcag tctcggggca aggtcaaaga agccggcaat 600 gcctacagtg gcatgcaagg agtcacgaca gccactgtct ttccgagcta acaccagcag 660 cacggcgcga gtaacagagg cgagcgagct gagactacag ctcgatgaga cgcgcaggga 720 gaaagctgag ctgcaggctc tcatattgca gctccaaact caactggaat ccatgcgcca 780 acagttggta gagagctcgg agcgtgtgga ccgacaggcc aaagaagcac gcgaagactc 840 cttacgtcgc gagaactttc ttcgagagga agtagatcgg cgagcaaagg aaatgcgtga 900 ggaagcaaga aaagatcgag aactcattgc caagatgatg cagcgccagc gcgagcagcc 960 gaatacacaa gaacggcaaa cgcgaaagca gcaggatcag caatgtcagc aacaatcgtc 1020 acagccatcg tcacaacaac aaaagccgca ccgccaggta gtgagcaacg aagtgaatga 1080 aggagcatca acgtcggcag gtatagttga ggacccattt actgaggtgg ttaaacgagg 1140 gtcgcgtcgc tggcaagaaa ttcgggcaga gcgccgacag caacagcagc accaacagca 1200 attgcagaca acctcgattg cgggtggacg gctatcggtt gatgatcatc agtcactgca 1260 tcaaaagaat ctgcagcagc agagacgcga gcagctcaac aaggagcagc atcgtccagc 1320 ccgtaaacaa ccggataaaa tcttcgtggc gccggcagct ggagtgacgt actttaccct 1380 gtaccagaag gttcggctaa atccgaatct gccgtcggcg aaagcggtac tgcacgtgtc 1440 gtagcagata tgggggaagt cacaatacca aacattgaca tgctggctac ggaggcagat 1500 atccgcaagg cacttcaaat tgccttggaa aaagaggcga tactggcctc aatcaatgtc 1560 tgggagcagc gcgatggctc actacgagcg cgtgtgaaat tgcctcgtag tgacgccaac 1620 tatctcatcg acaagcggct tataattgat ttttctagct gcaaggtgcg cgaagcaccg 1680 aagccatctg ccgagtcgcg tcgatgcttt cgatgccttg aacgtggtca catggtccga 1740 gaatgtcaag ggacggaccg atctagtctg tgtattcgct acggggcagc cgatcacaag 1800 gcggcgaact gtaccaatga cgttaagtgc ctgttatgtg gcggcccgca tcgaattgcc 1860 gccgcctcct gcgctgttac atcaaggccc tgtctaatgg atatcttgca gataaacatt 1920 aataagtgca ggattgcgca ggaccttgcg ctaaatacga tgcgtgtaga gaaggcggac 1980 gtattgcttt tgtcagagtt atatgcagtc 
cctcagaaca acggaaactg ggtggttgac 2040 agggacaggt cggtggccat cgtgacgagt ggggtgcgct atccaataca acgtattcgc 2100 agcgttacag ttcctggtat cgtcgtagcg gatgtgaacg ggataacaat tgtatgttgt 2160 tatgtgaaac caaatgtcag tgtccggcag tttgaggaaa ttatggaaag gatagacata 2220 cttatccgtg gccatccgag agtgctgttg gcaggtgact taaacgcctg gcacacagca 2280 tgggggagtg aacgcactaa gccaaaaggc attgctcttt tacagctggt taatgatctg 2340 ggacttgagg tactgaatat cgggacatcc cacacattcc ggggctgtgg atcagcacgg 2400 cccagtaaga tagacgtatt ttttgccagt ccgtccatct gccgtccaga cctagctgca 2460 aaccctgcaa cctgctggcg gatcctgtca cgatactcct attctgacca tttttacatc 2520 cgctacacag ttggagaaca gccaccgagg gaacagcaat cagggacagc gagaagacaa 2580 cgcacagcag tgcggttggc cgggacgcga tggaacacac gtcatttcga tcctacactc 2640 ttcgagtcag cactgcagtc aacccggttt gaggaccggg ctacgcatgc taagagtctg 2700 attgagtcac tgacaagagc ctgcgatgag acgatgttgc gcgtgttccc atcacaagac 2760 cacaccggtc ggccagcgta ttggtggact ccggccattg aagagttggt gaacgagtgc 2820 cgtatcgcag aacagcagcg attagcttca cccaccgacc ccgatatttc agcattggat 2880 cgacaggctc gtcatgcgtt gaagactgcg attaaggcga gcaagaaaca gttctttgac 2940 cgaatgcttc aagcgttgca tgatgatgag acgggacagt gcatccgtaa ggtcctgtat 3000 cgcctacagc cctctcggac agcgcaagaa cgggatccgg tggtattgga gagcgttgta 3060 tcgacgttgt ttccggatca tcctccagca gagtggccaa ctgaggacga acatgaagtw 3120 ggcaacaggg tgccgcttcg tgaggtgact gatttgganc tgcagattat cgccagcggn 3180 atgcaacnta ggaaagcccc tggtctcgat ggcgtgccga acgctgcatt ggcgtctgcc 3240 attaagaact atccaggatc ctttcgacaa gtgtatcagg agtgcctaga caggtcatgc 3300 ttcccngaac aatggaaaaa gcagcggctg gtactcctgc tcaagccggg caagccaccg 3360 ggggatccat cgtccttccg cccgatctgc ctgctggaca acgcgggcaa aacgttcgaa 3420 aggttgttgc tggatcgctt gaatgagcac ctggaagatt cagataatcc acaactgtct 3480 gagtaccaat acggatttcg gcgtgggaga tcgacattgc tggcaatcca acaggtggta 3540 aatgcgggca gacgagccat gtcgttcggg cgcacaaaca accgtgatcg gcgctgcctc 3600 tggggttgtc gcgctcgacg tgcgcaacgc gtttaacacc gccagctggc aatgtattgc 3660 tgaggccctc agggataaag gagttccctt gcaattgcgt 
aacattctgc aggattattt 3720 cactgacagg gaactgacgt acgatacaac agatggtccc gtgacgcgtc gggtctcggc 3780 cggctgtcca cagggatcta tattgggtcc tacgctttgg aacgtcggat acgacggcgt 3840 gttacggttg gagcttcctg aaggtgcaca ggcagttggg wtcgcagacg awttggcaat 3900 tttggcggca ggaaccacgc cagaacacgc ggcagcaatt gcagaggagg cagtagaaag 3960 agtgcacgca tggatgcgac aacatcacct gcaattagca ccagaaaaga cggaatgcgt 4020 catgatttcc agcctccgtc gagggcatcc agagatcccg ataagggttt gtggaattga 4080 gatccgctcc aagcaggcga tacgctatct gggggtcatg atttatgacc accttctttg 4140 gcgaccacac gtgacgatgg cagtggaaaa ggccaaccgc gttgtgaagg tcgtgaccaa 4200 cgtaatgagg aaccacagcg ggccccgggt agctaagcgg agattgcttg ccggggtgtc 4260 tgaagcgatt attcgctacg gtgcacccat ctgggctgag gccacggatc gccagtggtg 4320 ccggcggatg ttggcaagcg tccagcgacc tctggcgcag cgagtcgtat gcggattcag 4380 atccatgagc tacagcgtag ctgtactcat ggcaaggctc atccctcatc accttttgat 4440 aaaggaagac gcgagatgcc accaacggta cttcgccgac ccggaagcaa gtcgtgctgt 4500 catccgacgt gaggaacgcg cggtgacgtt ggaggtatgg caacgggagt gggatgcgaa 4560 cgcatcgaat ccaggagcca gccgttacgc acgttgggca cacagattga tccccgaggt 4620 gcattcatgg atggcacaga agcggggtga ggtgggcttt catcttgccc agatactctc 4680 gggacacgga tttttccgag agtttctgca tgtatgcggc ttcgctccat ccccagaatg 4740 cccggaatgc acaggctcgg ttgagtcagt ggttcacgtg ttgttccact gtccgaggtt 4800 tgcagaggtt cggcgcgacc tactggagat gggagtggac ggttctataa cagaggagaa 4860 cctcggacaa atgttgctga aaagttcaga ctcttggatc cgcattcagg aagtagcatg 4920 ccgcatcacc acggtgctgc aagtgcgctg gagagaagat gaggcggcgc tgaatacgct 4980 tgctaattac acatccgcag aggcgacagt caacgctgaa cagatgcagg acgacccggc 5040 tgcagcccga cgagcggccc gtaaccgtca agcaagggcc cggagagcgc aggaacgcgt 5100 ggagcagctt ggggacgtgg aattkgcgtc ggcgttcagg cagttgtacg gtccagcagc 5160 cagcgatgcg acatcatcaa ttccggcagc agtgcctttg ctattggagg caacagcaac 5220 agtagaccct acaccagtag cagcagccgt agcagtagca gaagcaacaa cagcaacatc 5280 accggcagct gaggcagcac cacgagtagc agctgaggtt gcaccctttt cagcaaccac 5340 cggaatggcg ccgccgtcat catcgtcatc accaaatctc tctcccagac 
gtcggagaca 5400 agcgcagcgg caggccagag cacatatgcc ccccggcaga caacagaatg gtagagcagt 5460 gccttctgca gaggagctgg agcgtcgtcg ccagcgacaa cgagaaatgg agagagcccg 5520 acgacaacgg cagcggaggg cgagggacag tcaggccgtc acgatccctt tttcgccatt 5580 accatcggac gaaatgcaag aaaggagatc tacactaaca gcagatgagg aagctgctat 5640 cagcatgsag gcaacatcgg ggcgttaaac aatccgcagt atgtagggtg gcttagaaga 5700 gctacggaag gacagtaagg tctcacgggg aaaaaaaatc gaaggaatac gacacagtat 5760 cgtttttttt tgagttaaat taattaaatc actaataaag aaaggaaggt ccgtatggac 5820 ggttaaacct ggtgggtaaa tccctaacgg gtaattccca ctgggaggcg caagcagttg 5880 ggaacgaatg watattggaa ccaataaact gcttatttat atttaaaaaa a 5931 // ID GYPSY60-I_AG repbase; DNA; ANG; 4628 BP. XX AC . XX DT 09-JUL-2004 (Rel. 9.06, Created) DT 09-JUL-2004 (Rel. 9.06, Last updated, Version 1) XX DE GYPSY60-I_AG is an internal portion of retrotransposon GYPSY60_AG DE - a consensus sequence. XX KW Gypsy; LTR Retrotransposon; Transposable Element; 5-bp TSD gag; KW AP protease; GYPSY60-I_AG; GYPSY60-LTR_AG; Gypsy clade; RNase-H; KW integrase GYPSY60_AG; mag lineage; reverse transcriptase. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-4628 RA Tubio M.J., Costas C.J. and Naveira F.H.; RT "GYPSY60_AG, a member of the mag lineage of the Ty3/gypsy group RT of LTR retrotransposons in Anopheles gambiae."; RL Repbase Reports 4(6), 159-159 (2004). XX DR [1] (Consensus) XX CC GYPSY60_AG is a family of Gypsy-like LTR retrotransposons that, CC according to the amino acid sequence of its reverse CC transcriptase, RNase and integrase domains is phylogenetically CC grouped with representatives of the mag lineage of other CC organisms. 
GYPSY19_AG, GYPSY20_AG, GYPSY21_AG, GYPSY22_AG, CC GYPSY23_AG, GYPSY24_AG, GYPSY25_AG, GYPSY26_AG, GYPSY27_AG, CC GYPSY28_AG, GYPSY55_AG, GYPSY56_AG, GYPSY57_AG, GYPSY58_AG, CC GYPSY59_AG, GYPSY61_AG, GYPSY62_AG, GYPSY63_AG, GYPSY64_AG, CC GYPSY65_AG, GYPSY66_AG, GYPSY67_AG, GYPSY68_AG and GYPSY69_AG, CC are other members of this same lineage in Anopheles gambiae. CC The GYPSY60-I_AG consensus was reconstructed after multiple CC alignment of 3 copies. The consensus presents two ORFs. ORF1 CC encodes a 1109-aa GYPSY60_AG1p CC gag+protease+reverse transcriptase+RNase polyprotein (pos. CC 216-3542) and the ORF2 encodes a 423-aa GYPSY60_AG2p CC containing only the integrase (pos. 3346-4614). The sequence CC of the LTRs flanking GYPSY60-I_AG is deposited as CC GYPSY60-LTR_AG. XX FH Key Location/Qualifiers FT CDS 3346..4614 FT /product="GYPSY60_AG2p" FT /translation="KTGGHVTLPLAQIHLVITAGRKTIQRWEGCIFVGERV FT LIPEKLRKQCLSQLHRGHPGIQRMKAIARSYVFWPSLNEDIIDLVSNCHSC FT ALAAKSPAHADPLPWPKTTTPWERVHVDYAGPIDGDYFLVVVDAHTKWPEI FT IRTASITARITVSILRGLFARFGMPTTLVSDNGTQFTSGEFSEFCLSNGVH FT HITSAPFHPQSNGQAERFVDTFKRSLTKIRVGGAPLQEALDLFLQTYRTTP FT NPQLEQNKTPAEVMFGRPIRTCFDLLRPPRKIQHDFREGNERSFERDDLVY FT AKAYSRNNWHWVPGRVIRKRGNVTYEVLTGHRSVIRHINQLKRRGPFSSPG FT PMTERFNPLPLDVLLDSWSIPVTPSVLPDVYPTQLAPPVATPTEPPSLPPH FT RSIPQHQRSPRRSSRSRRAPRRFDPYLRY" FT CDS 216..3542 FT /product="GYPSY60_AG1p" FT /translation="MNSPGDQPPSEEQRRASQNWYSVPGAASMPQQPQQPS FT LVAVPPNLQSAMPSAYQNPAPSQNFVEGSSHQAASSHQATSSHQAASSEQM FT MQLIILMQKQISQLTLMQNNSATAKPPENVLTSAPFAEQVLDSLSHHIKEF FT HYEEEAKMTFASWFARYEDLFERDASRLDDSAKVRLLIRKLGAAEYERYAN FT FILPKSCRDFSFSETIKKLSSLFGVKESLLHRRYKCLNLMKRRSEDYLAYS FT CRVNRACVDFEIGKLTEEQFKCLIYVCGLRDEEDVEIRTRLLGKIDDRNDT FT TLESLTADCQRILNLKKDSAMIEKAATEQVFAVQKKTDEPKFSERQNFQSR FT KHEYKRPQTLPGRNCWLCGKNHWARDCPFKSNVCRVCKKKGHRDGYCPKPK FT RYSDSPTYRNKFSSRVVSVNTTNVQQRRKYINIFINNVSTRLQFDTGSDIT FT IINRNVWQRLGKPELKRTVSARTASGSGLFLLGEFEANVTIGSVTHVATLR FT VAQADILLLGTDLIDLFALGSTPMDAFCRHISSEDYSKTVERRFQEVFNGM FT
GLCTKASVKLQLKENVRPVFCPKRPVAYAVQELVDKELDRLEQMSIISPTD FT YSEWAAPIVVVRKANGSIRLCGDYSTGLNDALQQHEYPLPLPEDIFARLSQ FT CKIFSKIDLSDAFLQVEIDPAYRSLLTINTHRGLFTYNRLPPGIKIAPAAF FT QQLIDTMLAGVKGVSCYMDDIIVGGATEQEHEANLMAVLKRIQEYGFSIRS FT EKCAFKVQQLRYLGYIIDTHGLRPDPAKIDVIKRLPEPTDVSGVRSFLGAI FT NYYARFVPNMRELRYPLDNLLKTNAQFRWTADCKRAFERFKSLLSSNLLLT FT HYDPRHKIIVSADASSIGIGATISHMFPNGTIRVVQHASRALTKTEEGYSQ FT IDREGLAIIFAVTKFHKMIFGRRFQLQTDHRPLLRIFGSKKGIPVYTANRL FT QRFALTLLSYDFGIEYVRTESFGNADVLSRLINKHDKPDEDCVIASVALEM FT DVKSIATSALVTFPLSFREVARETNRDPIARKLHYYIKNGWPRNIALGADS FT SRYHSRKEDYSTVGRMYFCRRESVNTRETTQAMSVPASSRTSWYPAHEGDC FT A" XX SQ Sequence 4628 BP; 1362 A; 1022 C; 1086 G; 1158 T; 0 other; gtggcgacga ggaaaacaaa atagactttt gggagttttt tcgagtaaag tttacgtgtt 60 atgcgtgtgt cagtcattgc tataactctc gtgctagcgt cgcgcgtgtg tatgtgcgtc 120 gcgtgatcgt accgcgtctc gggtcgtatt taatttgttt tatcgggtgt tctgtgcgta 180 ctgcaacaat tccttcgtta cggaagcgga gaagaatgaa cagcccagga gatcagcccc 240 catcggaaga gcagcgtcga gcatcgcaaa attggtacag cgttcccggt gcagcaagta 300 tgccgcagca gccgcagcag ccatcattag tagcagtgcc accaaatcta caaagtgcaa 360 tgccctcagc ataccagaat ccagcgccat cgcagaattt tgttgaaggg tcatcgcacc 420 aagcagcgtc atcgcaccaa gcaacgtcat cgcaccaagc agcgtcatcg gagcaaatga 480 tgcaactgat cattttaatg cagaaacaga taagccaact cacattaatg caaaataatt 540 cggcaactgc caagccccca gaaaatgtat taacttccgc accttttgca gagcaagtac 600 tcgattcttt atcgcaccac ataaaggagt tccactacga ggaggaagca aaaatgacct 660 ttgcttcgtg gtttgctcga tatgaggatt tgtttgaacg agatgcttca aggctggatg 720 atagcgcaaa agtgcggctc ctcataagaa aactgggcgc agcagaatat gaaagatatg 780 ccaatttcat attgccaaag agttgtcgtg atttttcgtt cagcgaaacg ataaaaaagc 840 tgtcatcgtt attcggagtg aaagagtcat tgttgcatag gcgttataaa tgcttgaacc 900 taatgaaacg acgtagcgaa gattatctgg catattcttg tcgtgtgaat cgagcctgtg 960 tagattttga aattggaaaa ttaactgaag aacagttcaa gtgtctgata tacgtatgcg 1020 gactgcggga cgaagaagac gtcgagattc gaacacgtct acttggaaag atcgacgacc 1080 gaaacgacac aacactggaa agtttaacag cagactgcca gcgtattttg aacttaaaaa 
1140 aagacagtgc tatgatcgag aaagcagcaa ccgagcaagt tttcgctgtg caaaagaaaa 1200 ctgacgaacc aaagttttct gagcgccaaa atttccagag cagaaagcac gaatataagc 1260 gtcctcaaac cctacctgga agaaattgct ggctttgtgg aaaaaatcat tgggcgcgtg 1320 attgtccttt taagtcgaat gtgtgtcgtg tctgtaagaa aaagggtcat agagacggtt 1380 actgtccaaa accgaaacgt tattcagata gcccaacata cagaaacaag ttttcatcac 1440 gagtggtttc ggtgaataca actaatgttc agcaaaggcg aaaatatatc aatattttta 1500 taaataatgt aagcactcga ttgcagttcg acacaggatc tgatataacg atcattaatc 1560 gaaacgtatg gcaacgtctt ggaaagcctg aactaaaacg aacagttagc gcaagaacag 1620 catcaggaag cggactcttt ctgttaggag agtttgaagc gaatgttacc atcggaagcg 1680 taactcatgt ggctacttta agagtagcac aggcagacat actattgctc ggaacagact 1740 taatagacct gttcgcactt ggttctaccc ccatggatgc tttttgtagg catatttcat 1800 ctgaggatta ctctaagacg gttgagcgga gatttcaaga ggtcttcaac ggcatgggtc 1860 tgtgcacaaa ggctagtgta aaactgcagc tgaaggaaaa cgttcgtcca gtattttgcc 1920 caaagcgacc ggtagcttat gcagtacagg agttagtcga caaggaactc gatcgactag 1980 agcagatgag cataatatct ccaacggatt actctgaatg ggcagcacct atcgtcgtcg 2040 tccgcaaggc aaacggcagc attcggcttt gcggggacta ctcaactgga ctaaacgacg 2100 cgttgcagca acatgagtat cctttaccat tacctgaaga tatcttcgct agactatcgc 2160 aatgcaaaat atttagcaaa attgatctat ccgatgcttt cttacaggta gaaatcgacc 2220 ctgcctaccg atctttactc accatcaata cgcatcgagg tttattcacc tacaatagat 2280 tgccgcctgg tattaagatt gctccagcag cgttccagca gcttatcgac acaatgctgg 2340 ctggtgtaaa aggagtgtca tgttacatgg acgacatcat tgtcggagga gccactgaac 2400 aagagcatga agcgaattta atggcagttt tgaaaagaat tcaagagtat ggattcagca 2460 ttcgatcgga aaaatgcgct tttaaggttc agcaactaag atatttgggt tacatcatcg 2520 acacccacgg attgcgcccc gatccagcga agatcgatgt cataaaaagg cttcctgaac 2580 caacagatgt gagcggcgtc cgatcctttc taggagccat aaattattat gccagatttg 2640 tcccaaacat gagagagttg cgatatccac tggataactt attgaaaaca aatgcacagt 2700 ttcgatggac cgccgattgt aaaagagcgt ttgaaagatt taaatccctg ctatcatcga 2760 acttgttgtt aacgcattat gatccaaggc ataaaataat agtatcggca gatgcatcat 2820 
caataggtat tggtgccact attagccaca tgtttcccaa tggtaccata cgtgtggttc 2880 aacatgcctc tagagcactt acaaaaacag aagaaggata tagtcaaata gatcgtgaag 2940 gactggcaat catcttcgcg gtaacgaaat tccacaagat gatatttgga aggcgtttcc 3000 agcttcagac agatcatcgt ccactactgc ggatctttgg gtcgaaaaaa gggatcccag 3060 tctacactgc caaccgcttg cagcgttttg cgctcacgct attgtcatac gatttcggca 3120 ttgaatatgt acgcaccgaa tcattcggaa atgccgatgt actctccagg ctaattaaca 3180 agcatgataa accggatgag gattgtgtaa tcgcttctgt tgcactagag atggatgtaa 3240 agtctattgc aacaagcgct ttagttacgt ttcccttgag ttttagagag gtcgctcgag 3300 aaacgaatcg tgatcctata gcaaggaagt tgcattacta cataaaaaac gggtggccac 3360 gtaacattgc ccttggcgca gattcatctc gttatcacag caggaaggaa gactattcaa 3420 cggtgggaag gatgtatttt tgtcggagag agagtgttaa taccagagaa actacgcaag 3480 caatgtctgt cccagcttca tcgaggacat cctggtatcc agcgcatgaa ggcgattgcg 3540 cgtagctatg tgttttggcc gtcgttaaat gaagacatca ttgatctcgt cagtaattgc 3600 cattcatgtg ctctggcagc aaaatctcct gctcatgctg atcctttgcc gtggccgaag 3660 acaacaacgc catgggagcg tgtccatgtg gactacgcgg gcccaataga tggagactat 3720 tttttggtag tagtcgatgc gcatactaag tggccagaga tcatccggac ggctagtatc 3780 acagcccgta taaccgttag catcttgaga gggttgttcg cgcgtttcgg tatgccaacg 3840 acactggtga gcgacaatgg cacacaattc accagcggtg agttttcaga gttctgtttg 3900 agtaatggtg tgcatcacat tacgtctgcc ccgtttcatc cgcaatcgaa cgggcaggca 3960 gaaagattcg tagatacgtt taagcgctcg ttaaccaaga taagagtagg aggagcaccg 4020 ctgcaggaag cactggacct attcctacag acatatcgaa ccacgccaaa ccctcagtta 4080 gagcaaaaca aaacacctgc tgaagttatg tttggacgac ctatacgaac gtgttttgat 4140 ttactccgtc ccccaaggaa aattcaacat gactttagag aagggaatga aagatcgttc 4200 gagcgtgatg accttgtcta tgctaaggcg tatagtagga acaattggca ttgggttcca 4260 ggaagagtga ttcggaaacg tggaaatgtt acctacgagg tactgactgg tcatcgcagc 4320 gtcattagac acattaatca actgaaaagg cgtggaccat tcagcagccc gggtcccatg 4380 acggaacgtt tcaatccatt accgttggat gtattattgg attcctggag catacccgtt 4440 acaccatcag tgctcccaga tgtgtatcca acacaacttg caccgccagt ggcgactcca 4500 actgagccgc 
catctcttcc cccacatcga tccattcctc agcatcaacg ttcaccacgt 4560 cgctcttctc ggtctagaag agctccacgt cggttcgatc cgtacttacg ctattaaaaa 4620 gggggaga 4628 // ID GYPSY17-I_AG repbase; DNA; ANG; 6793 BP. XX AC . XX DT 03-OCT-2003 (Rel. 8.09, Created) DT 03-OCT-2003 (Rel. 8.09, Last updated, Version 1) XX DE GYPSY17-I_AG is an internal portion of retrotransposon GYPSY17_AG DE - a consensus sequence. XX KW Gypsy; LTR Retrotransposon; Transposable Element; 4-bp TSD gag; KW AP protease; GYPSY17-I_AG; GYPSY17-LTR_AG; Gypsy clade; RNase-H; KW integrase GYPSY17_AG; mdg1 lineage; reverse transcriptase. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-6793 RA Tubio M.J., Costas C.J. and Naveira F.H.; RT "GYPSY17_AG, a member of the mdg1 lineage of the Ty3/gypsy group RT of LTR retrotransposons."; RL Repbase Reports 3(9), 175-175 (2003). XX DR [1] (Consensus) XX CC GYPSY17_AG is a family of Gypsy-like LTR retrotransposons CC that, according to the amino acid sequence of its ORF2, is CC phylogenetically grouped with Drosophila representatives CC of the mdg1 lineage. CC GYPSY8_AG, GYPSY9_AG, GYPSY10_AG, GYPSY11_AG, GYPSY12_AG, CC GYPSY13_AG, CC GYPSY14_AG, GYPSY15_AG, and GYPSY16_AG are other CC members of this same lineage in Anopheles gambiae. CC The GYPSY17-I_AG consensus was reconstructed after multiple CC alignment of 6 copies. CC The consensus encodes the 454-aa GYPSY17_AG1p gag-like protein CC (pos. 1761-3122) and the 1260-aa GYPSY17_AG2p (pos. 2966-6745). CC The sequence of the LTRs flanking GYPSY17-I_AG is deposited as CC GYPSY17-LTR_AG. 
XX FH Key Location/Qualifiers FT CDS 1761..3122 FT /product="GYPSY17_AG1p" FT /translation="NPNFDHMTKLKEIVRRLELIHKTLQQNKGNIRQCALA FT TYRLQVDEIYSVFRKEIETNYDKYSDSEIKFYNNIIQNLITNIVEKVNNET FT INTDNNTSDLNETLKSHKKLTLKTTAHVIISILSIYKRQKQIVPTANIDHT FT AIKTNTSVDSSYKPNMDALEILKTATSLIPTFSGRYDEAEAMLAALETMKE FT AVDEQHHRLIMRVVQSKLKGKGRKIIGKTVTNIEDALAKIGAYVKKTESPE FT DIATAIHALKQKTTPKDFGEEIQALAEELEQAYLGEDVAPALATAKTNKIA FT MAAFGKGLKKEIHQAIVLSGTIPTLDAAIRAIISIDKTNQSTQDKKSDNRQ FT NGQENRYSSNQRQTNTRGIDQRQNNNNWRFPPQQNNNSWRSQPQQNNNNWR FT SPQTQNGRQGAGFNTRNRQPAGPNFLGQREPQRAILYTQMETQNPQVDQPS FT TSHYGQHTQ" FT CDS 2966..6745 FT /product="GYPSY17_AG2p" FT /translation="TGSRVQHPKQATRRPKFFRATRTAKGNPLHANGDPEP FT TGGPALHKPLRATYTINVQKSNFIRTRLGLADSICNLFVDSGSDISIIKGN FT KVRPTQIYKPKDIVDIISVGEGTITTHGSTITDVIVEGKKIQQLFHIVPDN FT FKIPADGILGRDFFMNHRCIINYDTWIFSVKHNGEFLEAPIEDTINGKTLI FT PPRCEVIRKLDKLKELDTDAVVCAEQLQEDVLVGNCIVNKNYPFIKIINTS FT NKAKLVNISHIKTIPLNEFEIVKTSNHKDENRLAIIKDLIRKENISEDTDK FT SFEQLLLSYNDIFHLPNDHLTTNNFYEQDIKLEDKRPVYIPNYKQNHSQGP FT EIKKQIEKMLQDDVIEHSVSHYNSPILLVPKKSSDEKKWRLVVDFRQLNKK FT LLPDKFPLPRIDSILDQLGRAKFFSTLDLMSGFHQIPLEESSKKYTAFSST FT DGHYQFKRLPFGLNISPNSFQRMMTIAMTGLTPECAFVYVDDIVVVGASEN FT HHLKNLEKVFERLRHYNLKLNPEKSCFFKKEVTYLGHKITDKGILPDDSKY FT DSIKNYPIPQNADDARRYVAFCNYYRKFIPNFALKAKPLNSLLKKNTKFEW FT TQECQEAFEYLKNTLISPQILQYPDFSKQFILTTDASTIACGAVLAQEHDG FT IDMPICFASRTFTKGEANKAIIEKELAAIHWAIMHFKHYLYGTKFTVKTDH FT RPLVYLFGMKNPSSKLTRMRLDLEEFDFTVEFVKGKQNVVADALSRIKITS FT DEIKSINVITRSMNKPVTSDNVLGNTSESDQLKMFHALAYDEVKDLPKLET FT SVKRNENTIELIGKILNKRKSKELLSVRDIHLNTDIGLQEPLLVKDFQKRK FT EKSAIVQFIKNIEKKLVMKSITQLAVSETDEIFKEVNPNEFKQIANNHLKN FT IQILIYTKPQTINDEKTINDILDKVHNTPTGGHIGQYRMYKKIRKEYVWNK FT MKKSIKDFLDKCITCKLNKHLIKTVEPFVKTDTPNIPFEVVSIDTVGPFQK FT TNNNNRYAVTLQCNLTKHVTVIAIPNKEANTVARAVIEKFMLIYGTNIKEF FT RTDMGTEYKNEIFKNISEILRIEHKFSTPYHPQTIGALERNHRCLNEYLRI FT FTNEHKDDWDDWINYYSFAYNTTPNLDHGYTPFELVFGRNEKISTNITEKS FT TPLYNYDDYSKEFKYRLKLAHDRTRKHIEQEKMKLLKEQQNINQVNFQIGD FT 
QIALTNENRTKLDPVYKGPYKVKEINGPNMIIENTDGVTQNIHKNRAIKI" XX SQ Sequence 6793 BP; 2779 A; 1286 C; 1173 G; 1555 T; 0 other; tggcgaccgt gacagcgtag caaaaacaaa gctacaagaa cgttaaagtg cagtgaaaaa 60 aataaaataa aagtgcagtg ctgtggagaa aaagttatac gcaaagtgcg gtgcgtggaa 120 aaaaatacag tagaaagaaa aaggttaaag aaagaacggt gtcagttttt accacataac 180 cattatcgtg agaactcaat acagtgcccc aataatctaa taagcaataa agtgatcttt 240 aatcgatttt taacaaagtg cgagtggtgc agtttgaggc ataaggtaac accaaacgcg 300 ggaaataaat aaaaacgttc caagtgcaat ctaaaagtgg cacaagttta aaagttagca 360 aaaacacatc gttgcgcaca agtatcaagt acccgaaaag cagtgcagga agtgaaatcg 420 cagtgtcagt aaactgaaaa aagtagcgag tgtgtatatt tattctggca taacacacat 480 aagctatggg tcacgataac tcaaaaccgg aaacaaacgt taagggcaat aacgacctca 540 ccatcgttca gacccaaaac atacacactg aacaacacga ggcacacgag ttcaaactta 600 acctgatttt ggctctgctc gcgttgattg ttattgcaaa agcgctcaaa atagtgtaca 660 aaatagtgaa aaatcaggcg aaagaacaag ctgtaaagat gttgtcttta cctaaataaa 720 tagtgaagaa aaaaaaaaga aaaaaaacat tagtgaagca ataattcgga tataaaagac 780 aaagtgccga tcaagtgttc atcagacgac aaccacaagc cgcaaaccag gaagacgcac 840 actgcccacc tgcaagccgc cgtagccgcc cgcctacaag gaagtcgtaa ccgcctacca 900 ggaagaccaa gcgcccacca gcaagcgagc gcagccgctt accaagaaga ccaagccgcc 960 caccagcaag cagcctacca ggaagaccaa gcgcccacca gcaagccatc gcagccgcca 1020 ctaactgagg agatgagcac ctggcacggg ctgagaaatt ttaacgaaga cggtgaatac 1080 gtggggccat acatcccgaa caagaaggat gtcgaggatc aacgaatcga gatcctggag 1140 ttactgaaaa acgccggcaa caaggagata gaaaacaagg tagaacacat atatactagc 1200 tataaagaaa ataaaataac aactcgggag acatttgaca aagccacaga tgtcttaatg 1260 acattattag ggctgtagga ggaataaaaa caattaaata acaagaaaaa aaattttttt 1320 ttcatacata ttattaacaa caagaaagaa atttttagtt tttttttact atcaataaaa 1380 aataaaataa aataaagtct tcttttacaa aaaaaaaact tatacaaatc cactcgcata 1440 gaaagaaaat aacatttacg tttgtccacc ttggaaaata tttcctctca ctttttccgc 1500 ttttaaattt tttttctaaa tactattcac aattagaaac tcaaaaaaaa aatcaatcaa 1560 taaataaata aaataaaata aagaattttc ttcttttttc 
tcaacatata aaagaaagaa 1620 actttcctcg actctcgagg aaaaccagat tttattttat ataagctttt atataaaaat 1680 ataagtctgc tttctctaga aaaaataaat tgtacttgat tttatgtatg aacttaaatt 1740 ttttttccca caatcattga aatccaaatt tcgaccatat gactaaacta aaagaaattg 1800 ttaggagact agaacttatt cacaaaaccc tacagcaaaa caaagggaac attagacagt 1860 gcgcattggc cacatacaga ttacaagtag acgaaatata ttccgtgttc aggaaagaaa 1920 tagagaccaa ctacgacaaa tacagcgact cagaaatcaa gttctacaac aacattatcc 1980 aaaatttaat caccaacata gtagaaaaag ttaacaacga gacaattaac accgacaaca 2040 acactagcga cttgaatgag acattaaaat cacacaaaaa actaacgtta aaaaccacag 2100 ctcacgttat aatatcaatt ttatctatat ataaaagaca aaaacaaatc gttccgacag 2160 cgaacattga ccacacagct attaaaacaa atacgagcgt agactcaagt tacaaaccaa 2220 acatggatgc tttagagatc ctaaaaacgg ccaccagcct catacctaca tttagcggta 2280 gatatgacga agctgaagcc atgttagctg ctttagaaac aatgaaagaa gcagtggatg 2340 aacagcacca cagactaata atgcgggtgg tacaatcaaa actaaagggg aagggcagaa 2400 aaatcatcgg aaaaacggtg accaacatag aagacgctct ggcaaaaatc ggtgcatatg 2460 ttaagaaaac agagtcgcca gaagatattg ccaccgcaat tcatgcatta aaacaaaaaa 2520 ctacaccaaa ggattttggt gaagaaatac aagctctggc agaagagcta gaacaagcat 2580 atcttggaga ggacgttgca ccagcactgg caacggcaaa aaccaacaag atagcaatgg 2640 cagcttttgg aaaagggctc aaaaaggaaa tccatcaggc aatagtattg tccggtacca 2700 ttcccaccct tgatgcagct attagagcaa ttattagcat cgataagacg aatcaaagca 2760 cgcaggacaa aaagtcagac aatagacaga atgggcagga gaatagatac agctcgaatc 2820 agcgtcagac aaacactcgt ggtatagacc aacgtcaaaa caataacaac tggagattcc 2880 caccccagca aaacaataac agttggagat cccaaccaca gcaaaataat aacaattgga 2940 ggtcaccaca aacacaaaac ggtagacagg gagccgggtt caacacccga aacaggcaac 3000 ccgccggccc aaatttttta gggcaacgcg aaccgcaaag ggcaatcctt tacacgcaaa 3060 tggagaccca gaacccacag gtggaccagc cctccacaag ccattacggg caacatacac 3120 aataaatgtg caaaaatcaa attttattag gacaagatta ggtctagcag attcaatatg 3180 caacctattt gtagattcag gttccgacat ttctatcatc aaaggcaaca aagtaagacc 3240 tacacaaatt tataaaccaa aagatatagt ggatatcata agcgtaggag 
aaggaacaat 3300 aaccactcat gggtccacaa ttacggatgt aatcgtggag ggaaagaaaa tccaacaatt 3360 atttcacatc gtaccagata acttcaagat accggcagat ggtatactcg gtagagattt 3420 ttttatgaac caccgatgta taataaatta cgatacttgg attttctctg taaaacacaa 3480 tggagagttt ttggaagcac ccattgaaga tactatcaat ggcaaaacac tcatacctcc 3540 cagatgtgaa gtaattagaa aacttgataa gttaaaagaa ttagatacag atgcggtagt 3600 atgcgcagag caactgcaag aagacgttct tgtaggtaac tgcattgtaa ataaaaacta 3660 cccatttatt aaaataatca atacttccaa taaagctaaa ttagtaaaca ttagccatat 3720 caaaacaata cctttaaatg aatttgaaat agtaaaaact agcaatcata aggatgaaaa 3780 taggttagca atcataaagg atttaatccg aaaggaaaat atttccgaag atacagataa 3840 atcttttgaa caattactgt taagctacaa tgatattttt catctaccta acgatcattt 3900 aactacaaat aatttttatg aacaagatat aaaattagaa gataaaagac ccgtgtacat 3960 accaaattac aaacaaaacc attcccaagg accagaaatc aaaaagcaaa ttgaaaaaat 4020 gcttcaagat gatgtaatag aacactcggt gtcacattac aattcaccca tcttactggt 4080 tccgaaaaag tcctcagatg agaaaaaatg gagattagta gtcgatttta gacagcttaa 4140 caaaaagctg ctccccgata aatttccact acctagaata gactccatat tagatcagct 4200 agggcgagca aaatttttta gcacattaga tctcatgtca ggattccatc aaataccact 4260 ggaagaatcg tctaaaaagt atacagcttt ttcaagcacg gatggtcact atcaatttaa 4320 acgattacct tttggattga acatttctcc aaatagtttt cagcgaatga tgaccatagc 4380 catgacaggc ctcacgccgg aatgcgcttt tgtatatgtc gatgatattg tagtagtagg 4440 agcttcagaa aatcaccatc taaagaattt agaaaaggtt tttgaaagac taagacacta 4500 caatcttaaa ctaaacccag aaaaaagttg ctttttcaaa aaagaagtta cttatcttgg 4560 acataagata accgacaaag gcatccttcc agatgattcc aaatacgaca gcataaaaaa 4620 ttacccgata ccacaaaacg cagacgatgc gagaagatac gtagcattct gcaattatta 4680 cagaaagttc atcccaaact ttgctttgaa agcaaaaccg cttaacagct tattaaagaa 4740 aaatacaaaa tttgaatgga cacaagagtg tcaagaagca ttcgaatatt taaaaaacac 4800 actgattagt ccacagatat tacaatatcc tgacttcagc aagcaattta tactaaccac 4860 agatgcttca actatagcat gcggagcagt tttagcacag gaacatgatg gtatagatat 4920 gccgatatgc ttcgcaagta gaaccttcac gaaaggggaa gcgaataaag caatcatcga 
4980 aaaggaacta gccgcaatac attgggctat aatgcatttc aagcattacc tatacggtac 5040 aaagtttacc gtcaaaacgg accatagacc actagtctat ctgttcggaa tgaagaatcc 5100 gtcatcaaag ttgacgagaa tgagactaga tttggaagag ttcgatttta cagttgaatt 5160 tgtaaaaggg aaacagaacg ttgtagcaga cgctttatcg cgcatcaaga tcacctcaga 5220 tgaaattaaa tccatcaatg tgattacgag aagcatgaac aaacctgtta cttccgataa 5280 tgttttagga aacacgtcag agtctgatca actcaaaatg ttccatgcct tagcatacga 5340 cgaagtaaaa gacttaccaa aactagaaac atcagtaaag agaaatgaaa acactatcga 5400 gttgatagga aaaatcctaa acaaaagaaa gtccaaggag ctcttatcag taagagacat 5460 ccatctgaat acagatatag gactacagga gcctttatta gtaaaggatt tccagaaaag 5520 gaaggaaaaa tctgccatag tgcaatttat caaaaatata gaaaagaagc tcgtaatgaa 5580 aagcattacc cagctagcag tctctgaaac agacgaaata ttcaaggagg taaatccgaa 5640 tgaattcaag caaatcgcta acaatcacct gaaaaatatt cagatactaa tatatactaa 5700 accacaaacg ataaatgacg aaaagacgat aaatgacata ctcgacaaag tgcacaacac 5760 accgacagga ggacacattg gacagtatag aatgtataag aaaatcagaa aggaatatgt 5820 atggaacaaa atgaagaaat caatcaaaga ttttctagac aaatgtataa cctgtaaact 5880 aaataaacat cttatcaaga ctgtagaacc ttttgttaaa acagatacac ccaacattcc 5940 attcgaagta gtatcaatcg atacagtagg accatttcaa aaaacaaata acaataatag 6000 atatgcggta acacttcaat gtaatttaac gaaacacgtt acggttatag caattcctaa 6060 caaagaagca aatacggtag ctagagcagt aatagaaaaa tttatgttaa tatatggcac 6120 aaatattaaa gaattcagaa ccgatatggg tacagagtac aaaaatgaaa tatttaaaaa 6180 tatatcagaa atccttcgaa tagaacacaa attttcaacg ccatatcatc cacaaacgat 6240 aggagcttta gaacgtaatc acagatgtct aaacgaatac cttagaattt ttacaaacga 6300 acacaaagat gattgggacg attggataaa ttattattca tttgcatata atacaacgcc 6360 taatttagac cacggttaca caccatttga actagttttc ggaagaaacg agaaaatttc 6420 gacaaatata acggaaaaat ctacaccatt atataattac gatgattact caaaagaatt 6480 taaatacaga ttaaaactag ctcacgatag gactcgaaaa catatagaac aagaaaaaat 6540 gaaactactg aaagagcaac aaaacataaa ccaagttaat tttcaaatag gagatcaaat 6600 agcattgaca aatgagaaca gaacaaaact agatccggta tataaaggac cgtataaagt 6660 
aaaagagatt aacggaccta acatgataat tgaaaacact gatggtgtca cacagaatat 6720 tcacaaaaat agagcaatta aaatatgaca gaataacttc atttcattac gttattcttc 6780 cgaagggtgg agg 6793 // ID GYPSY21-LTR_AG repbase; DNA; ANG; 167 BP. XX AC . XX DT 05-FEB-2004 (Rel. 9.01, Created) DT 05-FEB-2004 (Rel. 9.01, Last updated, Version 1) XX DE GYPSY21-LTR_AG is an LTR of retrotransposon GYPSY21_AG - a DE consensus sequence. XX KW Gypsy; LTR Retrotransposon; Transposable Element; 5-bp TSD; KW GYPSY21-I_AG; GYPSY21-LTR_AG; GYPSY21_AG; Gypsy clade; KW mag lineage. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-167 RA Tubio M.J., Costas C.J. and Naveira F.H.; RT "GYPSY21_AG, a member of the mag lineage of the Ty3/gypsy group RT of LTR retrotransposons in Anopheles gambiae."; RL Repbase Reports 4(1), 8-8 (2004). XX DR [1] (Consensus) XX CC GYPSY21-LTR_AG is a long terminal repeat of GYPSY21_AG (its CC internal CC portion is deposited as GYPSY21-I_AG). XX SQ Sequence 167 BP; 48 A; 33 C; 32 G; 54 T; 0 other; tgtagtaggc tatgactact atgactacca aacagtggga ttttgcctca gcagaatttc 60 acttcctagg catagtagcc tatgccatgt ttagtaataa agcagttcat agttaaccac 120 caaactagca agttggtttt tatttgctct ctgtgtgaac tgtaaca 167 // ID RETRO60_AG_LTR repbase; DNA; ANG; 235 BP. XX AC . XX DT 06-FEB-2003 (Rel. 8.01, Created) DT 06-FEB-2003 (Rel. 8.01, Last updated, Version 1) XX DE Anopheles gambiae long terminal repeat from RETRO60_AG DE retrotransposon - a consensus. XX KW LTR Retrotransposon; Transposable Element; Long terminal repeat; KW RETRO60_AG_I; RETRO60_AG_LTR; retrotransposon. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-235 RA Jurka J. 
and Drazkiewicz A.; RT "RETRO60_AG_LTR: LTR retrotransposon from Anopheles gambiae."; RL Repbase Reports 2(12), 14-14 (2002). XX DR [1] (Consensus) XX CC 5 bp target site duplication. XX SQ Sequence 235 BP; 83 A; 37 C; 50 G; 65 T; 0 other; tgttggaaac tgttgtaagg tttccaacga ctgtcagctg aggcgatgtt ttgtgtgtgc 60 gcaaacagat aaaaagggag aaaaaccgcg gaaactggtt cattgcgatc aaaaaaacat 120 acgaacgaag atacgaagaa aataaagaaa aaaactaatt ttttggcgca ccttaagtgc 180 tagcctaaaa aatttacttt gttttctttt ttagttagag cgccagtttc taaca 235 // ID RT1 repbase; DNA; ANG; 8037 BP. XX AC M93690; XX DT 28-SEP-1995 (Rel. 1.08, Created) DT 24-SEP-2010 (Rel. 15.1, Last updated, Version 2) XX DE Anopheles gambiae RT1 retroposon. XX KW R1; Non-LTR Retrotransposon; Transposable Element; KW reverse transcriptase; Nucleic acid binding protein; retroposon; KW RT1. XX NM RT1. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-8037 RA Besansky J.N., Paskewitz M.S., Mills-Hamm M.D. and Collins H.F.; RT "Distinct families of site-specific retroposons occupy identical RT positions in the rRNA genes of Anopheles gambiae."; RL Mol. Cell. Biol 12, 5102-5110 (1992). XX DR GenBank; M93690; Positions 18 8054. 
XX SQ Sequence 8037 BP; 2175 A; 1995 C; 2212 G; 1655 T; 0 other; agtgtcaaac gtgaggtcgt actgtaccga cacgttgtgt ttgggctccg aaaaaaggac 60 ttaaaaacag tgcaaaatcg tcgttaatcg actttaatag tgtgaaaaac gtgctcaggg 120 cacctgattt accctttaag agcgttatca agtgttttta aggcgaaaat tcaacaaaaa 180 tcgtgcattt gtgtttgttg cctatgaacc ccccaccatg tgtttgtgtg acatggtgta 240 aaagcagaaa atcgaaaaaa gtgcccaaaa agtgccagaa ttgcacgatt ttagtgtaaa 300 cagtgcatct gagcacgaaa aaaggtgtta agaagtgatc cgcgcatgaa aaaaccacaa 360 aattgtgaaa aaaaagtttt gcgtcatcca gcccatactg ggccaagtaa tcgccggatc 420 tgtgttttca gcgatataat ttgcccagat tctcgccggg tgtggttttt aaggccatcg 480 ggtggccaaa gaacgttcat ccagtgaaaa ccgatcgaaa atcggccgtt tgtgaccgaa 540 aaaaagtgaa ataatattta atattattcg gtgaacaaat ttacgaaaat tcggttccag 600 gtgaaattgg gtgaaattgg gtgggttgac tgtacggccc ctgatggccc ctgagttaag 660 gtgccgttac acgagtaacc atagtttagg tgaccatcga cgtgcggaaa atttttgtga 720 catcggccta ttgcgaatgc tcaccgtgtg tttgtgtgac gtggtgcaaa agaagaaatt 780 cgaaaaaagt gcccaaaatg tgcctgagtt gcactatttt agtgtaaaca gtgcatctga 840 gcacgaaaaa aggtgttaag aagtgatccg cgcatgaaaa aaaccacaaa attgtgaaaa 900 tagttttgcg tcatccagcc catactgggc caagtaatcg ccggatctct gtttgcaggg 960 atataatttg cccagattct cgccgggtgt ggtttttaag gccatcgggt ggccaaagaa 1020 cgttcatcca gtgaaaaccg atcgaaaatc ggccgtttgt gaccgaaaaa aagtgaaata 1080 atatttaata ttattcggtg aacaaattta cgaaaattcg gttccaggtg aaattgggtg 1140 ggttgactgt acggcccctg acggcccctg agttaggcat cggttgtcaa acggcctgac 1200 agctcggacg gaccgatcgt taaggtgccg ttacacgagt aaccatagtt taggtgacca 1260 tcgacgggcg gaaaatgttt gtgacatcgg cctattgcga atgctcaccg tgtgtttgtg 1320 tgacgtggtg ctaaagaaga aattcgaaaa gtgcccaaaa tgtgcctgag ttgcactatt 1380 ttagtgtaaa cagtgcatct gagcacgaaa aaaggtgtaa agaagtgatc cgcgcatgaa 1440 aaaaccacaa aattgtgaaa aaagttttgc ggtcatccag cccatactgg gccaaggttt 1500 cgccggatct ctgttttcag cgatataatt tgcccagatt ctcgccgggt gcggttttta 1560 aggccagcgg gtggccaaag aacgttcatc cagtgaaaac cgatcgaaaa tcgcccgttc 1620 tgtgaccgaa aaaagtgaaa taatatttaa 
tattattcgg ggaacaaatt tatgaaaatt 1680 cggttccggg tgaaattggg tgggttgact gtacggcccc tgacggaccg attgttaagg 1740 tgccgtcaca cgagtaacca tagtttaggt gaccatcgac gtgcagaaaa tttttgtttt 1800 tttttgtgac atctgccaat tgcgaatact caccgtgtgc ttgtgcgacg tggtgcaaaa 1860 acagaaattc gaaaaaagtg cccaaaatgt gcctgagttg cactatttta gtgtaaacag 1920 tgtatctgag cacgaaaaaa ggtgtgacat cagtttgaca gctaagcaga ggaatttcaa 1980 agtggcataa cttcggcacc cgccaagcta tcccgacggg aaagtgttgc ctgtggaagc 2040 ggtatcgaaa tctttcaggg cagggtgaaa tatttcactt tggaaatatt tcacattttg 2100 aatttttttt tttttttttt tttttttttt tttttttttt tttttttttt tttttttttc 2160 ttctcttcaa aaaaaaacca tcttccatgt gcgcgggcgt caggcagagt gcgtcgggct 2220 gccatcaaac cctcaagtag gtgcgcgttt gtgaaagcgg ccctgaaatc tttcagggag 2280 ggtgaaatat ttcacttttg aaatatttca aaatttgaaa tttttcaaca tttcaaaaat 2340 ccgttctcca tcggttttct tgacgaacct aaaaccgggg gtcggtttct gacgcctcgt 2400 tggcgagagt gcgggcgtta ggcatagtgc gtcgggctgc caacaaaccc tcaagtaggt 2460 gcgcgcaacg ttgaaaccaa catgtcgggg ctgggcggtg atgcccatcc ccaagggagc 2520 tcgggacgag tacttcgccc gagggctcga tcggtgtccc tcaaccgggt cgacgcatta 2580 aaagtgtctg actcaacacc ggtggagcca ccacccaccg caggcatcgt ctacttgagt 2640 gatgatgaag aagaagaatt gaactgcacg atcctcgcgg gcccgtcggg attagcagtt 2700 ccaccaatgg ggaaggtgcc attggtagtg ctggacaagt tgcccagcca gagtcaacag 2760 cgcgaggaga tgacggtacc ggccacctca acaccaaagg ctggtaagtg ttcttctgct 2820 gaaccatctc tctcagagat gaacgagagc ttgaagctcc tggccatgca ggttgctcag 2880 ctctcaaagg agcttagcct ctgccgtaag gagctccagg aaagtttgat gaaaaatgcg 2940 gcgcttgaac gggagctcga aacgtacagg atgggcgccc gttcggtcat cgagctgcag 3000 cagcaagcag cagcagcccc aatgatgaca gcccagggag cccacagctc tcgcaaccgt 3060 cgcggtcgcc aaggaccaca gcagcaggag cagcggcagc agcagcaaca gcatcagcag 3120 cgggaacagc agcagcagca gcagcagcag cagcagcagc agcagcagca acaacaacag 3180 cagcagcgga accagcagcg tgaatggcag cagcagcagc agcagcagca gcatcaacag 3240 cgagaacagc agcagcagca acgggtgcag caacagaatc agcagcacca acgtcagcag 3300 cagcagcagc agcagcagcg gcaacagcaa cagcagcagg 
agcagcaaga attatggacg 3360 acggtagtgc gccgccgtca aaatacacag cagcagcagc agtctaacca accgcagcaa 3420 caacaacagc agactgggcg gtatcagccg ccgcaaatga ggcagcagct acagcagcaa 3480 cagcagcaac gacagccaca gcgatatgtg gtcgcaggct cgtcgcaaca gcagcagcag 3540 cagcatcaac agcagcagca gaagcgtaag cgtcctaagc ccgaactgat agagatctct 3600 cctggtcaga acgagacttt cgagagcgtc tccttgaaaa tccgtaaagc cgttgacgat 3660 aatggcacac ataaggagtt aaaggatttc atcatcatgg gccggcgcac agataaggcg 3720 ttgctacgac tgacgcttgc tagatccgca aacgcgacct taattctcca gcagatccga 3780 acgatcatcg gcgaggctgg aacttgtcga cacgtgacgg aaatggcggc cttggtagta 3840 aacgacatcg accccctagc caaggaggaa gagcttacag ctctccttga aaacaagatc 3900 gagggtgggg caggcatcgt ctcaacgagc attaggacaa tgccggatgg cacccagcgg 3960 gcacgcgtcc gtctgccagc caaggccgcc aaagcgctgg atggtacgaa gcttcgcttg 4020 ggcttctgca tttccagagt gaagatggct cctccaacac ccaaagagca tcttcgctgc 4080 taccgatgcc ttgagcacgg ccacaacgcc cgcgattgtc ggtcacctgt agaccgacaa 4140 aatgtttgca tccgttgcgg acaggaaggt cacaaggctg gtacatgcat ggaagaaata 4200 cgctgcggca aatgcgatgg cccccatgtt atcggggacc ggacatgcga tcggtcggcc 4260 acccaatgac gcagctaaaa gtcctccaag tgaacctggg tggaggcagg atcgcccaag 4320 atctggtcct gcaaaccgcc cgacaaatgg aagtggacgt gctggttctt tcccacacgt 4380 atcgaccacc cgagaacaac ccaagatggg cagttgatgc ctccaaaaag gtggcagtcg 4440 tggccacagg acgataccct ctacaaggac aatggagcag tgatgttcca ggccttatag 4500 ctgccaaggt gggtggcatc accttcctaa gctgctacgc gccacccagc ctgtcgcggg 4560 aaggatttgc ggaattcgtt gaagcaattg aattggaagc ccaatcccac cctcaggtag 4620 tagttgccgg agactttaac gcttggcatg aggagtgggg aagccgacgc agcaatgagc 4680 gtggggaagt actgctcgag gcgtcccagc aattgggcct gctgctgatg aatcgaggga 4740 atgtggcaac ctttgttgga aacggtgtgg cgactgccag cgttgtcgac gtgaccttcg 4800 ccagctcgtc catagctcag ccgagcactt ggttggtaag aaacacggac acgcgatctg 4860 accataggta tatcacctat tcggtaggcc cagcgtcagc agaccagcag cgtaaccaag 4920 gacagtcacg tcaacggggc cagcgagagc gttttcaaca tgcaggcacg cgatttaaga 4980 cgaaacagtt ctcgaaagag aatttcctgg ccacgctaca tggcgaggga 
ttccgagaga 5040 aggcagtcaa tcaccaggga atgatctcgg caatgatatc ggcctgcgag aaaaccatgc 5100 aaaggatgac gtcgtctttc cccgaccctc atcgggacgt ttactggtgg acgccactga 5160 ttgctctgct taggcaaaac tgcgagcaga cgagagatcg catgcagcag acaagtgatc 5220 tccagaaccg aagtctggct gcagcccaat accgaacagc taaagctgag ctggatagag 5280 ctatacgtgc cagcaaaaag gccgccttcc aggaattgat cgatgctgcg gaggaaaacg 5340 ttttcggagc cgggtactta gtagtcctct cccgtcttcg cggtggaagg gccccacccg 5400 agacggagag agcgaggctt gaaagcatcg ttacagagct tttcccgcaa catccgccct 5460 tcaactggcc cagcatcagt tccgaggaag aacaggaaca gcctgcagac cagcagactc 5520 catggaccca agtcacgatc ccggaactcc gtctgatagc tagcaccatg ccgaacaaaa 5580 aagcgccggg ccttgacgga attccgaacg ccgctgtcaa ggctgcaatc cttgcgtaca 5640 cggacgtttt ccaggcgttg taccaaagct gtctggaaac ggctacattt ccagcaccat 5700 ggaaacgaca gcggttggta ctgctcccca agccggggaa accaccgggt agcaacgggt 5760 cataccgacc tttgtgcatg ctggatgcct tagggaaagt gctggagaag ctcattctaa 5820 acagacttca caaccacctg gaagatcctg ctgcggtgag gctgtcagac aggcagcatg 5880 gcttccggag ggggcgatct acaattggcg ccattcgaac agtgatcgag gctggtcaga 5940 gcgcgatgag attccgccgc acgaacgggc gggataacag gttcctgctg gtcgtgtcaa 6000 tggatgtcaa gaatgccttc aatacggcaa gctggcaggc catcgccact gcactgcaga 6060 tgaaaggagt acctgctggt ctgcaaagaa tcgtgaggag ctacttcgag aaccgagagt 6120 tggtcttcga gacatccgac ggcccagtaa ctcggtccat cacggctggt gttccacagg 6180 gttcgattct tggccccacc ctgtggaaca ccatgtacga tggagttctg gacgtcgccc 6240 ttccgcagga ctgcgagatg gtggcgtatg ctgacgactt ggtgctgctg attccgggca 6300 tcgacgtaaa tgcagtgaag gctgcagccg aggaggcggt cgccagtgtc tctcactgga 6360 tggctcaaca tcatctccag attgcgccgg aaaagacgga gtgcgtgctt atctccagca 6420 cgaagaaccc tacgcaggtc accataagag taggggacgt ggaggtgaca tcttcccgca 6480 cgatgcgtta ccttggggtg acccttcacg atcacctatc gtggctgccc catgtccgag 6540 aggtaaccac tcgggcaagg aagatagcag atgccgttac ccgcctcttg cgcaaccaca 6600 gtggaccaaa gaccagcaaa gcgcgactgt tggcctcggt cgcagagtcg gtcatccgat 6660 atgctgcgcc catctggcat ggcgaggtga cgaagagaga gtgtcgtcga cttttggaga 
6720 gggttcagcg agtctcagca cggagggtgg cacgcacctt ccgtaccgta aggtatgaga 6780 ccgccaccct cctcgccggg ctgaccccca tctgtctctt gatagaggag gatgcgcgag 6840 tgttcgagcg tgttaatgat ccgggtcggt cgataacgaa ggcagccatc cggttggagg 6900 agaggcagcg caccatcacg atgtggcaga gccaatggga cgccgaggcc gacacctcca 6960 gatacacccg gtggacgcat cggataatcc gcgacatcag cgcttggcaa ggccggagac 7020 acggggagat gaccttccac ttggcccagg tgctctctgg acatgggttc ttcagagagt 7080 acctggccat caacggcttt acggaatccc ctgactgtcg cagctgtgcg ggtgttccgg 7140 aaaatgccca tcacgccatc ttcgaatgcc ccaggtttgc tcgagtgagg atggagtact 7200 ttggtgaact ggggccgaat ccggtcacgc cggacagtct tcaggacttc cttatgggca 7260 gccaagacaa ctggagcagc ttctgtgagg cggcacgtcg aataacaaca acgctgcagc 7320 gtgactggga cacggaacga gaacagcggg ctgcttccaa tcgtgaggaa gctgaaatac 7380 aacagcagct acagcgagag gaagacgaac gccgcactga agaacggcgg cagctccaca 7440 atgaggcaaa tcgtgcctac cgccagcgca ataggagaag tcaacctact ccaccagctc 7500 caccaccaac acccagggag gccgctcgcc tcgaagatgg gcgacggaga gtggcccgct 7560 ggcgggaaag acaacgaatg attcggaacg gtggaatcca aatgctacga gcgctgttcg 7620 gccatgatgc ctggtcgagc gaaagtgacg atgaacccga tgacgtcgag cgaggcggac 7680 ttgatgctgc ccagcaggca gcagcagccg aagccgaaag agcggcacgc taaagctaac 7740 tttagtaaaa agaacaacga aagtgtaaaa aaaaaaaaga gaaaggtgct tcggtacaga 7800 tatagggctc cccttaaggg aatgaattca gaggaacagt ttggaaaata gaatttcaac 7860 ttaaattaaa cggaaaggtg cgaacgcacg gctaaaaagt taggctccct atgagcatca 7920 cgtccaccct taaatccctt cgcagggcat aaggggcgga ttatgagagg gctgggtttt 7980 cttttcgatg taacatacga tcaataaaaa tccactcgat ataaaaaaaa aaaaaaa 8037 // ID BEL17-LTR_AG repbase; DNA; ANG; 276 BP. XX AC . XX DT 03-APR-2003 (Rel. 8.03, Created) DT 03-APR-2003 (Rel. 8.03, Last updated, Version 1) XX DE BEL17-LTR_AG is a long terminal repeat of the BEL17_AG LTR DE retrotransposon - a consensus sequence. XX KW BEL; LTR Retrotransposon; Transposable Element; 5-bp TSD; KW BEL17-I_AG; BEL17-LTR_AG; BEL17_AG; Bel clade; KW reverse transcriptase. 
XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-276 RA Kapitonov V.V., Pavlicek A. and Jurka J.; RT "BEL17_AG, a nonautonomous family of Bel/Pao-like LTR RT retrotransposons from African malaria mosquito."; RL Repbase Reports 3(3), 42-42 (2003). XX DR [1] (Consensus) XX CC BEL17-LTR_AG flank an internal portion of BEL17_AG (deposited as CC BEL17-I_AG). XX SQ Sequence 276 BP; 85 A; 46 C; 68 G; 77 T; 0 other; tgttaggaaa aggcagtgct attgatgagt ccaacccgga cgtataaacg agatgaagtg 60 aaatgacagc tagggaaagt gacagatgaa atgacaggga aacggttgca cgcgcttcaa 120 accagtagcg agtgaagcga gcagcaggtg acaccgtttt aaagttttat tgctatttta 180 tttggattta tttgtatttg tcttcaggag aataaatagt taaactaaaa ctttcgcctt 240 gtggctgtac gctctcgcac tcaaactgtg tttaca 276 // ID HATN5_AG repbase; DNA; ANG; 205 BP. XX AC . XX DT 03-APR-2003 (Rel. 8.03, Created) DT 03-APR-2003 (Rel. 8.03, Last updated, Version 1) XX DE HATN5_AG is a hAT-like nonautonomous DNA transposon - a consensus DE sequence. XX KW hAT; DNA transposon; Transposable Element; Nonautonomous; KW 8-bp TSD; HAT1_AG; HATN5_AG; nonautonomous DNA transposon; KW hAT superfamily. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-205 RA Kapitonov V.V. and Jurka J.; RT "HATN5_AG: a family of nonautonomous hAT-like DNA transposons RT from African malaria mosquito."; RL Repbase Reports 3(3), 60-60 (2003). XX DR [1] (Consensus) XX CC HATN5_AG is a young family of nonautonomous DNA transposons that CC belongs CC to the hAT superfamily. HATN5_AG copies are less than 1% CC divergent CC from the consensus sequence. Some copies are 100% identical to CC each CC other. It is likely that this family is currently transposable. 
CC HATN5_AG has imperfect 23-bp terminal inverted repeats (10 CC mismatches). CC The genome harbors ~100 HATN5_AG elements. XX SQ Sequence 205 BP; 52 A; 50 C; 40 G; 63 T; 0 other; caaggataat gtatttagaa ggttgatttt tatcgctttt gctgggcgac attgtggggg 60 acatctgtca ctctctgcat acaaatttca ttaccccgca ccagccctcc ccgattttta 120 taccggagcc cccggaagag aaatgtcaag tcgagaaatg tcattttgct ctgaacaact 180 tcaccttgtc atttataaca ccttg 205 // ID GYPSY12-I_AG repbase; DNA; ANG; 5759 BP. XX AC . XX DT 03-OCT-2003 (Rel. 8.09, Created) DT 03-OCT-2003 (Rel. 8.09, Last updated, Version 1) XX DE GYPSY12-I_AG is an internal portion of retrotransposon GYPSY12_AG DE - a consensus sequence. XX KW Gypsy; LTR Retrotransposon; Transposable Element; 4-bp TSD gag; KW AP protease; GYPSY12-I_AG; GYPSY12-LTR_AG; Gypsy clade; RNase-H; KW integrase GYPSY12_AG; mdg1 lineage; reverse transcriptase. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-5759 RA Tubio M.J., Costas C.J. and Naveira F.H.; RT "GYPSY12_AG, a member of the mdg1 lineage of the Ty3/gypsy group RT of LTR retrotransposons."; RL Repbase Reports 3(9), 166-166 (2003). XX DR [1] (Consensus) XX CC GYPSY12_AG is a family of Gypsy-like LTR retrotransposons CC that, according to the amino acid sequence of its ORF2, is CC phylogenetically grouped with Drosophila representatives CC of the mdg1 lineage. CC GYPSY8_AG, GYPSY9_AG, GYPSY10_AG, GYPSY11_AG, GYPSY13_AG, CC GYPSY14_AG, CC GYPSY15_AG, GYPSY16_AG and GYPSY17_AG are other CC members of this same lineage in Anopheles gambiae. CC The GYPSY12-I_AG consensus was reconstructed after multiple CC alignment of 12 copies. CC The consensus encodes the 414-aa GYPSY12_AG1p gag-like protein CC (pos. 807-2048) and the 1233-aa GYPSY12_AG2p (pos. 2000-5698). CC The sequence of the LTRs flanking GYPSY12-I_AG is deposited as CC GYPSY12-LTR_AG. 
XX FH Key Location/Qualifiers FT CDS 807..2048 FT /product="GYPSY12_AG1p" FT /translation="RMKQKIELLVERLKQLHDNLKHIDRCYRQCAISTYEE FT SAKETFKKLQEKLVKYENEISEEELTYTSKIARTLYSDITKFIAIHKQKFT FT VSINNVSLGTQTMATFNLKEASAVVPTYNGSADQLQSFLRAIHFAKRMFPD FT DQEGLLVDFIYTRLSGKAETGIKPNLTTVQQVAEDIKSRCEEKVEPSKVIA FT NMKSLKTKDTKTLCKEIEELTEKLKVLYLQKQIPTEVANEFAIREGINSLI FT DKVSNREAKTALIIGKFTTINEATKVVDECESRQDKAQVLAYQGSYNNSRF FT NNYRQQNRNYQYNNRPNSNQSYRRNDNYRNNNNFNNRNYNSDGYNSNQRNN FT NRSGNNRDNQNQRYNNNGYQNRPNRSPRAITNRETGQQSRNVYHTTAQVQE FT DENNFLDQKESTQTLDQYTR" FT CDS 2000..5464 FT /product="GYPSY12_AG2p" FT /translation="FFRPKGINSDSRPIYSLDLNGVDYIKIKLSFANTETS FT ILLVDTGASVSLFKSSKLKKNHSPIRSNSISLTGISNTPIYSKGITTCTIF FT FNNLELEHDFVLVPDEFNIGADGILGRDFYKLYRCSINYELLLLTFTCQGE FT EIQHNIEEDDGKGFILPIRSEVVRRIYLPNITEDTIVFAQEIQPGVFCGST FT IISKDNQVVKFINTKQRNIYITHAEFKPITEPLSNYEAKQVNNKAGEVNND FT RLQKLLQKIKIDKIPTSEIYNLRKIVTEYNDIFCVEDDPITTNNFYPQKIE FT LKDNIPTYIPNYKQIYSQTDEIQNQVDKMLKNDIIEHSVSPYNSPILLVPK FT KSTDGNKKWRLVVDFRQLNKKVIPDKFPLPRIDSILDQLGRAKYFSTLDLM FT SGFHQIPLENDSRKFTAFSTGSGHYQFKRMPFGLNISPNSFQRMMAIAMAG FT LTPELAFVYIDDIIVTGCSARQHISNIVKVFDRLRHYNLKLNPEKCSFFKT FT EVTYLGHKITDKGIYPDDSKFETIRNFPIPKNADEVRRFVAFCNYYRKFVH FT NFANTAKPLNNLIKKKVKFIWTDECQNAFNSLKQSLLSPTVLKYPDFKKEF FT ILTTDASDVACGAVLSQITDGEDHPIAYASKSFTPGEKNKPIIEKELTAIH FT WAINYFKPYLYGRKFTVKTDHRPLVYLFGMKNPTSKLTRMRLDLEEFDFKI FT EFLAGKTNVVADALSRIVTDSDELKASIPKNKTDLANPILLVNTRAMTRKN FT ITADKKEKEKDKEETKVKYDQTNMYETDRPSETTKMLKMKSNIIENEEFIE FT LVIYNHNYYKALGKFKIPITSAKQSQTLEFVLHESCLIARKYHKNLAIASN FT DKLFEFYSMSTIKEIINKMITDVHVIVYTQPKWIEEKEEQFQIMSNFHTTP FT VGGHLGQFKLYSKIKDKYKWKNMKADIIKYVKSCKACATNKILKHTKEETV FT VTTTPSKPFNIITIDTVGPLPKTANNNRYAVTIQCELSKYIVIIPIQNKEA FT NTIAKALVENFILTFGNFLEMRSDQGLEYNNEILTQISKILEIKQTFAAAY FT HPQTIGALERNHRSLNEYLRSYTNEHHDDWDQWTKFYEFVYNTSVHSITNL FT TPFELVFGRQANLPQELYKTKVDPVYNIEQYYNEMKFKLQKAHAIAK" XX SQ Sequence 5759 BP; 2423 A; 999 C; 912 G; 1425 T; 0 other; tggcgaccgt gaattttaag ctgcaatctt cggatgtgca aaaaaaaaag taatgaatga 60 aaccccaaac acggacaaaa 
agtgcaaagt ggaaacgttt tataaaaatc gcaagtgctt 120 ctgaaatgac ttggaaagtg attaattacc aacaatagta aactgcgagt gaaaaccaat 180 cattttcaaa atgggaggca aggctgcaaa acctgaaaca aatatcaaag gagaccatga 240 tcttacaata gtccaaactc agaatattca tacagaatat catctgactc aggatttaaa 300 actaaacatt attttagggc tgctaatcac cctgtgcatt gttaaaatag cgaaaacttg 360 ttacaaacac cttcgtaacc aagcgcaaaa acacgcttta aaagtgctta cgctaccaaa 420 gtagcaacgt aaacattgaa tcgagaacag tgaatgaatg atatacgaaa aaggtgatat 480 gctatttgac ccacaaaaat aggtaaaggc tgtggaaaag taccgtaagt tcaccaatgc 540 agctattgac cgcgattatg aaaaaaaaac aattatgcgc gtatgagaac ctaccgcaac 600 gtgtaaaaac actgtgacca gcggtatggt ttggaacagc cgtgcccaac gcggaagacg 660 agaagaataa ggcaagaatg gacatacccc gtcccaaccc ggacgagaca tagagactga 720 tgatgagatc gagacatcgg gcatgactcc agctacgaga actccaaggt actgcataac 780 ccttaattta taaattaata tgatgaagaa tgaaacaaaa aatagaacta ctagtagaaa 840 gattaaaaca actgcatgat aatctaaaac acatagacag atgctacaga cagtgtgcaa 900 taagcacata tgaggaaagc gctaaggaaa cttttaaaaa gctacaagaa aagttagtta 960 aatatgagaa tgaaataagt gaagaagaac taacgtacac atctaaaatt gctagaacac 1020 tatatagcga cataactaaa ttcatagcaa ttcacaaaca aaagtttaca gtttcaatta 1080 ataacgttag tttaggaacc caaacaatgg cgacgtttaa cctaaaagaa gcttcggctg 1140 tagtaccgac atataatgga tccgcagacc aattacaatc cttccttagg gcaatacatt 1200 ttgcaaagag aatgttcccg gatgatcaag agggactatt ggtagacttc atttacacta 1260 gattatctgg taaggcagaa accggaataa aacccaacct taccacagta cagcaagtag 1320 cagaggatat aaaatcacgc tgtgaagaaa aggtagaacc aagcaaagta atagctaaca 1380 tgaaatcatt aaaaactaag gatacaaaga cactttgcaa agaaatagaa gagcttacag 1440 aaaaattgaa ggttctctac cttcaaaaac aaattccaac cgaagtagca aatgaattcg 1500 caattcggga aggaataaat tcattaatag ataaagtgtc aaatcgagaa gctaaaacag 1560 ctctcatcat tggaaaattc acaactataa acgaagctac taaagtagta gatgaatgcg 1620 aatctcgaca agacaaagca caagtacttg cttaccaggg ttcatacaac aatagtagat 1680 tcaataatta tagacaacag aacaggaatt atcagtataa taaccgtcct aactcaaacc 1740 aatcatacag acgcaatgac aattacagaa acaacaacaa 
tttcaataac cgcaattata 1800 acagtgatgg ctacaacagc aatcaaagaa ataataaccg cagtggcaat aatagagaca 1860 accaaaatca aagatacaat aataatggct atcaaaacag gccaaataga agtcctaggg 1920 ctataacgaa tcgagaaaca gggcaacaat ccagaaacgt atatcatact acagctcagg 1980 tacaggagga tgagaataat tttttagacc aaaaggaatc aactcagact ctagaccaat 2040 atactcgcta gatttgaatg gagttgatta cataaaaata aaattgagtt ttgctaatac 2100 cgaaacatct atattattag tcgacacagg agcatcagta tcattattca aatcaagcaa 2160 attaaagaaa aatcacagtc caataagatc aaattcgatt tcattgacag gcatttctaa 2220 cacaccaata tattctaagg gtattacaac ttgtactatt ttttttaaca atttagaatt 2280 ggaacatgac tttgtattag ttccagatga atttaacata ggagcagacg gtatattagg 2340 tagagatttt tacaaacttt acagatgttc tattaactat gaattactat tgcttacatt 2400 cacatgccaa ggagaggaaa ttcaacataa tattgaggaa gacgacggaa aaggatttat 2460 tttacctatc agaagtgaag tagtacgaag aatatacctt ccaaatataa ctgaggacac 2520 catagtattc gctcaagaaa tccaaccagg agtattttgt ggtagtacaa tcatttctaa 2580 agataatcaa gtagttaaat tcatcaacac aaagcaacga aatatctaca taactcatgc 2640 agaattcaaa cctattacag aaccattatc gaattatgaa gctaaacaag taaataacaa 2700 agcaggagaa gtaaacaatg atagactgca aaaactttta caaaaaatta aaatagataa 2760 aattcccacc tcggaaattt acaatctaag aaaaattgtt acagaataca acgatatttt 2820 ttgcgtagaa gatgatccaa ttactacaaa caatttttac cctcagaaaa tcgaattaaa 2880 agacaatatc ccaacttaca taccaaacta taaacaaatt tattcacaaa cagacgaaat 2940 acaaaatcaa gtagacaaaa tgcttaagaa tgatataatt gaacattcag tttcaccata 3000 taattcacca atcttacttg ttccaaagaa atcaacagat ggtaataaaa aatggagact 3060 tgttgtagat ttcagacagt taaacaaaaa ggttatacca gataaatttc cattgccaag 3120 aattgactca atactagatc aactcggcag ggcaaaatat tttagcaccc ttgacctcat 3180 gtcaggattt caccaaatac ctctagaaaa tgattcaaga aaatttacag ctttttcaac 3240 cggatcaggg cattaccaat ttaaacgtat gccgttcggt ttaaacatta gccccaacag 3300 ttttcaacgt atgatggcta tcgctatggc aggattaact cctgagctag catttgtata 3360 tatagacgat ataatcgtta ctggctgcag tgcacggcag cacatcagta acatagttaa 3420 ggtttttgat agattaagac attacaattt aaaattaaat ccagaaaaat 
gttcattttt 3480 taaaacagaa gttacctatt taggtcataa aataacagat aagggaatat atccagacga 3540 ttctaagttc gaaacaatta gaaattttcc aatacctaaa aatgcagacg aagtacgaag 3600 atttgtcgca ttttgtaatt attatcgtaa gtttgtacat aatttcgcta atactgctaa 3660 acctttaaat aatctaatta agaaaaaagt aaaatttatt tggacagacg aatgtcaaaa 3720 cgcatttaat agcttaaaac aaagtctttt atcacctaca gttttaaaat atccagattt 3780 taagaaagag tttatattaa caacagacgc ttctgatgta gcatgtggag cagttctttc 3840 acaaataaca gatggagaag accacccaat agcatatgca agcaaaagct tcacaccagg 3900 agaaaaaaat aagcctatta ttgaaaaaga attgacagca attcattggg caattaatta 3960 tttcaaacca tatctatacg gtagaaaatt tactgtaaaa acagatcata gaccattagt 4020 atatttgttt ggtatgaaga atccaacatc aaaactaact agaatgagat tagatttaga 4080 agagtttgat tttaaaatag aatttttagc aggaaaaact aatgtagtag cagatgcgct 4140 atccagaatc gtaacggatt ctgacgaact taaagcatct atccctaaaa ataagacgga 4200 tttagctaat cctattttat tagtgaatac tagagcaatg acaaggaaaa atataactgc 4260 agataaaaaa gaaaaagaga aagacaagga agaaacgaaa gtgaaatatg atcaaactaa 4320 tatgtatgaa acagataggc catctgaaac aactaaaatg ttgaaaatga aatcaaatat 4380 tattgaaaat gaagaattta ttgagctcgt gatatacaat cacaattatt acaaagcgct 4440 gggaaaattt aaaataccca taacttctgc gaagcaaagt caaacactag agtttgtact 4500 gcatgaatca tgcctaatcg ctagaaagta tcacaagaat ttagcaattg catcgaatga 4560 caagttattc gaattttatt caatgtcgac cataaaagaa attattaaca aaatgataac 4620 agacgttcat gtcatcgtgt atacacaacc taaatggata gaagagaaag aagaacaatt 4680 tcaaataatg tccaatttcc acacaacacc agtaggaggt catctaggac aattcaaact 4740 atatagtaaa ataaaagata aatacaaatg gaaaaatatg aaagcggata tcatcaagta 4800 tgtcaaaagt tgtaaagcat gcgcaactaa caagatccta aaacatacta aagaggaaac 4860 tgttgtgacg acaacaccct ctaaaccttt taacatcata acaattgata ccgtaggacc 4920 gctaccaaaa acagcaaaca ataatcgata cgcagttacc atacaatgcg aattatcgaa 4980 atatatcgta ataatcccga ttcaaaacaa agaagcaaac actatagcaa aagctttagt 5040 agaaaatttc attcttacat ttggaaactt tttagaaatg agatcagatc aaggacttga 5100 atacaataat gaaattttaa cccaaatatc aaaaatatta gaaatcaaac aaacattcgc 
5160 agcagcatat catccccaaa caataggagc tttagaacga aaccacagaa gcttaaacga 5220 atacttacga agttatacca atgaacacca tgacgattgg gatcaatgga ctaaattcta 5280 tgaatttgtt tacaacactt cagtacacag cataactaac ctcacaccct tcgaattagt 5340 atttggaaga caagcaaatt taccacaaga actatacaaa acaaaagtag acccagtgta 5400 caatatagaa caatactaca atgaaatgaa atttaaatta caaaaagcac atgcaatagc 5460 caaaaacaaa ctaatcgcat caaaaataca acgtcaagcc aaacttaatg aaaatctaaa 5520 caaattaaac atacgcatag gagattacgt ttacctaaca aacgaaaata gaaaaaaatt 5580 agatccagtc tacataggac catttacaat cgtagaaatc acggatacaa attgcgttat 5640 aaaacataat cagacaggaa aaatcacaac ggtacataaa aaccgattaa aacagtttta 5700 gtgaataatg cactcattcg taaaaaactc aatcgtacat tattcaaaaa gggtggagg 5759 // ID GYPSY49-I_AG repbase; DNA; ANG; 5457 BP. XX AC . XX DT 16-APR-2004 (Rel. 9.03, Created) DT 16-APR-2004 (Rel. 9.03, Last updated, Version 1) XX DE GYPSY49-I_AG is an internal portion of retrotransposon GYPSY49_AG DE - a consensus sequence. XX KW Gypsy; LTR Retrotransposon; Transposable Element; 4-bp TSD gag; KW AP protease; CsRn1 lineage; GYPSY49-I_AG; GYPSY49-LTR_AG; KW Gypsy clade; RNase-H; reverse transcriptase; KW integrase GYPSY49_AG. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-5457 RA Tubio M.J., Costas C.J. and Naveira F.H.; RT "GYPSY49_AG, a member of the CsRn1 lineage of the Ty3/gypsy group RT of LTR retrotransposons in Anopheles gambiae."; RL Repbase Reports 4(3), 90-90 (2004). XX DR [1] (Consensus) XX CC GYPSY49_AG is a family of Gypsy-like LTR retrotransposons CC that, according to the amino acid sequence of its Reverse CC Transcriptase, CC RNase and Integrase is CC phylogenetically grouped with representatives of the CsRn1 CC lineage of other organisms. CC GYPSY48_AG, GYPSY50_AG, GYPSY51_AG, GYPSY52_AG and GYPSY53_AG CC are other members of this same lineage in Anopheles gambiae. 
CC The GYPSY49-I_AG consensus was reconstructed after multiple CC alignment of 4 copies. CC The consensus encodes the 294-aa GYPSY49_AG1p gag-like CC polyprotein (pos. 1438-2319) and the 1034-aa GYPSY49_AG2p CC pol-like polyprotein (pos. 2277-5378). CC The sequence of the LTRs flanking GYPSY49-I_AG is deposited as CC GYPSY49-LTR_AG. XX FH Key Location/Qualifiers FT CDS 1438..2319 FT /product="GYPSY49_AG1p" FT /translation="MSLKAGAEGKGEESSSSAGTTKVSAPGDPPLGVERIS FT TKLPTFWEDVPEVWFAQAEAEFELSKISRERTQYTYLIAAMSKEVLAKVLD FT IVKSPDPVKPYSHLKGEILKRLTSSEETRLSKLLYHVEIGDRTPSDFYRYM FT LQIAGDSTDLSVTLVRKLWKSRLPKSIEVALVAVDTKDHAEQLRIADRLWE FT CTQSSGLSEVNTRARVTCDSASDEIRREISELRDMIKKMSNLSQRQHRDHR FT RSFSRERRNFQSPSRDRKPFCWYHYRYGNGATKCVKPCNFSTTRVKHSEEE FT KN" FT CDS 2277..5378 FT /product="GYPSY49_AG2p" FT /translation="FFHNTSQTLRGRKKLRSSRELADLPRAYSTSPRLFVN FT DMMTSFTFLVDTGADVSVIPFHLAGKAYKPTDLVLQAANGSTIVTYGTKLI FT QVSLGFKRVFTHIFIIASVNRPIIGADFLYKTGILVDIKGRKLIDSTTNLA FT IIGSLALVDTPSPKHFVLEAGDFGRVLHQFPSLVEPPDYRKPVCHNVFHYI FT ETKGPLPFAKPRRLDPQKRKIAHVEFQQMVNLGICRPSSSPGSSPLHMVPK FT GDNDWRPCGDYRRLNAITTPDRYPIPHIQDFTMQLQGCNFFSKLDIVRAYH FT MIPVAEEDIHKTAITTPFGMFEFLRMPFGLKNAGQTFQRFMNSLFNDLSFV FT FVYIDDILIASSTKETHLEHLRIVFNRLAEYDLKIKPSKCVLGVSTINFLS FT HTISANGIIPCADKVEAIKTFPTPKSVKQLQRFVGMVNYYHRFLPSVSNTL FT RPLYKLIAQHTKQKLKTFDWSEECDKAALQVKSDLAKATILSHPRSDAVYS FT LTTDASNFAVGAVLEQHFEGSCQPLAFFSRVITPTEQRYSTFDRELLAIVL FT SIKHFRHFLEGRSFIIYTDHKPLTTALTSKTEKSPRQNRHLDFISQFSSDI FT RYIKGDSNIVADTLSRVAETDSISNLDYKIFSECQRTDEQLKTLLENKKNK FT NSAYKLDKLLLNNTELYFEVSTGKNRLYVPKSLRKETYDGVHNASHPGIKA FT SRKMMTDRYFWPAINKDVANWTRACSACQRAKVVRHTKSPLESFSLPKGRF FT DHIHMDIVGPLPPSEDKIYLLTVVDRFTRWPECYPLTNISAATVAKTFVEQ FT YVSRFGCPLRITTDRGRQFTSRLFEELTKLLGTHHIATTAYHPQANGMVER FT FHRRLKEALKASGNSPRWTVRLPLILLGIRLSYKDEIRGCPAEMVYGQSLR FT LPAEVFVPSSDKIDDDTSDLIVDLKKAFKSVEPSSPKDNNNTNNYIPKALE FT NCRQVFVRVDKVKTGLQAPYDGPYTVVRRFRKFFVIQMKDRNESISIDRLK FT PAFECEAHTKATKPLERSPVTKKHVHFR" XX SQ Sequence 5457 BP; 1603 A; 1157 C; 1186 G; 1511 T; 0 other; 
ttggtgaccc cgacgtgata aaaatcgcgg aacgtgaaaa attgctgacg tgatcaatcg 60 cccaggagaa cattctccgt cgccgtcatc ggttaagtat ttgccgccga cgtaatccat 120 tatcccggta catcgctatc ccgtgtgctt gaagtgcatt ttggagtgaa ggagaaatct 180 accaccgacg cgatttgtgg tttcgccgta tccagtagga taatattaca caaccacgat 240 tgttcggtcg atcgactact attcaccgtg attgccctgc tgaagcccca tccgttttcg 300 gagccacgac gtcccgtaag agtacttttt tttctcgggc ccacgtcatc gctgtattga 360 aggagtaccc tggggttttc ggccagcatc atcgttcccg gagccgtaga gcatttgaca 420 tcagtagata caaagtgcca ccaccgtgcg ttgtggtcac gagataagca gaacgatcac 480 atatatcacg tttgatactg tgatcacgga gagcccattg tcatcgtgtt tttcatctgc 540 aagagctaca agttcgacaa cgtgtccgtg taagcacgga aaccacacga ccaaggggta 600 gagtggaagc cacccgattt cccgtaagga gcatcactgt cgctgtttgt gaccagaaac 660 accaccgtgg aagcagtagt gccatcgtgt atttggagcc acgcggtttt ggaagcagta 720 acgccatcgc gtttttggag ccacgcggtg ttggaagcag tagcgccatc gcgtttttgg 780 agccacgcgg tgttggaagc agtagcgcca tcgcgttttc ggagccacgc ggtgttggaa 840 gcagttgtag cgccatcgca ttttggaagc agtagcgcca tcgcgttttg gaagtagtag 900 cgccatcgcg ttttcggagc cacgcggtgt tgaaagcggc ctcgtatcca gtgatcatcg 960 atcgacaacc ggtgagcccg ttccttatta ttattattat tattattatt tttcattcgg 1020 tgatacttgt gccagaaact cagtgtgttt ttatgcgttt tatacgacgc gtgttttttt 1080 ttctcttttt cggtgcgttt ataagtaaat atttcattaa gtgcaacgtt gttttacgag 1140 ttttacgaga aaatacgtgc aatttcttac gtgcaatttc attagtacat ttttatcgta 1200 tacaatttag cgcagtattt ttttttttca cattttcttc gttataaatt atactgaaat 1260 actacgcgtt ttgaattttt ccttgcattt tataaaaacc ttgtatttat tgaactgaat 1320 tttgcgatca gtattatagg tcatttttat ccttggtacg tttaaaaatt taagtgcgtg 1380 tagtgaatta gaaaacaggt tagatttctt tgtattggtt ttgtaaacca cagagtgatg 1440 tctttgaaag ctggtgccga aggaaagggt gaggagagtt caagttcggc aggaacgaca 1500 aaggtgtcgg caccgggcga tcctcccctg ggggtagagc gtatatctac gaagttacca 1560 actttttggg aggacgttcc agaggtttgg ttcgctcaag cagaagcgga atttgagctt 1620 tcaaaaattt cacgggagcg cacgcaatac acatatttga tcgctgcaat gtccaaggaa 1680 gtattagcta aggtgctgga 
cattgtgaaa agtcctgacc cggtaaaacc atactctcac 1740 ttaaagggag aaattttaaa acggttgaca agtagtgagg aaacacgatt gtcaaaattg 1800 ttatatcatg ttgagatagg agatagaact ccatcagatt tctatcgata catgctacag 1860 atagctgggg attctaccga tctttccgta actcttgtga ggaaattgtg gaagtcaaga 1920 ctgcctaaat cgattgaggt agccttagtt gcagttgata caaaagatca cgctgagcaa 1980 ttgcggatag cagacaggtt gtgggaatgc acacaatcta gtggactatc cgaagtgaat 2040 actcgtgcta gagttacatg cgattcagcg tctgatgaaa tccggcgtga aataagcgaa 2100 ctacgcgaca tgatcaagaa gatgtcaaat ctttcccaac gacaacatcg tgatcataga 2160 cgcagcttct ctagagaacg aagaaatttc caaagtccat ctcgtgatag gaaaccattt 2220 tgttggtatc attacagata cggtaacgga gcaaccaagt gtgttaagcc ctgtaatttt 2280 tccacaacac gagtcaaaca ctcagaggaa gaaaaaaact aagaagctca cgggagttgg 2340 cggatctccc tagagcatat agcacttctc ctcgtttgtt cgtaaatgat atgatgacat 2400 catttacctt tttagtggac acaggggctg atgtatccgt cattccattt catttagctg 2460 gtaaagctta taaacctaca gacttagttt tgcaagcagc caacggaagt accatagtga 2520 cgtatggcac gaagctgatt caagtaagtt taggtttcaa aagagttttt acccatattt 2580 ttatcatagc atccgtaaat cgtcctataa taggagcgga ttttttatac aaaaccggaa 2640 tactagtgga tatcaaaggg cgtaagctaa ttgattcgac cactaattta gcgatcatag 2700 gatcattggc acttgtggat actccttccc ctaaacattt tgtacttgaa gcaggagact 2760 tcggacgagt tctacaccag tttccatctt tggtagaacc accggactac agaaagccag 2820 tatgtcataa tgtgtttcat tatattgaaa ccaaaggccc tttgcctttt gcgaagccac 2880 gtcgccttga cccgcaaaag cgaaaaatcg cgcatgtgga atttcagcaa atggtaaatt 2940 tgggaatttg tcgcccatca tcatcaccag ggtcatcacc cttgcatatg gtgccgaagg 3000 gtgacaatga ctggagacca tgtggggatt accgaaggct taatgccatt acgacaccag 3060 atcggtatcc tatccctcac atccaggact ttaccatgca actgcagggt tgcaattttt 3120 tttcaaagct tgacatagta cgagcatatc atatgatccc agtagcagaa gaggacatac 3180 ataagacggc cattactacc cctttcggca tgttcgaatt tttacgaatg ccttttggtt 3240 taaaaaatgc cggtcaaaca ttccagaggt tcatgaacag tttattcaat gacctcagtt 3300 tcgtattcgt ctacattgac gatattctaa ttgccagttc gacgaaggaa acgcaccttg 3360 agcacttgcg catagtgttc aatcggctgg 
cagaatacga tttaaaaatt aaaccgtcca 3420 agtgtgtgct aggtgtatct accattaatt tcttaagcca cacaatttca gctaatggaa 3480 ttatcccatg cgcggataag gtagaagcga taaaaacatt cccaacgcca aaatcggtta 3540 aacaactaca gcgatttgtg ggaatggtca actattatca ccgtttctta ccgagcgttt 3600 caaatacgtt gagacccttg tacaagttga tagcacagca caccaagcag aaattaaaga 3660 cattcgactg gtcggaggag tgcgacaaag cggcccttca ggtcaagtca gacctagcaa 3720 aagctacgat actgtcacac ccaaggtccg atgcagttta ttctttaacc acggacgcat 3780 ctaatttcgc cgtgggagcg gtgctggaac agcattttga aggaagttgc caaccactag 3840 cattcttctc tagagtaatc accccaaccg aacaaagata ctctacgttc gatagagaat 3900 tattggcaat cgtgctgagc atcaagcatt ttagacattt tttagaaggc aggtcattta 3960 ttatatatac tgaccataaa ccgctcacca cagcgctgac gtcaaaaacc gagaaatcac 4020 ctcgacaaaa tcgtcattta gatttcattt ctcagttctc aagcgacatt cgatacatca 4080 aaggagacag caacatagtt gcagatactc tctcgcgtgt ggcagaaacg gattcaatta 4140 gtaacttgga ttacaaaatt ttctctgaat gtcaacgtac tgatgaacaa ttgaaaacat 4200 tattggagaa taaaaaaaat aaaaactctg cttataaact tgataagctt ttgctaaaca 4260 acacagaatt gtattttgag gtctcaacgg ggaaaaatag actgtacgtc ccgaaatcct 4320 tacgtaagga aacatacgat ggtgtgcaca atgcctccca tcctggcata aaagcatcta 4380 ggaaaatgat gactgatcga tatttttggc cagccataaa taaagatgta gctaattgga 4440 ctcgcgcatg ttctgcttgc caaagagcta aagtagttag gcacacaaaa tcacctttgg 4500 aatctttttc acttccaaag ggaaggtttg accatattca tatggacatt gttggaccct 4560 tacctccatc tgaggacaaa atatatttat taacagtagt agatcgtttc accagatggc 4620 cagagtgtta tcctctaact aacatctctg cggctactgt tgccaaaact ttcgtggaac 4680 agtatgtgtc taggtttggt tgtccgttaa gaattacgac agatagagga agacaattca 4740 cttcaagatt attcgaggag ttgaccaaac tattgggcac acaccatatt gcaaccactg 4800 cctatcatcc acaagccaac ggtatggtgg aacgtttcca ccgtcggcta aaggaagcct 4860 taaaagctag tgggaactca ccacgatgga cagttcggtt accattaata cttctcggca 4920 ttcgtctttc ttataaagat gaaatcagag gttgcccagc agaaatggtc tacggtcaaa 4980 gtctccgatt acctgcagaa gtttttgtac catctagcga caaaatcgat gatgatacat 5040 ctgacttaat cgttgattta aagaaagcat tcaaatccgt 
agagccatcg agccctaaag 5100 ataacaataa tacgaacaac tacataccta aagccctaga aaattgtaga caagtatttg 5160 tcagggttga taaggtcaaa acaggacttc aagctcccta tgacggtcca tataccgttg 5220 ttcgaagatt cagaaaattt tttgtcattc aaatgaagga taggaatgaa tcaatttcta 5280 ttgatcgact caaacccgct tttgagtgtg aagctcatac taaggcaacc aagccgctgg 5340 aaaggtcccc cgtaacgaaa aaacatgttc attttcgcta agaataagac cgcctagtcc 5400 gcaataagcg tttgacggca gatataaaag cgtgactacg tcactggagg gggatcg 5457 // ID GYPSY5-LTR_AG repbase; DNA; ANG; 226 BP. XX AC . XX DT 08-MAY-2003 (Rel. 8.04, Created) DT 08-MAY-2003 (Rel. 8.04, Last updated, Version 1) XX DE GYPSY5-LTR_AG is an LTR of the GYPSY5_AG LTR retrotransposon - a DE consensus sequence. XX KW Gypsy; LTR Retrotransposon; Transposable Element; 4-bp TSD; KW GYPSY5-I_AG; GYPSY5-LTR_AG; GYPSY5_AG; Gypsy clade. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-226 RA Pavlicek A., Kapitonov V.V. and Jurka J.; RT "GYPSY5_AG, a family of LTR retrotransposons from African malaria RT mosquito."; RL Repbase Reports 3(4), 80-80 (2003). XX DR [1] (Consensus) XX CC GYPSY5-LTR is a long terminal repeat of GYPSY5_AG (its internal CC portion is deposited as GYPSY5-I_AG). XX SQ Sequence 226 BP; 76 A; 47 C; 32 G; 71 T; 0 other; tgtagaaatt agaaattcga atttgtatat acctcacaca tacaccttga catccttaca 60 catacacctt ttctatatgt acatcttgtt tgtgcgatca cacctaggtt taagtatctc 120 aggcaatact gtcaaataga gatgtcattc gtgattgagc gctaaaacag aaaacacgca 180 attcttcagc cttcagttaa ttcttccgaa attaagaaag attaca 226 // ID COPIA4-I_AG repbase; DNA; ANG; 4122 BP. XX AC . XX DT 03-APR-2003 (Rel. 8.03, Created) DT 03-APR-2003 (Rel. 8.03, Last updated, Version 1) XX DE COPIA4-I_AG is an internal portion of the COPIA4_AG LTR DE retrotransposon - a consensus sequence. 
XX KW Copia; LTR Retrotransposon; Transposable Element; 5-bp TSD; KW COPIA4-I_AG; COPIA4-LTR_AG; COPIA4_AG; Copia clade; integrase; KW reverse transcriptase. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-4122 RA Pavlicek A., Kapitonov V.V. and Jurka J.; RT "COPIA4_AG, a family of autonomous, copia-like LTR RT retrotransposons from African malaria mosquito."; RL Repbase Reports 3(3), 53-53 (2003). XX DR [1] (Consensus) XX CC COPIA4_AG is a young family of Copia-like LTR retrotransposons. CC COPIA4-I_AG, an internal portion of COPIA4_AG is flanked by >99% CC identical COPIA4-LTR_AG LTRs. The consensus was reconstructed CC from 6, CC >98% identical, copies of COPIA4-AG internal sequence. CC The consensus sequence encodes the 1322-aa COPIA4_AGp protein CC (positions 112-4077). XX FH Key Location/Qualifiers FT CDS 112..4077 FT /product="COPIA4_AGp" FT /translation="MSTEEANSAGPSRESSAGAAALSGHAVGSNNLAMNVG FT IEKLKGRENYVSWAFAMKMMLCRERCWDIVTARDDKAVDKDMDMRALSTIA FT LSLEKHNYSLVMDANTAKEAWEKLKAAFTDDGVFRRISLLQELVSFKLNNF FT SSTEAYVDAIMSTCHKLREIGFEVSDIWVSSILLMGLPKYYAPMVMGLEAS FT GMAMKADAIKLKILQEVKTTCHKDDEALFSRGNPNLRKGGGAMKRSTKEVT FT CYNCQKLGHFAINCPEKQKHKKNNKTRAMSSVLAMGDVSECEWYFDSGASS FT HMAKSGVDFSERQHICHEVSTANNASMKAITKGTVSVNCQEGAVNLLNVLE FT VPDLATNLLSVSKICKNGFKVVFTERKCEVFDENGEVFASGIAENGLYRLN FT ENRVRTFLSYEIWHRRLGHLNFQSIQRLKGMADGIQTKQTNTYNCVACIEG FT KHARESFPTSQERCKEKLELIHSDLCGPFEVESIGGSKYFMTFIDDATRKV FT FVYMLKSKDEAKTVYEKFKSMVQRQSGRQIKLFRSDNGREYVNASMKASME FT RDGICHQTTCTYTPEQNGVAERMNRTIVEKVRSMLNDAQLPKRFWAEAVNT FT AVYLINRSPTRALNDITPEEAWSGKRPHLGHLKIFGSTVMVHKPKQKRVKL FT DPKSERCIFLGYAHNTKGFRVFNVATNEIIISRDIIVVDEGQCEGFGKEQT FT TPVEFLELLFAEGKDESNSTRNNPINISPTEEASDGQTEETPTRFDQSNET FT PRRSQRQHKLPSKYKDYVINRKFVPSSTLANEAENVSSDSDYTTPESESDE FT ALVVFSQREDPRNYAEAMKSEDAKQWMDAIQEELQSIEANNTWSLVDLPPG FT 
RKAIGSKWVFKTKRDVDGNLLRYKARVVAQGFSQQFGTDYDEVFAPVVKQT FT TFRVLMGIAAKRGMAVKQYDIKTAFLYGDLEEEIFMKVPQGVKVEDNKVCR FT LKKGLYGLKQSARSWNQRLDQELKRQGYTNCLADSCLYRKRCGKEWCYVLV FT YVDDLIVAGDNLDMIESLLAELKKSFEVNILGDIRFFLGIEVEKNKQGDYF FT VNQRNYIKDVIISSGLTDAKPSSIPLDPGYIKIEAEEIELSDNKEYQQLIG FT KLLYIAINTRPDISAAVSILSQKISKPTQRDWCELKRVVRYLKGTINYRLR FT LSEKGCDNGIIGYCDSDWAENRIDRKSNSGYVFKVNGGTVSWTCRKQSCVT FT LSTAEAEFVAISEGIQEALWLKLLLEELNDVQEVIIHEDNQSCLKILSGEK FT LSNRTKHIATRYHFTKDLIKKGQISCVYCSTEEMIADLLTKPLARIRIQKL FT VSLIGLSVSL" XX SQ Sequence 4122 BP; 1415 A; 674 C; 1007 G; 1026 T; 0 other; ggttatgggc ccagacccaa tccaaaatta agtaattaat attctatcga agataaaagt 60 gccacacttc acatcacccg aacaaagagt gaaattttgt gtacgagaga aatgagcacg 120 gaggaagcga attccgcagg accgagcaga gagagcagcg caggagcagc agcactaagc 180 ggccacgcgg tcgggagcaa caacctcgca atgaacgtag gtatcgaaaa gttgaaagga 240 cgggagaact acgtttcgtg ggccttcgcg atgaagatga tgttgtgtag agaacggtgc 300 tgggatattg taacggctag agacgataag gcggtagaca aagacatgga catgcgggca 360 ttgtccacta tcgcgcttag cttagaaaag cacaattaca gccttgtgat ggacgctaat 420 acagcaaagg aagcttggga aaagcttaaa gctgcgttca ctgatgatgg tgtgtttaga 480 cgtatctctt tattgcaaga gcttgtttcg ttcaaattaa ataatttttc ttcgaccgaa 540 gcatatgttg atgcaataat gtctacgtgc cacaagctta gagaaatagg cttcgaagtg 600 agtgatattt gggtttcgtc aatcctgctg atgggattac ctaaatatta cgcgcccatg 660 gttatgggat tggaagcatc gggaatggcc atgaaggcag atgcgataaa gttaaaaatt 720 ttgcaagaag tgaaaaccac gtgccacaaa gatgatgaag cgctctttag tagaggtaat 780 cctaacctgc gtaaaggggg aggagcaatg aagagaagca caaaagaggt tacatgctac 840 aattgccaaa agctagggca ttttgcgatt aattgtcctg aaaagcagaa gcacaagaaa 900 aataataaaa cacgtgcaat gagttctgtg ttggcaatgg gtgatgtgag tgaatgtgaa 960 tggtactttg attcaggagc aagttcgcac atggcaaaat caggtgtaga tttctcggaa 1020 agacaacaca tatgtcacga ggttagtacg gcaaacaatg ctagcatgaa agctattacg 1080 aagggcacag tttccgtaaa ttgccaagaa ggtgcggtaa atttattaaa tgtgttagaa 1140 gtaccagact tagctacaaa tttactatca gttagtaaaa tatgtaaaaa tggcttcaaa 1200 gtggtattta cagaacgtaa 
atgcgaagtg tttgatgaaa atggagaagt gttcgcatcg 1260 ggtattgctg agaatggatt ataccgattg aatgaaaata gagtgagaac atttttatcc 1320 tatgaaatat ggcacaggcg actaggacat ttgaatttcc aaagtattca aagattaaaa 1380 ggcatggccg atggcattca aactaaacaa actaacacgt acaattgtgt agcgtgcatt 1440 gaaggcaaac atgcaagaga gtcgtttcct acaagccaag aaagatgcaa agaaaaatta 1500 gaactgattc attcagatct atgtggacca tttgaagttg aatcaattgg tgggtcaaaa 1560 tacttcatga ctttcataga tgatgcaaca cgtaaagtat ttgtgtatat gctcaagtct 1620 aaagatgaag caaagacagt atacgagaag tttaaatcga tggtacaaag gcaaagtggt 1680 cgacaaataa aattatttag aagtgacaac ggtcgtgaat atgtaaatgc cagtatgaaa 1740 gcgagtatgg aacgtgatgg aatatgtcat caaacgacat gcacgtacac tccagaacaa 1800 aatggagtag cggagcggat gaaccgcact attgttgaaa aggttcggag catgctaaac 1860 gatgcgcagt taccaaagcg attttgggcg gaagcagtta acactgctgt atatttaatt 1920 aatcggagtc ctacgagagc gttaaatgac attactccag aggaggcatg gtcaggcaaa 1980 agaccacatt tgggacatct caaaatattt ggctctacag ttatggtaca caagcctaag 2040 cagaaacgag taaaactcga tccaaaatcc gaacggtgca ttttccttgg ttatgcacac 2100 aacacaaaag gatttagagt ttttaacgtt gccaccaacg aaataattat cagtcgtgat 2160 attattgtcg ttgatgaagg tcaatgtgaa ggttttggca aggaacaaac aactcctgtt 2220 gagtttctgg aactgctttt tgctgaagga aaggatgagt caaatagtac acgtaataat 2280 ccgattaata tttcaccaac agaagaagca tcggatggtc agacagaaga aaccccaaca 2340 aggtttgatc agagcaatga aactccaagg cgcagtcaac gacaacacaa acttccaagc 2400 aagtacaaag attatgtcat taatcgcaaa tttgttcctt catcaacatt agctaacgaa 2460 gcagaaaacg tatcgagtga ctcggactat acgacaccag agagcgagtc tgatgaagcg 2520 ttagttgttt tctcgcagcg agaagatccc agaaactatg ctgaagctat gaagtctgaa 2580 gatgctaaac agtggatgga tgctatccaa gaagagcttc agtctattga ggctaacaat 2640 acatggtcac tggtagacct accaccggga cggaaggcga taggcagcaa gtgggtcttc 2700 aagactaaga gagatgtgga tggaaatttg ttgcgttaca aagctcgtgt tgttgcgcaa 2760 gggtttagtc agcagtttgg aactgactat gatgaggtat ttgcaccagt tgttaagcag 2820 acgactttcc gtgtgctaat ggggattgcg gcaaaaaggg gaatggcggt aaagcagtac 2880 gatattaaga ccgcgtttct gtacggcgat 
ttagaagaag aaatctttat gaaagttcca 2940 caaggtgtga aagtggaaga taacaaggtt tgtagattga agaaagggct atacggctta 3000 aaacaatcag caagatcgtg gaatcagaga cttgatcaag aactaaagcg ccagggatat 3060 acgaattgct tagcagacag ttgcttgtac aggaaaagat gcggaaagga atggtgctac 3120 gtcttagtat atgtagacga tttgatagtt gcgggagata atcttgacat gattgaatca 3180 ttgcttgctg aattgaaaaa gtcgtttgag gttaacattt tgggcgacat aagattcttt 3240 cttggaatag aagtagagaa aaataagcaa ggagattatt ttgttaatca gcgcaactac 3300 atcaaagatg taattatctc tagtggatta acagatgcaa agccttctag tattcctctt 3360 gatccagggt acataaagat cgaagcagaa gaaatcgaac tttctgataa taaggagtat 3420 cagcagttaa taggcaaatt gctttatatt gcgattaaca cgagaccaga tatatcggca 3480 gccgtatcaa tacttagcca aaagataagt aaacccacac aacgcgattg gtgtgagtta 3540 aagagagttg tgagatattt aaagggaacc attaactatc gtctacgatt gagcgaaaaa 3600 ggatgcgata atggtataat tggctattgt gactcagatt gggccgagaa cagaatagat 3660 agaaaatcca acagtggata tgtttttaaa gtaaacggcg gtacggttag ttggacttgt 3720 agaaagcaat catgcgtaac gttatcaaca gccgaagcag aatttgtcgc aatatcagaa 3780 gggatacagg aagcgctgtg gttgaaatta cttcttgagg aattaaacga tgtacaggaa 3840 gttattattc acgaagacaa ccagagctgt ttgaaaatct tatcaggcga aaagttgagt 3900 aatagaacta agcatattgc aacgcgctat cattttacta aggatctaat taagaaagga 3960 cagatcagct gcgtctattg ttcaacagaa gaaatgatcg cggatctatt aactaaacca 4020 ttagccagga ttagaataca gaagttagta agcttgatag ggttaagtgt ttcactgtga 4080 gatatacaac gacagcgtaa gggaacttgc gttgaggagg ag 4122 // ID Mariner-N16_AG repbase; DNA; ANG; 1701 BP. XX AC . XX DT 28-FEB-2009 (Rel. 14.02, Created) DT 01-MAR-2009 (Rel. 14.02, Last updated, Version 1) XX DE Putative nonautonomous Mariner DNA transposon - a consensus DE sequence. XX KW Mariner/Tc1; DNA transposon; Transposable Element; Nonautonomous; KW Mariner-N16_AG. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. 
XX RN [1] RP 1-1701 RA Jurka J.; RT "Putative mariner/Tc1-like DNA transposons from African malaria RT mosquito."; RL Repbase Reports 9(2), 641-641 (2009). XX DR [1] (Consensus) XX CC TA TSD. XX SQ Sequence 1701 BP; 569 A; 273 C; 290 G; 567 T; 2 other; tacagtggcc ggcaatataa agtggccact acttacaatt ttacattttt tacatttttt 60 ccgtgaaaaa tgaagttatt gtactgcttt attttttgaa acattgttcg tgatatgctt 120 aagaatgcgc agtaccatag catgacaaaa atgagaagtt tgcttcaaga aaaaatattt 180 tttccaaaat tgatcaaaat cactgtgcca aaataaagtg cccactttat tcattctaca 240 gaaaatattt ttctgaacaa atataacacc aaacttataa tttatatgcg ttcctctcgc 300 attctttaca tgttcgagga tctttggcgt attttttttc taatttttaa aggtttatct 360 agttctgcat ctagtttttg catctggttc ttctcaagcg ttatcaagag ctttaaattt 420 aaatttattt gtttgtgaca ccagttttat ccaccttcgc acatagaatc tgccacaaat 480 tctcgatgga accaagattt agacttcatg gaggacatgc aaggtgtttt atacgatact 540 agttgagaaa tgttttttgt attgattgtg tgtgttgaga tgttagcctg taggaacatg 600 tttttagttt tcaaacccca tttttttaaa agaaatatcc caaagtattg atgaagaatg 660 ttaaaaaagg tatcagccat agttgtactt tcgattcaca ccagttttcc cgctactcct 720 aaagataaac accccctagc tatcacattg caattttcgt aatttactgc ccgttagatg 780 tggcgtcctt caagttcctc gcccgagcta cagcaagact aacaaagcca attggcttca 840 atatgatcaa gaacatgcag atatatcgtt agattgttga aaaactgrta tttggacaga 900 tgagtccaaa ttagaactga tgttcaaaaa tatctagacg cgatccaaca aaccaatatg 960 tatgaagagt gcaatgtgat gaccagggag gggatttcta tacgagaagt ggggagcatg 1020 gtwtgaatcg acggcataat gaaagctaat acctatataa gcatcctgtg tgaatttatc 1080 tgttttctgt gaaaaatacg gagttagtaa ataaattgat gttacaacag gaagacgact 1140 caagttatac acgaagaatt ttttaatttt tttttttcaa tcaaatcgaa taaaacttct 1200 ttaatggctc ccacttagtc ttaactttaa tcctatcgag atttcgtgtg caaattttga 1260 tgcaaatgtg gtgagaatga cgtgattagg aaaaacaaaa tatgttaaag ctctgaaata 1320 ctcctaggaa gaacaaatgc caaatatgtg caaataaaaa caaaataaat tatgtcaaag 1380 atcctcgaac atgtaaagaa tgcgagagga acgcatataa attataagtt tgatgttata 1440 tttgttcaga aaaatatttt ctgtagaatg aataaagtgg gcactttatt 
ttggcacagt 1500 gattttgatc aattttggaa aaaatatttt ttcttgaagc aaacttctca tttttgtcat 1560 gctatggtac tgcgcattct taagcatatc acgaacaatg tttcaaaaaa taaagcagta 1620 caataacttc atttttcacg gaaaaaatgt aaaaaatgta aaattgtaag tagtggccac 1680 tttatattgc cggccactgt a 1701 // ID AgaP13MITE412 repbase; DNA; ANG; 412 BP. XX AC DQ301489; XX DT 22-AUG-2006 (Rel. 13.07, Created) DT 31-JUL-2008 (Rel. 13.07, Last updated, Version 1) XX DE Anopheles gambiae str. PEST clone AgaP13MITE412 P MITE, complete DE sequence. XX KW P; DNA transposon; Transposable Element; Nonautonomous; KW AgaP13MITE412. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-412 RA Quesneville H., Nouaud D. and Anxolabehere D.; RT "P elements and MITE relatives in the whole genome sequence of RT Anopheles gambiae."; RL BMC Genomics 7(1), 214-214 (2006). XX RN [2] RP 1-412 RA Quesneville H., Nouaud D. and Anxolabehere D.; RT "Direct Submission."; RL Direct Submission to Genbank (30-NOV-2004)Dynamique du Genome et RL Evolution, Institut Jacques Monod - CNRS - Universites Paris 6 RL Paris 7, 2 place Jussieu, Paris 75252, France. XX DR EMBL/GenBank/DDBJ; DQ301489; Positions 1 412. XX SQ Sequence 412 BP; 137 A; 78 C; 68 G; 129 T; 0 other; caaagtgtat gaatatcagg tggtctcaaa gcctgttttt catagcaaaa gttgaacgtc 60 actttttgac agcatgggta aaaacatttc cccaaacaag ctttctgacg acttgcctta 120 cacacaaaca atcttagaat cttcgttatt tgcactgaat tactttaaaa agcactcaaa 180 atatatgtta ctgtaagcta aaccaaatct aaactaatta aaatccattt ttgtatgaaa 240 atgatgccgt cgatgaggac accaatgtgt actacgtatg tgttgctaag aacaaagcga 300 acggttccta tggaaattca cttcacccat tctgtcaaaa ggagattttt ttgatttcga 360 aattagttgt caatttgtat gggacgagac cacctgatat tcatacactt tg 412 // ID Clu-47C_AG repbase; DNA; ANG; 938 BP. XX AC . XX DT 03-SEP-2010 (Rel. 15.09, Created) DT 03-SEP-2010 (Rel. 
15.09, Last updated, Version -1) XX DE Putative non-autonomous DNA transposon: consensus. XX KW DNA transposon; Transposable Element; Clu-47C_AG. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-938 RA Fernandez Medina R.D., Struchiner C.J. and Ribeiro J.M.C.; RT "Transposable elements from Anopheles gambiae."; RL Repbase Reports 10(9), 1445-1445 (2010). XX DR [1] (Consensus) XX CC 2bp TSD. XX SQ Sequence 938 BP; 257 A; 196 C; 224 G; 257 T; 4 other; ccctacaaac aacgagcggg atattttatc atggtgtgtg taaaaaccta gagttaagcg 60 gagaatcggt gtgaaataca cggaggcgag agcgcgtttt ggtttcgaca cagttttggc 120 actaacacag ttacacatgt gganatgtgt gcgttnactt tcngccagng cgctcagttt 180 gatctgctcg cagcgctgtg ctgatgtcgg tgtctcgctt gcactgctgc accgaaagtt 240 cctctgtgtg ctctcgttct tgctaccaca cgaaatcgtc agtacagctc aaggatcggc 300 agcgagcggc cgcacgtgcg tcagcctaag tcctggacga ttgatgttat gattttgtaa 360 agtgttcaag tgttcaagtg ttgtgcgaat tcaaatgaag ctgcgcgaat gattcgtaat 420 gcatgttatg tacatcgcgc agcttcattt cgtacctttt tacagtaatc aacgtgtaca 480 aaaggatcta cacagatatt cggccatttc aacgtttgca tcgatggagc gatcaatttt 540 attttagtca aatattgtgt atcaacttac gtaaatgtgt gaaatcgtgt aaagttaaaa 600 tgtaaatgtt tcaataaagt gatgaaaaaa agatctacgc gtgtttgttt ttttgtttag 660 ctttagataa caaaaaaata cgcagcgagt gtgagcgatc gctgaacgaa agaaacgcac 720 acagcgcaac acgaaagaaa caaacacaag cacacgaaag gatgctcgcg catgggcgcg 780 agcattgaga gagcgcaccg tgtattttgt atgggagcgg gagagaaccg tgctacccgt 840 tcccgccccc ttttttaagc ctaggaaatc ttccatcggc ttaactctag gtttttacgc 900 acaccatgat aaaaattacc cgctctttgt ccgtaggg 938 // ID Clu-166_AG repbase; DNA; ANG; 1670 BP. XX AC . XX DT 04-SEP-2010 (Rel. 15.09, Created) DT 04-SEP-2010 (Rel. 15.09, Last updated, Version -1) XX DE Putative non-autonomous DNA transposon: consensus. XX KW DNA transposon; Transposable Element; nonautonomous; Clu-166_AG. 
XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-1670 RA Fernandez Medina R.D., Struchiner C.J. and Ribeiro J.M.C.; RT "Transposable elements from Anopheles gambiae."; RL Repbase Reports 10(9), 1452-1452 (2010). XX DR [1] (Consensus) XX CC TA TSD. >94% identical to consensus. XX SQ Sequence 1670 BP; 504 A; 328 C; 292 G; 542 T; 4 other; tacagtcagt ccaaaaagta ttcgtctagc tgaatatttt gcagaaaaag gcgaattatg 60 gccgcaatag gcatggggtg gtaaaagtaa tcatggggtt tgataaaggg accatcaata 120 cacacgttct gctaaaaaat aagcattaaa atagtgcaat aacaactaaa ntacttaaaa 180 aactaaaaaa agtcgtttga tgggcgcgca aaaagtattc gtctatcaga cttttggcgt 240 aatatgcgta tgaaaacaaa ggttatggaa tcaaaaaatg tcctgatctt gtttcttatt 300 gcatttatga ctattggtga ctacttgcac aacgcaagaa ctcatttgaa cctttaaaaa 360 ggcgtatttt tgatccactc atcctacaag acttgttcca attattctct ggtagaaata 420 ttgcatttcc ggactacgtg gccaagttcc attgaaaaaa tggcaactga tatcatattg 480 agagtctact aaatcatttt catcagtttg gtttcagctg gtttggcgtt tcgggcaaat 540 gacaatgccc tgtatgttct ccagctcgac tgttcttgaa acatgtgata cattttaagt 600 anaatttctc agaactggga ctcaaagtac ttaaaaactg cattaatatg tttccccatc 660 tcgtctccta tcantttgaa tcatttttct agctcacctt ncctcactga gccctagtat 720 catgactcca tggtgctacc gttacacgag attttgatgc tgcatcttag tagactgttt 780 tcgctacggt agacgtcggc cgtatgacct tggcgattta aacatgtttt aatatgttag 840 tagagcctaa tttgaaacaa attcctgaat tatttgatgc accttaacag atttggccac 900 tttctgtgta tggtattatt caaacaaaac atgtttgtaa caactcactc attcaagcta 960 ttacatgacc agctacgatg aaatgcagca tgacccatca accaaatgtg acccatctct 1020 tttcctgatc ttattggcca tcacaaaatg cactgattcc ttggtaaaat ttgatttttt 1080 ggacattaaa ccgcttcttt cgttagatta accgtcaagg atggctgctt acatggcagt 1140 ttgtcacaca attagtgtca gaaaaatggc caaactctga tcaaatgaag gaacgaggcc 1200 tcccacacat tttgtaacct tcaaaaatgg gtaggatagg ttatgcataa gttaaagata 1260 cgattttcgc catcatgctc caatcattta 
actcccccgc ctttagtgcc tctctgaccg 1320 cttgaagtgc aattgaaaca acagaaatgc attattttag aaggaagtca ccagtagatt 1380 gataataata aatttaattc acgtctacgt agtgtttgtc cagcgtaaat aattaaaaag 1440 ctagatggac gaatactttt tgcgcgccca tcaaacgact ttttttagtt ttttatgtat 1500 tttggttgct attgcactat ttaaatgctt attttttagc aaaacgtgtg tattgatggt 1560 ccctttatca aacctcatca ttacttttat cgccccatac gtattgcggc cataattcgc 1620 ctttttctgt aaaatattca gctagacgaa tactttttgg actgactgta 1670 // ID GYPSY58-I_AG repbase; DNA; ANG; 4564 BP. XX AC . XX DT 09-JUL-2004 (Rel. 9.06, Created) DT 09-JUL-2004 (Rel. 9.06, Last updated, Version 1) XX DE GYPSY58-I_AG is an internal portion of retrotransposon GYPSY58_AG DE - a consensus sequence. XX KW Gypsy; LTR Retrotransposon; Transposable Element; 5-bp TSD gag; KW AP protease; GYPSY58-I_AG; GYPSY58-LTR_AG; Gypsy clade; RNase-H; KW integrase GYPSY58_AG; mag lineage; reverse transcriptase. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-4564 RA Tubio M.J., Costas C.J. and Naveira F.H.; RT "GYPSY58_AG, a member of the mag lineage of the Ty3/gypsy group RT of LTR retrotransposons in Anopheles gambiae."; RL Repbase Reports 4(6), 155-155 (2004). XX DR [1] (Consensus) XX CC GYPSY58_AG is a family of Gypsy-like LTR retrotransposons that, CC according to the amino acid sequence of its reverse CC transcriptase, RNase and integrase domains is phylogenetically CC grouped with representatives of the mag lineage of other CC organisms. GYPSY19_AG, GYPSY20_AG, GYPSY21_AG, GYPSY22_AG, CC GYPSY23_AG, GYPSY24_AG, GYPSY25_AG, GYPSY26_AG, GYPSY27_AG, CC GYPSY28_AG, GYPSY55_AG, GYPSY56_AG, GYPSY57_AG, GYPSY59_AG, CC GYPSY60_AG, GYPSY61_AG, GYPSY62_AG, GYPSY63_AG, GYPSY64_AG, CC GYPSY65_AG, GYPSY66_AG, GYPSY67_AG, GYPSY68_AG and GYPSY69_AG, CC are other members of this same lineage in Anopheles gambiae. 
CC The GYPSY58-I_AG consensus was reconstructed after multiple CC alignment of 4 copies. The consensus encodes the 425-aa CC GYPSY58_AG1p gag-like polyprotein (pos. 121-1395) and the CC 1084-aa GYPSY58_AG2p pol-like polyprotein (pos. 1296-4547). CC The sequence of the LTRs flanking GYPSY58-I_AG is deposited as CC GYPSY58-LTR_AG. XX FH Key Location/Qualifiers FT CDS 121..1395 FT /product="GYPSY58_AG1p" FT /translation="MSTSENGASDFPMEGEQRRSSLLRSTGFATGQQIFPP FT ANENVVPTVNLPSQNAAGQPSLQHFAASSSVPAQSPDSAMLMQMMQLMQQQ FT MQQQQQQVQQQMQQQQQLITQVLQQSQVTNQQSQAFPTQVIAPSNPELIID FT ALSGSITEFRYEAESEITFDTWFARYEDLFAQDASRLDDAAKVRLLVRKLG FT PAEHARYASFILPSVPREIPFDETVKKLKALFGRAETLVSKRYKCLQLTKS FT RTEDFVSFVCRVNRSCVNFQLSAMSEEQFKCLILVCGLKDDADADVRTRLL FT SRIEERTDVTLEQLSAECERISSLKVDSAMIATQAEEQILALRSRSNAQQQ FT ASKWKQRKQYQHTRNSDNVNKPPGPCWLCGEAHWASECSYALHKCRDCNVT FT GHREGFCNQQKRGNKKGKYKKKNARRWTCVR" FT CDS 1296..4547 FT /product="GYPSY58_AG2p" FT /translation="RNRSPRRFLQPAEKRQQKGEVQEKKRTSVDMRTVTVN FT VCNVQQARKFVNIFINSNRVRLQLDTGSDITVIGRETWQQLGKPTLKPVTV FT HAKTASGSRLELDGEFEAQITIGDRTRSAVIRVMDSALHLLGADMIATFEL FT GAVPMDQFCNKVEAEGVKWESRFPALFKGNGLCTKANVQIQLKPNHRSVFC FT PKRPVAYAMRATVDKELDRLEDLGVITPVDYSDWAAPIVVVRKQNGSVRIC FT GDYSTGLNAALQSYEYPLPLPEDIFAKLAQCKYFSKIDLTDAFLQVQIKEE FT YRPLLTINTHRGLYHYNRLPPGIKIAPAAFQQLIDAMLSGLKCTSGYMDDV FT VVGGKTEREHDENLLNLFRRIKEYGFTIRAEKCSFKMPRIEYLGFVIDRQG FT LRPNPAKIDAILKMPAPTNVSEVRSFLGAVNYYGKFVPKMRELRYPLDALL FT KNDTKFVWTRECANAFNRFKDLLASDLLLTHYDPNAEIVVSADASSVGLGA FT TISHRYADGSLKVVQHASRALTKAEANYSQIDREGLAIIFAVKKFHKMLFG FT RHFRLQTDHRPLLRIFGSHKGIPVYTANRLQRFALQLLMYDFTIEYVQTDK FT FGNADVLSRLIHEHAKPDPEYIIASAELENDVSSIASYCINIFPLNFRDVA FT KATESDPVLKKVYGYIMEGWPQNVAYAAELACFYHRSEALTTVRGCILFGE FT RVVIPSKLQQRCLKQLHEGHPGIQRMKSKARSYVYWPSVDKDITEHVKGCH FT ACAIAAKTSPREKPVPWPATQKPWERIHIDFAGPIDGDYFLIVVDAFTKWP FT EVIRTRSTTSAATIAILRSIFARFGYPETMVSDNGPQFVSAEFSEYCSSRG FT IQHVTTAPFHPQSNGQAERFVDTFKRSMRKIQEGGTTQDEALDVFLASYRS FT TPNAILPDMQSPAEAMLGRKMRTALELLKPPPAAQAEPANLDRRFQRGDLV FT 
YAKFYARNSWKWVPAQIIRELGSVMFEVQTNNQRVHRRHVNQLRKRDSAVG FT ASSEDVDIDHLPFDLLTDSAVAESSSVPLTDHQTSASPTQPSPRPIPFATT FT KRQQPIRSPRRSSRLRRLPRRLEGYRL" XX SQ Sequence 4564 BP; 1225 A; 1078 C; 1251 G; 1010 T; 0 other; gtggcgacga gtaaaaaaaa aagcgacgta aaaaaaaaag gatttttttt attgcatttc 60 gactgtgagt gaagttgtgc tgttcgaaaa attacgcgcc cgcgagcaaa acgcagaagg 120 atgagcacca gcgaaaacgg tgcatcggat ttccccatgg aaggagagca gcgaagatca 180 tccttgctgc gttcgactgg ttttgctacg ggacagcaaa tttttccacc cgcgaacgag 240 aatgttgtgc caacggtgaa tctgccatcg cagaatgcag cggggcagcc atcgctacag 300 cattttgctg cgtcatcatc agtgccggcg cagtcacccg attctgcgat gctgatgcag 360 atgatgcagt taatgcaaca gcagatgcag cagcagcagc agcaggtgca gcaacagatg 420 cagcaacagc agcaactaat cacgcaagta ttgcagcagt cacaagttac aaaccagcaa 480 agccaggcct tcccgacgca ggtcattgcc cccagtaatc cggagttgat tattgatgcg 540 ttgtcgggta gcatcaccga gttccggtac gaggcggaat ccgaaataac tttcgatacg 600 tggttcgcgc gctacgagga cctgttcgca caggatgcct cccgcctcga cgatgcagca 660 aaggtgcgct tactggtgcg caagttaggc cctgcggagc acgcacgcta cgctagtttc 720 atcctgccca gcgttcctcg ggagataccg ttcgatgaaa cggtgaaaaa gctgaaggcc 780 ctttttggga gagctgaaac gcttgtgagt aagcgctaca agtgtttgca gctaacgaaa 840 tctcgtacgg aggatttcgt gtcgtttgtt tgccgcgtga accgctcctg cgtcaacttt 900 cagctgtccg caatgagcga ggagcagttc aaatgcctga tattagtgtg cggcttaaag 960 gatgatgctg atgcggacgt gcgcacgaga cttctctctc gtatcgagga gagaaccgac 1020 gtcacgctag agcagctctc cgctgagtgt gagcgaatct ccagcttgaa ggtggatagc 1080 gccatgatcg ctactcaggc ggaggagcaa atcctagcac taagaagtag aagcaatgca 1140 caacagcagg cgagcaagtg gaaacaaaga aagcagtacc agcatacgag gaacagtgat 1200 aatgtaaaca aaccacctgg gccgtgttgg ttatgtggtg aggctcattg ggcaagtgag 1260 tgctcatatg ctctgcacaa gtgtcgcgat tgtaacgtaa ccggtcaccg agaaggtttt 1320 tgcaaccagc agaaaagagg caacaaaaag gggaagtaca agaaaaaaaa cgcacgtcgg 1380 tggacatgcg tacggtgacg gtcaatgttt gtaatgtcca gcaagctaga aaatttgtta 1440 acattttcat caacagtaac agggtccgcc tgcagcttga cactggatcg gatattacag 1500 taatcggacg agaaacgtgg cagcagctgg 
gtaagccaac actgaagccg gtaacagtcc 1560 acgcaaaaac agcatctggg tcccgccttg aattggatgg tgaattcgag gcacagataa 1620 cgatcggtga cagaacgcgg tctgctgtta tacgcgttat ggactcggcc ttgcatcttt 1680 tgggggccga tatgatagca acgttcgagc tcggtgccgt ccccatggac cagttttgca 1740 acaaggtgga agcagaggga gtgaaatggg agagccgatt ccctgcattg tttaaaggca 1800 acggattgtg caccaaggcg aatgtacaga ttcagctcaa gccaaatcac cgttctgtgt 1860 tttgtcctaa gcgcccggta gcgtacgcga tgcgagcaac ggtggacaag gagctcgatc 1920 gcttagagga tctaggggtg attactccgg tggattattc ggactgggcg gccccaattg 1980 tggtggtgag gaagcagaac ggcagcgtgc ggatttgtgg agattattca actgggttga 2040 atgcagctct tcaatcgtac gaatatccac tcccactccc agaggatatt tttgcaaaac 2100 tggcacaatg caagtatttt tcaaaaattg atctaacgga tgcattcttg caggtacaaa 2160 ttaaagagga atatcgtcca ctcctgacga ttaacacgca tcgcggattg taccattaca 2220 accgtctacc acctggcatt aaaattgctc cagccgcgtt ccagcaactc atcgatgcaa 2280 tgctctccgg cctaaaatgt acatctgggt acatggatga tgtagttgtt ggaggcaaaa 2340 cagaacgcga gcacgatgag aatttgctca accttttccg gcgcataaag gagtacggat 2400 ttacaatccg agcggaaaag tgttcgttta aaatgccaag gatagagtac ctgggttttg 2460 ttatcgacag acagggtctc agaccaaacc cagcgaaaat agatgcgatc ctgaagatgc 2520 cagcaccgac gaacgttagt gaggttcggt cctttctggg tgctgtgaat tattacggca 2580 aatttgtacc aaagatgcga gaattgcgat acccgctgga tgcattgctc aaaaatgaca 2640 ccaaatttgt ttggacacgg gagtgcgcaa atgcttttaa caggttcaaa gacctgttag 2700 catccgactt gctgctgacg cactacgacc cgaatgcaga gatagtggtc tctgccgatg 2760 catcgtcggt tggactaggc gcgaccataa gccacaggta cgcggatggc tcgctgaagg 2820 ttgttcagca cgcatcgaga gcccttacga aggctgaagc aaattacagc caaattgacc 2880 gcgaaggttt ggcgattata tttgctgtga aaaagtttca taagatgctg tttgggcgcc 2940 acttccgact gcaaaccgat catcgaccct tgttgcggat tttcgggtca cataagggta 3000 taccagtgta cacggcaaat aggttgcagc ggtttgcatt gcagttgctg atgtatgatt 3060 ttaccatcga gtacgtgcaa accgataagt ttggcaatgc ggacgtgctc tcaaggctga 3120 tacacgagca cgcaaaaccg gatccggaat acataatcgc aagcgctgag ttggaaaatg 3180 atgtaagttc catagcatca tactgtatta atatatttcc 
actcaatttt agagacgttg 3240 cgaaggccac ggagtccgac cctgtcctga agaaggttta cggatacatc atggaagggt 3300 ggccccaaaa tgtcgcgtat gctgcagagc tggcttgctt ttaccacaga agcgaagccc 3360 taaccacggt gcgtggttgc attttgtttg gggaaagagt ggtgatccct agcaagctgc 3420 agcagcgttg cttgaagcaa ctacatgaag ggcaccccgg tatccagcga atgaagtcga 3480 aggcccggag ttacgtgtac tggccatccg ttgataagga cataacggag cacgtaaagg 3540 gatgccatgc ttgtgcgata gcggcaaaga cttcacctcg cgaaaaacct gttccctggc 3600 cagcgacaca gaaaccttgg gaacgtattc acatcgattt cgctgggcca atcgatggtg 3660 actattttct gatcgtggta gacgcgttta ccaagtggcc agaggtgata cgaacgcgaa 3720 gtaccacatc agcagcaacg atcgcgatac tgaggtcaat ctttgcgaga tttggatacc 3780 cggaaacgat ggtgagcgac aacgggccac aatttgtcag tgctgagttt tcggagtatt 3840 gcagtagtcg cggtatacag cacgtcacaa ctgcgccatt ccatccgcaa tcgaatgggc 3900 aggcggagcg ctttgttgac accttcaagc ggtcgatgag gaagatccag gaagggggaa 3960 caacgcagga cgaagcgctt gacgtttttc tggcgagcta ccggtcaact cccaacgcta 4020 ttttgccgga catgcagtcg ccagctgaag ccatgctagg cagaaaaatg cgaacagcgt 4080 tggaattgct gaaacctcca cctgcagcgc aagcagaacc ggccaacttg gatagacgat 4140 ttcagcgcgg tgacctcgtc tacgccaagt tctatgcgcg gaattcatgg aagtgggttc 4200 cggcgcaaat catacgagag ttgggcagcg ttatgttcga ggtgcagacg aacaatcaac 4260 gggttcacag gcgccatgtg aaccagctgc gaaaacgtga ttcggccgtg ggggcttcca 4320 gcgaggacgt ggacatagat catttgccgt ttgatctgct gacggacagc gcggtagcgg 4380 aatcgagttc tgttcccttg acggaccatc aaactagcgc gtcacccaca caaccatctc 4440 cacgcccaat cccgttcgct actacaaaac gccagcagcc gatacggtcg ccacgccggt 4500 cttctaggct tagaagactt ccgcgtaggc tcgaggggta tcgtctgtaa ttaaaagggg 4560 gaga 4564 // ID BEL1-LTR_AG repbase; DNA; ANG; 290 BP. XX AC . XX DT 12-MAR-2003 (Rel. 8.02, Created) DT 12-MAR-2003 (Rel. 8.02, Last updated, Version 1) XX DE BEL1-LTR_AG is a long terminal repeat from the BEL1_AG LTR DE retrotransposon - a consensus sequence. XX KW BEL; LTR Retrotransposon; Transposable Element; 5-bp TSD; KW BEL1-I_AG; BEL1-LTR_AG; BEL1_AG; Bel clade. 
XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-290 RA Kapitonov V.V. and Jurka J.; RT "BEL1_AG, a family of Bel/Pao-like LTR retrotransposons from RT African malaria mosquito."; RL Repbase Reports 3(2), 9-9 (2003). XX DR [1] (Consensus) XX CC BEL1-LTR_AG is a long terminal repeat from BEL1_AG CC retrotransposon. CC There are more than 100 copies of BEL1-LTR_AG in the genome. CC See comments for BEL1-I_AG. XX SQ Sequence 290 BP; 102 A; 62 C; 60 G; 66 T; 0 other; tgttgatgac agcgccgcat aattaggtca tcacggcctg ggaactgacc gaccgaacga 60 aaacaattgt aagcaaatct aatgtaaaca atatcgacca aataaaaact gctgatcgcg 120 atcgatcagc agaaaattag tatataagtc aggaataagc atgaataaat cgactttcaa 180 gcacaagaac tcaaagacta agttgtttgt ctgttccaag ccaagtcgga cggtaaaatc 240 ctcctgtctt ggcgcttccc aacgagttca gactcggtgc ggaaaatata 290 // ID BEL-20_AG-LTR repbase; DNA; ANG; 213 BP. XX AC . XX DT 27-SEP-2010 (Rel. 15.09, Created) DT 27-SEP-2010 (Rel. 15.09, Last updated, Version 1) XX DE Bel-20_AG-LTR. XX KW BEL; LTR Retrotransposon; Transposable Element; BEL-20_AG-LTR. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-213 RA Fernandez Medina R.D., Struchiner C.J. and Ribeiro J.M.C.; RT "Transposable elements from Anopheles gambiae."; RL Repbase Reports 10(9), 1432-1432 (2010). XX DR [1] (Consensus) XX CC LTR from Bel-20_LTR-I. XX SQ Sequence 213 BP; 85 A; 40 C; 41 G; 47 T; 0 other; tgttgggaat atgtaacgat caaccgtcaa agttagcgaa ttcaaaatat taacgaactt 60 aatgacattt aacgaacgaa atgtcaaagg taggaagacg gctgcttggg aagtgtcaga 120 ttgaaaaccg catggaacag aacacaacgt agggcagaca acacaacaac acattcctga 180 ataaaagaat tctattttca cttaaatcac cac 213 // ID AgaP12 repbase; DNA; ANG; 5649 BP. 
XX AC DQ301494; XX DT 22-AUG-2006 (Rel. 13.07, Created) DT 31-JUL-2008 (Rel. 13.07, Last updated, Version 1) XX DE Anopheles gambiae str. PEST clone AgaP12 transposon P-like, DE complete sequence. XX KW P; DNA transposon; Transposable Element; AgaP12. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-5649 RA Quesneville H., Nouaud D. and Anxolabehere D.; RT "P elements and MITE relatives in the whole genome sequence of RT Anopheles gambiae."; RL BMC Genomics 7(1), 214-214 (2006). XX RN [2] RP 1-5649 RA Quesneville H., Nouaud D. and Anxolabehere D.; RT "Direct Submission."; RL Direct Submission to Genbank (30-NOV-2004)Dynamique du Genome et RL Evolution, Institut Jacques Monod - CNRS - Universites Paris 6 RL Paris 7, 2 place Jussieu, Paris 75252, France. XX DR EMBL/GenBank/DDBJ; DQ301494; Positions 1 5649. XX FH Key Location/Qualifiers FT CDS join(529..2334,2417..3244) FT /product="AgaP12_1p" FT /note="transposase." 
FT /translation="MPVPCSVSLCGNNPRNVKKRALEISFHIFPNYDPLRH FT AWVQFCNREENWEPSKRDVICSAHFQESDFQMPESGIVGGKVLRRTLFPNG FT KFLLAYFPNDDTTNILCFCWLLAIPSIHSVLSSEIENREEVSSENFVPSTS FT DTLLVSDVSNIPQNTSTAFLKENIKLREANKRLKATLSHIASENYQLKKKV FT QDLEKELNSIKNSSIPPSDLIPKMKNMLKSTLTSNQVDLIIGEKKRVRWTK FT EEISRALTLRYFGKKAYDYVKDDLKIYLPSPSTLQKYARTFKLREGILEDV FT LVFMGNFVSSLSSRDAECILSFDEMKIKNVMEYDPSADEVIGPYNYIQVVM FT ARALFKNWKQPIYIGFDKKMTKEILMNIIIKLDEKKINVAGIVSDNCSSNI FT SCWRDLGAHDYTKPYFEHPITKKNIYVSPDAPHLLKLLRNWFIDHGFVFNG FT TIVTAQPLRDLVEGRLGAEITPLFKLNTGHLELSSQERQNVRRAAELLSRT FT TAVSLRRYMPKEKELADIIEMVDLWFSVSNSFHPNAKLHYKRSYTASEDQL FT KALDDMFNFIANMIVVGKKNMQVFQKSILMQINSLKMLFVDMKAKHDINYL FT STHKLNQDLLENFFSQLRQKGGTYDHPSPLNCIYRIRLMILGKSPSVLKSV FT TDNENNKKNIEPDVFVTSADFENETDQLNEICVSQEHYVSATIFHEAEIVP FT SLLDFDATDADDEFSATNFDSLFILNEQEKDGLEYIIGYIGRKFRDKYPYL FT GQYTCNLTEDHCYSQPQSYVEHISAGGLYKPSASFLDQGLKMEEIFQRTHK FT DGKLIQSKRIVNTLTDLLQTHFPDFPIEILKTFAKLRIIMRMRFFNIKKEE FT IRRNKKRKTPMERQAAKKFRRIVN" XX SQ Sequence 5649 BP; 1904 A; 916 C; 1065 G; 1764 T; 0 other; caaggtgtat atatttagaa ggttgagtta tttagagaaa tttgacattt ctcaacatga 60 cacttctctt ccatggtcct cgcggggtag gaggaggtgg ggaagaacaa aacaaggaag 120 ggggtgtggg tgttggtgtt gcgagcatta tttacgttca aagcttttgg ttccttcgtg 180 cagttcggct ggtgtttatt tgatctctgt tggctttgaa ttgaaattgc taggaagtgt 240 gttttcgttg gttatttttt agttatagtg gaatagaatt atcgctttat atcgttctta 300 ggttttcgga taagtgataa tagctataac ttgcttagat gtgtagattc tttatgaacc 360 gcatcaacag ttctttgaat taaatttttg ctcataattt gtttggattt ttgtgccttt 420 tttccagtta ttttcaatta agttagttag gtttttagtc ttttaagatt tacgttaagt 480 gtagttttat ttaatttcaa aagaacttaa aatttttcaa tttttatcat gcctgttccg 540 tgttccgttt cattgtgtgg taacaatcca cgcaatgtaa agaaacgtgc attggaaatt 600 tctttccaca tttttccaaa ctatgaccct cttcgacatg catgggttca gttttgtaac 660 cgggaagaaa attgggaacc ttcgaaaagg gacgttattt gctcagcaca ctttcaagaa 720 agcgattttc aaatgcctga gagcggtatt gttggaggta aagtgcttag aagaactctt 780 tttcctaacg gtaagttttt attagcttat tttcctaatg atgatacaac 
taacatttta 840 tgtttctgtt ggcttttagc gattccatca atacactctg tattatcatc tgaaatagag 900 aatcgggaag aggtttcttc tgaaaatttt gttccttcaa cttccgacac tttattggta 960 agtgatgttt cgaacatacc acaaaacacc agcacagcgt ttttaaaaga aaatataaaa 1020 ctgcgagagg caaacaaaag gctaaaagcc acattgagcc acattgcttc tgaaaattat 1080 cagctaaaaa aaaaagtaca ggacctagaa aaggaattaa attcaattaa aaatagtagt 1140 atcccaccga gcgatctaat ccccaaaatg aagaacatgt tgaaatctac tctaacttca 1200 aatcaagttg atttgataat aggggagaaa aagcgtgtta gatggacaaa ggaggaaatc 1260 agccgtgctc tgacattgag gtattttgga aaaaaggcat acgattacgt aaaagatgat 1320 ttaaaaatat atttaccatc accctccact ttgcaaaaat atgccagaac ctttaagttg 1380 cgtgaaggta tcctagaaga cgttttagtt ttcatgggga attttgttag ttctttgagc 1440 agtagggatg ccgaatgtat tcttagtttt gatgagatga agatcaaaaa cgtaatggaa 1500 tatgatccgt cggcagatga ggttattggc ccatataatt atatacaggt agtaatggcc 1560 agagccttgt tcaagaattg gaaacaacca atctatattg gttttgataa aaaaatgacg 1620 aaagaaattc tcatgaatat tataattaaa ttggatgaga agaaaattaa cgttgctgga 1680 attgtgagtg ataactgctc aagcaatatt agttgttgga gagacctcgg tgctcatgat 1740 tatactaaac catacttcga gcacccaatt acaaaaaaaa atatttatgt atcgccagac 1800 gcgccgcatt tgcttaaatt attgcgaaat tggtttattg atcatgggtt tgtatttaac 1860 ggaacaattg tcactgcaca acctttgcgc gatctagtgg aaggcagatt aggagcggag 1920 ataactcctc tttttaaatt aaacacagga catttggaac tatctagtca agaacgccaa 1980 aatgtacgca gagcagcaga actgttgtct cgaactacag ctgtatcatt aagacgatat 2040 atgccaaaag aaaaagagtt agctgacatt atagaaatgg tagatttatg gttcagtgta 2100 tccaactcat tccatcccaa tgccaaattg cattacaaaa gatcgtatac tgctagtgaa 2160 gaccaactaa aagctttaga tgacatgttt aattttatcg caaacatgat agttgtaggc 2220 aaaaaaaata tgcaagtatt tcaaaaaagt atattaatgc agattaattc tttaaaaatg 2280 ctgtttgtcg acatgaaagc aaagcacgat atcaactatt tatccacgca taaggtaaga 2340 tcataaaaaa tatttaaata attacaccaa ccaatttaaa taatggtttt tatcgttttc 2400 gttcgtatct tttcagctta atcaagacct gttagaaaac tttttttctc agttgagaca 2460 aaaaggtggt acgtatgacc atccatctcc tttaaattgt atatacagaa tacgtctaat 
2520 gattctgggg aaatcgccaa gcgtattgaa atcggttaca gataatgaaa acaacaaaaa 2580 aaatattgag cctgatgtat ttgtcacatc agctgatttt gaaaatgaaa cagaccagct 2640 aaatgaaata tgcgttagtc aagaacacta cgtatcagct accatatttc atgaagctga 2700 gattgttcct tcccttctgg attttgatgc gacagacgca gacgatgagt ttagcgcaac 2760 aaattttgac tcactgttta ttttaaatga acaggagaag gatggccttg aatatataat 2820 aggctatatt ggtcgaaaat ttagggataa atatccatat ttaggacaat atacgtgtaa 2880 tttgacagag gaccattgct acagtcaacc acagtcttat gttgaacaca tatcggctgg 2940 aggtttatac aagccatccg ctagcttcct tgaccaagga ttgaaaatgg aggaaatttt 3000 tcaaagaact cacaaggatg gaaaactgat tcaaagtaaa cgaattgtaa atacacttac 3060 agacctcctc caaacacatt tccctgattt cccaatagaa attctaaaaa catttgctaa 3120 acttagaata atcatgagaa tgagattctt caatataaaa aaagaggaaa ttcgaaggaa 3180 caaaaaaaga aagacaccaa tggaaagaca agcggctaaa aaatttagaa gaatagtgaa 3240 ttagtcgttg tatgcattgc aaactgttgc tcattttatt tatttattta tttatttttg 3300 ttcttgtaac tttgatttta gttttgttaa ctgttactta gttctgtttt ttgatattat 3360 ttatgttaac ttattatttt gaatggcatc attcttgata gctttattgg aacaaactgt 3420 gatattttat taatagatta caaaactcat taatttaaaa gatcctgtgc taaaatactg 3480 tggtagtccc tagctggtgg agtaatcgga accggctgga gctgctggaa tctttacaaa 3540 ctttctaaac tatttgatat agtgtgacct caaaacgtat caccgaagtt agtgtaatgc 3600 cgttggcatt tatattttgg tattaagtat agacaaaata tagtattaag tgcaacttaa 3660 tacagcacta aaaataataa caaaatttag agccggtctg aaggtatata cgatacacgc 3720 gctggctcca aaaagcccgg acgtgtatga ctacaataca agtaagaaat ggctatctgg 3780 ttgtgtcata cagtaatctg tgaagacttt ataagccaac atgacctcgc agagaagaaa 3840 gaagagcaca ggcggaaata atgtcaaagt ctatatgtat agatatgtac agagtgcaga 3900 gaaacatact atacaaactg aactatatca caccaaagac attacactaa cctaagtcac 3960 acgtaaagat gtcgatctgt attattcatt ttagaatatt tgtaacgctg tcatcggctc 4020 cgaacggttt atattattcc aacgtgttgg tagaatttcg ctaaccttcc aaagtagctt 4080 tcgattgtgg tattataaca cgggtacgga ggtagttatc atttaaatat agtctattaa 4140 aagtaaaatt tcatcaaact ggatattgct gtaggcggag taaaatgccg caagattctt 4200 
ataatgaact atatgcaacg aaaacgaaca aaataagtac aagctacctt tctagtctga 4260 ccattgtgat atgatacatg ttaatcataa agtgctgaac ttgtagctat acattagatt 4320 tcaaacgtag gatttattaa attcaaaaaa caggtattga cctgttttgt gaacgaactc 4380 aaattgaaag gtaacaatac tacaaataaa ttattttttc gaatcagatc gaattccaaa 4440 accaaatgtt tatgtctaca attcattact ttcacttctg agatataaat actgtaagta 4500 tatgtttcgt ctttgtttgg gggtgatgaa gggttataaa aatatataat ggaaatttaa 4560 actcacgtcg atcgggtgcg cggaataagc acaaatgcat tgatttcttt tgagaattat 4620 atcatctcca attattcttt cctgaacgga agtttccgtt tacttctttt tgtactattt 4680 ggctaccatt tgtgtgttta tgactattgc tttcggtcga tcgagtttgt gttctgatat 4740 accattacat ccctctcatt atctgtatat actctacgct caagaaagaa tgtgtttctg 4800 cgtataataa agcgtttttt tcactgcacg tttttatctg taaagaagta aagaaaaaac 4860 ttgaaagaac aaataaaagg aactatctgt cgatttgagc gagatagaaa ccactcaccg 4920 gaaggagatc actgttttca caccaccaac cgaaatatca caatgtagga taccaatctg 4980 cagcgccact ggtcttcaat cactaccggg taaaaataca ttagcagtat aaaactgctt 5040 ctacgtaacc atgcaatgac gataacaata ttggtggatg cttaagcttt aaccaggaaa 5100 ttacataaaa cctttaaaga tgggaaatat atagttcatt tgtagaaaaa cagacgaaac 5160 cgttggagag ctaaaagaga agaaagaacg actgatgaac tgaaatatta acacggaagc 5220 tacttgcgga taatggcttt gtggctggaa aatcgttctg gtccttaagt tgggtcaaca 5280 gatacgttcg gtggccaaat cgaactaggt agctgtagta gacgaaggtg cgagaccatg 5340 agcagtttcg gaaactcttg cagcagctca agccgcaaag cggttttaac acaggctacg 5400 tgataagctg gcgaaatttt aacctatttc aacggaaaat cgtcgttcat gtgattgatt 5460 aggtcgaatg atggataaat aaaatcacaa gttaattaaa atgaactgaa ccgaaactat 5520 tcatcggctg cttgtttctt cttgttttgg gcgaaccatg cgagcttgta aacaaatatt 5580 gacaggtctc ccccacaatg gcgcccagaa aaagctataa aaatcaacct tctaaatata 5640 tacaccttg 5649 // ID GYPSY27-I_AG repbase; DNA; ANG; 4117 BP. XX AC . XX DT 05-FEB-2004 (Rel. 9.01, Created) DT 05-FEB-2004 (Rel. 9.01, Last updated, Version 1) XX DE GYPSY27-I_AG is an internal portion of retrotransposon GYPSY27_AG DE - a consensus sequence. 
XX KW Gypsy; LTR Retrotransposon; Transposable Element; 5-bp TSD; KW AP protease; GYPSY27-I_AG; GYPSY27-LTR_AG; GYPSY27_AG; KW Gypsy clade; RNase-H; gag; integrase; mag lineage; KW reverse transcriptase. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-4117 RA Tubio M.J., Costas C.J. and Naveira F.H.; RT "GYPSY27_AG, a member of the mag lineage of the Ty3/gypsy group RT of LTR retrotransposons in Anopheles gambiae."; RL Repbase Reports 4(1), 19-19 (2004). XX DR [1] (Consensus) XX CC GYPSY27_AG is a family of Gypsy-like LTR retrotransposons CC that, according to the amino acid sequence of its reverse CC transcriptase, is CC phylogenetically grouped with representatives of the mag CC lineage of other organisms. CC GYPSY18_AG, GYPSY19_AG, GYPSY20_AG, GYPSY21_AG, GYPSY22_AG, CC GYPSY23_AG, CC GYPSY24_AG, GYPSY25_AG, GYPSY26_AG and GYPSY28_AG are other CC members of this same lineage in Anopheles gambiae. CC The GYPSY27-I_AG consensus was reconstructed after multiple CC alignment of 4 copies. CC The consensus encodes the 1361-aa GYPSY27_AGp gag-pol like CC protein CC (pos. 22-4104). CC The sequence of the LTRs flanking GYPSY27-I is deposited as CC GYPSY27-LTR_AG. 
XX FH Key Location/Qualifiers FT CDS 22..4104 FT /product="GYPSY27_AGp" FT /translation="MSKPPAKESSEQWIIEMFNQQKLLNSVLSVMKTKNDD FT GSEKAIDAIAGQIKEFHHTEDTNFEAWYSRYEGLFLDDAKRLDDGAKLRLL FT LRKIGVTEHERYISSIMPKQPKDFDFKNTVEKLKKLFGDRESIVCKRFKCL FT QLVKEPHEEYGAYACRVNKKVVEAKLAGIAEEEIKCLLFVCGLKRDADADV FT RVRLLSKMEDNSEITLDQLTGEAQRLLNLRQDSNLISNTAHEVNAVRKNET FT KQQNIKGGGKEAKAVSCWLCGGQHFARECSFTSHKCEDCNEIGHKEGFCET FT AYRNRSRYRNNKAKVRVVTVNKIQAGRKYVTAALNGRELKLQLDTGADITI FT ISQGKWRQIGGPKLSPASVSARTASGTPLKLLGEFCCKMTINKQQKEALVR FT VVEEELLLFGADSMDVFGLWKQPLDAFCNIVGCVKDAAAAVIQQFPSLFSN FT ELGRCDKMKISLQLKDDVKPVFRPKRPVAYAMQSMVEDELTRLERNGIITP FT TDSSEWAAPIVVVRKANGTVRICGDYSTGLNDALQPHQYPLPIPQDIFTAI FT GKSAVFSQIDLAEAFLQVEVCERSRELLTINTHKGLYRFNRLPPGVKTAPG FT AFQQIVDSMLSGLEGVAGYMDDIIVGGADEESHLKNLRAVLIRIEEFGFKL FT RAEKCSFLKPQIRYLGHLLDRQGIRPDPAKIEAILKMPAPTQLSEVRSYLG FT AINYYGKFVFQMRDLRYPLDLLLKKGGEFQWTAECEKSFRRFKEILSSDLL FT LTHYDPSKEIIVSADASSVGLGATISHRFPDGSIKVVQHAARALTKVEMNY FT SQPDREGLAIVFAVTKFHRMLFGRRFILQTDHQPLLRIFGSRKGIPLYTAN FT RLQRWALALLLYDFAIEYVATDNFGNADILSRLMDRHEKPEEDYVIASIQL FT EDDINHLVTSATQALPLNFKDLERGTQADPLLRKVFQFVQNGWPKGEEKTE FT DLKQFFARREALSTVGKCLLFGERVVVPANLRNRVLKLLHKGHPGVCRMKA FT LARGYVYWPKLDSEIENAVKTCQSCASAAKSPEHCTPVDWPKTTGPWQRIH FT VDYAGPMEGEYYFLVVDSYSKWPEIIPTRQTTAQATVQMLRGLCARFGIPE FT TVVSDNGTQFTSSEFRDFCMENGIHHVRTAPYHPQSNGQAERFVDTFKRAV FT KKIREGRGEMHEALDEFLFTYRNTPNRILDQKSPGEMMLNRKVRTSLEMLR FT PSFRDIRPADPVVADEATRRSFSINDRVYAKRYHRNGWEWVSGVVDKRVGN FT VMYMVKIDGKTLRYHVNQLRMRNAETNEGRTQQPMPLNVLLDAWDMSGTTV FT SASSSETSAATTTNLAEREMTPADGTEASLRRSTRVRKQPAWLEGYHRS" XX SQ Sequence 4117 BP; 1196 A; 844 C; 1101 G; 976 T; 0 other; actggcgacg agaataccac gatgtcgaaa ccacctgcaa aagaaagctc agaacaatgg 60 ataattgaaa tgttcaacca gcaaaagctc ttgaacagtg tattgtcggt gatgaaaaca 120 aaaaatgatg atggatcgga gaaagcgatc gatgccattg cggggcaaat aaaagaattt 180 caccacacgg aggatacaaa cttcgaggct tggtatagca gatacgaagg tttgtttctt 240 gacgatgcga agcggttgga tgacggagca aaattgcgtc ttcttcttcg taagattggc 300 gttacggagc acgaacgcta 
tataagttcc atcatgccga aacaaccgaa agacttcgat 360 ttcaaaaaca ccgtggagaa actgaagaaa ctgtttggag accgtgagtc catcgtgtgc 420 aaacggttta aatgcttgca gctagtgaaa gagcctcacg aagagtacgg agcgtatgcg 480 tgccgcgtaa ataaaaaggt tgttgaggcg aagttggcgg ggatagcaga ggaggaaata 540 aaatgtctgt tgtttgtctg tggtctaaaa cgggatgcgg atgctgatgt acgagttcgg 600 ctgctctcaa agatggagga caacagtgaa atcacgttag atcagctaac cggtgaagct 660 caacgcctgc ttaacctgcg ccaagatagc aaccttatct caaacactgc gcatgaggta 720 aacgcagtga gaaagaatga gacgaaacaa caaaacatca aaggaggtgg caaggaagcg 780 aaagcagtaa gttgctggtt gtgcggaggg caacactttg cacgagaatg ttcttttacg 840 tcacacaaat gcgaggactg caacgagata ggtcacaagg aaggattttg tgaaacggcg 900 taccgtaatc gttctcgcta tcgcaataac aaagcgaagg taagggtggt gacagtgaac 960 aaaatccagg ctggccgaaa gtatgtcact gctgcgctaa acggacgcga gctgaaactg 1020 cagttggata ccggtgcaga tatcaccatt atatcgcagg gaaaatggcg tcagatagga 1080 ggtcccaaat tgagtccagc atccgtttca gcgaggacag ctagtggaac accgcttaag 1140 ctactaggtg agttttgttg caaaatgacg attaacaaac aacagaaaga agcgttagtt 1200 cgtgttgtag aagaggaact gttacttttc ggagcagaca gtatggatgt tttcgggctt 1260 tggaaacagc cacttgatgc cttttgcaac atcgttgggt gtgttaaaga tgccgcagca 1320 gctgtgattc agcaatttcc atcgttgttc tcgaacgaac ttggtcgttg tgataagatg 1380 aaaattagtc tacagttgaa ggatgacgtg aaacctgtat ttcgcccgaa acgtccagta 1440 gcgtatgcta tgcagtcgat ggttgaagat gaattgacca ggttagagcg taatggcata 1500 ataacaccca ctgattcttc ggagtgggca gctccgattg tggtagtacg gaaggcgaac 1560 ggaacagtcc gaatatgtgg cgactattcg acaggtctga atgacgcgct tcagccgcat 1620 cagtacccgt tgcctatacc acaagacata tttacagcca ttggaaaatc agctgttttt 1680 agtcagattg acttagcaga agcttttctc caagtggagg tatgtgaaag gagccgtgag 1740 ttgttgacga taaacactca caaggggctt taccgtttca atagattgcc accgggagta 1800 aagacagcac ctggtgcatt tcagcaaatc gttgactcaa tgttgagcgg attagaaggt 1860 gtcgcgggtt atatggatga catcattgta ggcggtgctg atgaggaaag tcatttgaaa 1920 aacttacgcg cagttttgat tcgtatcgag gagtttggtt tcaagctacg ggcagaaaaa 1980 tgttcctttt tgaagcccca aatccgatac cttggacatt 
tgctagaccg gcagggtata 2040 cgaccagacc cagcaaagat tgaagcgatt ttgaagatgc cagcgccaac gcagctcagc 2100 gaagttcgtt cgtatttggg agcgataaat tactacggta aatttgtgtt tcaaatgaga 2160 gatttgcgtt accccttaga ccttcttttg aaaaagggcg gagaattcca gtggacagct 2220 gaatgcgaaa aaagttttcg ccgctttaag gagattttaa gttcagacct tctcctgacc 2280 cattatgatc cttccaagga aattattgtg tcagcggatg cttcatcagt aggtcttgga 2340 gctacgatca gtcatcggtt tccagatggt agcataaagg ttgttcagca tgctgctcgt 2400 gctctaacga aggtggaaat gaactacagc cagccggatc gcgagggtct tgcaatcgtg 2460 tttgcagtaa caaagttcca ccgcatgttg ttcggacgcc gcttcatcct ccaaaccgac 2520 caccagccgc tccttagaat ctttggctca cgtaaaggaa ttcccttgta tactgcaaat 2580 agactgcagc gctgggcatt agcactactt ctttacgatt tcgcaatcga atatgttgcg 2640 acagacaatt tcggaaatgc cgacatactg tccaggttaa tggaccgaca cgagaaaccg 2700 gaggaggact acgtcattgc tagtatccag ctagaggatg atattaatca tctagtaacc 2760 agtgctactc aagccctacc gttgaacttt aaagatttgg agcgtggtac acaagcagat 2820 cctttgttaa ggaaggtgtt ccagttcgtc caaaatggtt ggccaaaggg tgaagagaag 2880 actgaagacc taaagcaatt cttcgcacga cgagaggcct tatctacggt tggaaaatgc 2940 ctcttattcg gagaacgggt agtagttccc gcgaatcttc ggaatcgagt gttaaaactg 3000 ctacacaagg gtcatcctgg agtatgccgc atgaaggcgc ttgcaagagg gtatgtgtat 3060 tggccgaaat tggacagcga aatcgagaat gcagtgaaga cgtgtcagtc gtgtgcaagt 3120 gcggcgaaat ctccagagca ttgcacacca gtagattggc caaaaactac aggcccctgg 3180 cagcgtattc acgttgacta cgctggacct atggagggag agtattattt tctggttgta 3240 gattcctact cgaaatggcc ggaaataata ccaacgcgtc aaaccacagc acaagcgaca 3300 gttcaaatgt tacggggatt gtgtgcacgg tttggtatac cagaaactgt cgtaagcgat 3360 aatggaacac agtttacaag ttcggagttc cgtgactttt gtatggagaa tggtattcat 3420 cacgtgcgga ctgcgccata ccacccgcag tcaaacggcc aagccgaaag gtttgttgac 3480 acgtttaaga gagcggtcaa gaagattcgg gagggtagag gtgagatgca cgaggctttg 3540 gatgaatttc ttttcaccta caggaacacg ccgaatagaa tcttggatca aaaatcgccg 3600 ggggagatga tgttaaatcg caaggttcga acgtctctgg agatgttgcg tccatcattt 3660 cgcgatattc gaccagcgga tccagtggtg gccgacgaag ctactaggag 
gagtttctcc 3720 ataaacgacc gtgtctacgc aaagcgatac catcgaaacg gctgggaatg ggtatctgga 3780 gtagtcgaca agcgcgtggg taacgtaatg tacatggtca aaatcgatgg caagaccttg 3840 cgctatcatg tgaaccagct tcggatgcgt aatgcagaga ccaacgaagg gcgaactcag 3900 caaccaatgc cactgaatgt gttattagat gcctgggata tgtctggaac aacggtgtca 3960 gcatcctcat ctgaaacatc ggcagcgact acaacaaatc ttgcggaacg tgagatgaca 4020 cctgcggatg gtacagaagc aagcctacgc cgctcgacga gagtgaggaa gcagcctgcg 4080 tggttggagg ggtatcatcg cagttaaagc gggggaa 4117 // ID CR1-3_AG repbase; DNA; ANG; 4693 BP. XX AC . XX DT 12-MAR-2003 (Rel. 8.02, Created) DT 03-SEP-2010 (Rel. 15.1, Last updated, Version 3) XX DE CR1-3_AG is a CR1-like non-LTR retrotransposon - a consensus DE sequence. XX KW CR1; Non-LTR Retrotransposon; Transposable Element; KW reverse transcriptase; CR1 clade; DNA/RNA-binding; PHD finger; KW AP endonuclease; CR1-3_AG. XX NM CR1-3_AG. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-4693 RA Kapitonov V.V. and Jurka J.; RT "CR1-3_AG, a family of CR1-like non-LTR retrotransposons from RT African malaria mosquito."; RL Repbase Reports 3(2), 14-14 (2003). XX DR [1] (Consensus) XX CC CR1-3_AG is a family of CR1-like non-LTR retrotransposons. CC The CR1-3_AG consensus sequence was reconstructed based on CC multiple alignment of ~50 copies identified in the CC sequenced portion of the genome. Given the ~3% divergence CC of these copies from the consensus sequence, transposition of CC CR1-3_AG occurred less than 2 million years ago. CC The 3' terminus of CR1-3_AG is composed of the ATAA CC microsatellite. CC CR1-3_AG encodes two proteins: a 418-aa CR1-3_AG-ORF1p CC (positions 265-1518) and 910-aa CR1-3_AG-ORF2p (positions CC 1522-4251). CR1-3_AG-ORF1p is a DNA/RNA-binding protein composed CC of the PHD domain (positions 5-40).
CR1-3_AG-ORF2p is composed CC of the AP endonuclease and reverse transcriptase domains. CC Putatively, the last protein is translated through the ribosomal CC frameshift. XX FH Key Location/Qualifiers FT CDS 265..1518 FT /product="CR1-3_AG-ORF1p" FT /translation="MAGICSACANDIVAADRIVKCQGWCNSEFHFSCSGLS FT EELSATIESCAQLFWACKACVKFHKDPRTAVLRSSTPYTHTSVDLLSSIAD FT LKAGLRSELSQHTTAIKLELLEVLKAEIRSCLRSTHAATDLPSQRPIHHNS FT APKLFNSVVKNIPGTITPTLQQYPSLAASLSVSANEPSSFTNMPPXYNPQT FT PLLRGSGSPLDSDTLDTIPHTDTRMWLFFTRFSPSVTTEQISLMVQVRLAL FT DKRDVFVHRLTKLGADTSTLSFISFKVGIPATLRNKALSPKTWPSALTYRE FT FRDYRTNNYNTNTCATTETNSMLDQNVTLATDTYHNPLSHSGGSAAMPTTT FT TDTLAQPVTISSPAMTTTDTALFTNEPHLDLNERMLVTTTPTRSPPECLAA FT PKKRPKRGNAKRTDESADAGPSDE" FT CDS 1522..4251 FT /product="CR1-3_AG-ORF2p" FT /translation="QSCTASGDAPRSAHPSLSIYYQNVRGLRTKTTNLRLA FT LSESEYDFIILTETWLTQSIPSSLLTDDHYHIYRCDRNLSNSALSRGGGVL FT IACSSSIPTCEIASPNTILEQLWIKTLLPGVSVYIGVVYIPPSHANDPAVM FT NALHDSVREISSRIKESDLLYVFGDFNKPDIRWELTNTSEATDCSPCYSVM FT HYAPLCNSVANTDFVDGLHSTGLFQLSGIANQSGRQLDLVFANLAATNILC FT DSITPLHSVNGTSALENSLPYVTHCSIPLLSEDFHHPSLDMMIYYPVQLSH FT TTNSHTRSTVNRNFFKTNVERMNSLIVSFDRNFDCSNFATIDEATDFFSVF FT MRSAINSCVPVAQRKSGPDWSNASLRRLKKIKSKAYADYSRTRSSLHRRIF FT FDALNNYRRXNRVLYRSFIRRTERQLFSKPTRFWSFWNKRRNIRSIPPSMS FT YNGXTSIDTSDICNTFANRFADAFTLPVHNPNTLAEATRNTPSDAIDFIIP FT TIDEALIARTLNDIKPSTSSGPDNIPAYILKHCRQSLAPILAKIFNDSLMR FT GTYPASWKHARMVPIHKKGSRLHASNYRGIVSLCACAKVFELILYNPLLTA FT VQNYMSPSQHGFLPRRSSTTNLAEFVGYCFDNMDRGTQVDAVYIDFRAAFD FT SISHDILLSKLKKLGFLDWHITWLRSYLTGRSYYISIGSHRSHSFTSSSGV FT PQGSNLGPLLFLIYINDLSFVLPPGQHLMYADDVKIFAPVRNDSDCVRLQT FT ILENLDSWCSRNALQVCADKCQCISFSRARHPITFTYTMLNTALARTTCIR FT DLGVLLDQKMSFRPHIDSVVAKGNQLLGLITRTCSEFTDPMCVKSIFCAIV FT RSCLEYCCPIWCPLGVGDINRLEAIQRRLTRYAVRLLPWQSHHARPTYHQR FT CLLLGH" XX SQ Sequence 4693 BP; 1219 A; 1248 C; 899 G; 1320 T; 7 other; gatctgtgat gtacatatta taatgtacac actctcattc caaacgtcac tgtgactggt 60 cgaagattct ctgcgctccg actattttca atattattcc gagaaaacct gtatctttgc 120 tgaacctgtt gctggtgtag gtgacttcat
tgcctggttt tgtttaacaa ctggattatt 180 caccgtcaat cacctggaaa cgttttccgg cccacaatca ttacgtcgct tcaacatcac 240 aatcaatcgc tcatccacaa agcaatggct ggtatttgtt ccgcctgcgc taacgacatc 300 gtagctgctg atcgcattgt gaaatgccag ggttggtgca actctgagtt tcacttctca 360 tgcagcggac tttctgagga actgtccgct actatagagt cctgtgcaca acttttttgg 420 gcctgtaaag cctgtgtaaa gtttcacaag gatccgcgta cggccgtgtt gaggtcatcc 480 accccgtaca ctcacacttc tgtcgaccta ctgtccagca tagccgacct taaagcgggc 540 ctccgtagcg agctgtcaca gcataccaca gctattaagt tagagcttct ggaagtttta 600 aaggcggaga tccgttcctg cttgcgatcg acgcatgccg ccaccgattt accgtctcaa 660 cggcccattc atcacaattc agcgcctaaa ttgtttaatt cagtggtcaa aaatatccct 720 ggtactatta cacccacact acaacagtat ccatcattag ccgcttctct tagtgtgagt 780 gcgaacgaac cgtcatcctt caccaatatg ccaccasttt acaacccgca aacaccacta 840 ctcagaggat cgggatcgcc gctcgattct gacacactag acaccatccc acacactgat 900 acgcgaatgt ggctattctt tacgcgtttc tccccatcgg ttaccactga gcagatttct 960 ctcatggtgc aagtacgtct agcactcgat aagcgggatg tgtttgtaca ccgtctgacg 1020 aagcttggtg ccgacactag tacactctca tttatctcat ttaaggtggg cataccagcc 1080 actctacgca acaaggctct ctcacctaag acatggccct ctgctcttac ctaccgagag 1140 ttccgtgact atcggaccaa taattataac actaatacct gtgcaacaac tgaaacaaat 1200 tcgatgctcg atcaaaacgt tactctcgct accgacactt atcataatcc tctctcacat 1260 tctggaggga gtgccgcgat gccaactact acaacagata cgctcgcaca gcccgttacg 1320 atttcctcac ctgccatgac cactacagac acagctttgt ttacaaacga accacatctt 1380 gaccttaatg agcgaatgct tgttaccacc acacctacca ggtcacctcc cgaatgctta 1440 gccgctccta aaaaacgacc gaagcgcgga aacgctaaac ggactgatga atctgctgac 1500 gctggcccgt cggatgaata gcaatcttgc actgcatccg gcgatgcacc tcgctcagct 1560 catcccagtc tctctatcta ctaccaaaat gtacgtggtc tacgaacgaa aactacaaat 1620 cttcgcctgg cgctatcaga atcagaatat gattttatca ttctcaccga gacttggctt 1680 actcagtcca taccttcttc gctcctcact gacgatcatt atcatatcta caggtgcgat 1740 aggaatcttt ccaacagtgc cctctcacgc ggtgggggtg ttttaattgc atgttcctct 1800 tcaataccga catgtgaaat cgcatcgcct aataccatac tggaacaact 
ttggatcaaa 1860 acattgctgc caggtgtctc tgtttacatc ggcgttgttt acattccgcc tagtcatgcg 1920 aatgaccccg cagtgatgaa cgctttacat gatagtgtac gtgaaatttc aagccgcatt 1980 aaagagagcg atttattata cgtcttcgga gatttcaata aacctgatat cagatgggag 2040 ctgactaata catcagaagc caccgattgc tctccatgtt attctgtcat gcattatgca 2100 cctttatgca attccgtggc taataccgat ttcgttgatg ggttgcatag taccggatta 2160 tttcagttga gtggtattgc aaatcaatct gggcgtcaat tggatctggt cttcgcaaac 2220 cttgccgcaa ccaatatttt gtgcgactca atcacacctc tgcactctgt gaatggtact 2280 tcggctctag agaactccct cccatacgta acacactgta gtattccact tctcagtgag 2340 gactttcatc atccttcatt ggatatgatg atttattatc ccgtacaact atcccacacc 2400 accaacagtc acactcgcag tacagtcaat agaaatttct tcaaaacgaa tgtggaacgt 2460 atgaattctc ttattgtgtc gtttgaccgc aattttgact gctccaactt tgccactatc 2520 gacgaagcca ccgatttctt tagcgttttt atgcgctcag cgattaattc ctgcgttcct 2580 gttgctcaac gaaagtctgg ccccgattgg tctaatgcat ctttaagacg gttgaaaaaa 2640 ataaaatcaa aagcctacgc ggattacagt agaacgagat catcgctgca taggagaatt 2700 tttttcgatg cactgaacaa ttatcgtcga cawaatcgtg tgctctaccg ctccttcatt 2760 cgccgtactg aaaggcagct gttttctaag ccgacacggt tctggagctt ctggaacaaa 2820 cggcgcaata taagaagtat ccctccgtca atgagctaca atggcsaaac tagtatcgat 2880 acatccgata tttgcaacac tttcgccaat cgtttcgctg atgcattcac ccttcctgtt 2940 cacaatccta acacactagc agaggccacy cgcaatactc catcggatgc tatcgatttt 3000 attataccca caattgacga agcattaatt gcgcgcacac tcaacgatat aaaaccatct 3060 acatcatctg gacctgacaa tattcccgca tacattttga agcactgccg tcaatcactc 3120 gcacccattc ttgccaaaat atttaatgat tcccttatgc gtggcacgta tcctgcgtcc 3180 tggaaacacg cgcgaatggt tcctatccat aaaaaaggca gtcgacttca tgctagtaat 3240 tatcgtggca ttgtttccct atgcgcttgt gcaaaggtgt ttgagctcat tctatacaat 3300 ccgctactca cagcagttca aaactatatg agccctagtc agcatggatt tctcccaagg 3360 agatcttcca ccacaaatct tgctgaattt gttggytact gcttcgacaa catggatcgt 3420 ggtactcaag ttgatgcagt atatatcgac ttcagggctg cgttcgatag tatytctcat 3480 gatattctac tctcgaagct aaaaaaactc ggtttcctcg actggcacat cacctggctg 
3540 cgttcatatt taactggtcg ttcgtactac ataagcatag gatctcatcg ttctcactcc 3600 ttcaccagct cctccggtgt gcctcaaggg agtaatttgg gaccgctact cttcctcatc 3660 tatataaatg atctatcttt cgttttaccg ccaggccaac acctaatgta cgccgacgat 3720 gtaaaaatat tcgctccagt tagaaacgac agtgactgtg tacgccttca aacgatcctt 3780 gagaatctgg atagctggtg cagcagaaac gccctccaag tgtgtgctga taaatgccag 3840 tgtatatcat tcagcagagc ccgtcacccc atcacgttta catacactat gctcaacacg 3900 gctttggctc gcacgacatg tatccgtgat ctgggggtgc tactcgatca gaagatgtca 3960 tttcgccctc acattgatag cgttgttgcg aagggaaatc agctacttgg tttaattacg 4020 cggacctgta gcgagtttac cgatcccatg tgcgtcaagt cgatcttctg tgccatcgta 4080 aggtcgtgcc tggagtactg ctgtccgatc tggtgcccgc ttggcgttgg tgacatcaat 4140 cgcctcgaag ccattcaacg gagactcacc aggtacgcgg ttcgactcct tccatggcaa 4200 tcccaccacg ctcggcccac ctaccatcag cggtgtctgc ttctcggaca ttgaaccact 4260 ctgctctcga cgtaaaatat gacgcccaat gccttttcat attccggctc cttaaachcg 4320 gagagatcga ttccccggca tcgaatacta gccagcatca atttgttcgc tccctgtcgg 4380 attcttagat ccaatttcca tctccgtgta ccgcgtaccc gcaacaacca ttagccaggg 4440 acaccctatt atacgtatgt ccctcgagtt caatgaagtg ttagatttgt tcgattttag 4500 tatgtctact tctacgttca aggagaaatt gcgtctacgt cacatttaat gtttgcttat 4560 aactagaggt ctatttatat gctataaatg atcattgtta tacatacctt aatttaatta 4620 taaggttgcc gattagacac gatggtccgt cggtttatat atgaaataaa taaataaata 4680 aataaataaa taa 4693 // ID IKIRARA1 repbase; DNA; ANG; 610 BP. XX AC U55049; XX DT 15-SEP-1998 (Rel. 3.08, Created) DT 01-JUL-2005 (Rel. 10.08, Last updated, Version 2) XX DE Anopheles gambiae transposon Ikirara1. XX KW IKIRARA1. XX NM IKIRARA1. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-610 RA Romans A.P.; RT "IKIRARA1."; RL Direct Submission to Genbank (15-APR-1996)Zoology, University of RL Toronto, 25 Harbord Street, Toronto, Ontario M5S 3G5, Canada. 
XX RN [2] RP 1-610 RA Romans P., Bhattacharyya R.K. and Colavita A.; RT "Ikirara, a novel transposon family from the malaria vector RT mosquito, Anopheles gambiae."; RL Insect Mol Biol 7(1), 1-10 (1998). XX DR GenBank; U55049; Positions 1 610. XX SQ Sequence 610 BP; 181 A; 122 C; 123 G; 184 T; 0 other; cagggtttca cactttatct caagaccgcg ggaccccttc ccgaatctgt cttagccaaa 60 gccaagactg cggtatgatg cttgaaaacg ttgatgatac ctgaattcat cggatgcaat 120 gctgaatctg cggcacattt ctcgaaacac actggtccgc gccgaggaca tgcggtcgaa 180 tattttttta cacggataac ctttgtgtac atcagtgtac tgaaaacagt taattatttt 240 ttcgtgtttt taccagaaaa gattatttaa acagttcaaa ctaatttcta aacacttgta 300 aatattttaa ttaaacaaat atgcattcac tcattttatt aaattgtctt tttttggtaa 360 aaaaacgaaa aggataattg agtgtttcta atagactgat gtacacaaag gttatccgtg 420 taaaaaaata ttcgaccgca tgtcctcggc gcggaccagt gtgtttcgag aaatgtgccg 480 cagattcagc attgcatccg atgaattcag gtatcatcaa cgttttcaag catcataccg 540 cagtcttggc tttggctaag acagattcgg gaaggggtcc cgcggtcttg agataaagtg 600 tgaaaccctg 610 // ID Ag-L1-5 repbase; DNA; ANG; 4468 BP. XX AC . XX DT 29-OCT-2010 (Rel. 15.1, Created) DT 29-OCT-2010 (Rel. 15.1, Last updated, Version 2) XX DE An L1 clade non-LTR retrotransposon family from Anopheles DE gambiae. XX KW L1; Non-LTR Retrotransposon; Transposable Element; Ag-L1-5. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-4468 RA Biedler J. and Tu Z.; RT "Non-LTR Retrotransposons in the African Malaria Mosquito, RT Anopheles gambiae: Unprecedented Diversity and Evidence of Recent RT Activity."; RL Molecular Biology and Evolution 20(11), 1811-1825 (2003). XX RN [2] RP 1-4468 RA Kojima K.K. and Jurka J.; RT "L1 clade non-LTR retrotransposons from Anopheles gambiae."; RL Direct Submission to Repbase Update (24-SEP-2010). XX DR [2] (Consensus) XX CC [2] Consensus update.
This consensus is generated from 6 CC sequences with >98% identity. XX FH Key Location/Qualifiers FT CDS 121..1086 FT /product="Ag-L1-5_1p" FT /translation="MSATGQSSIKLDFSHVPNKPTTNEVIDFITGKLGISM FT IQVNRIHLRTSSVSCHVDIKEQSIALDIVEKNDGKHTIQCKGVEYPIPITM FT DDGSTIVKIHDASAQVTNTHIKNHMQRYGDVICVTEGVWSDAFACAGIPNG FT YRYVRMVIKKQIPSYIHINGETTLATHRGQQHTCRKCDHPVHHGMTCLSNR FT KLNSQKTNLMDRLKSYADVTTSGIQQKQNTLNTTTTTNTIPVEQPPQPQPG FT TSGLKFIPIKQTDAMTNDNSTVTETAVNLETNKGYRSRSPSVKRQAEKKLT FT DFITVESKRWRTRYTKSAEDKITKGRRQSR" FT CDS 1159..4377 FT /product="Ag-L1-5_2p" FT /note="apurinic-like endonuclease and reverse FT transcriptase." FT /translation="MANTTTHSNITEFKKSYRIGSINICNISNTTKTDALI FT ATVKKLDLDIICMQEVNDSSLEIIQGYELITNIDERQRGTAIALKQDISFT FT HVEKSLDGRIISLCFGRTKLINIYAPSGSQNRIQRERFFQQSLAYHLRHPV FT ENIIIVGDFNCIIDKKDALGTSNYSPALRNAIDNLNLVDTWNLLKNDPEFT FT YITSNSASRIDRIYISRNLKENLRSVDIHSVSFSDHRAYVIRLALPNNNQQ FT IITRQKLWSYRKYITTQEIQNEFKENWIRWTRQKKHYGNWLTWWIKYAKPK FT IKSFFLWKTKMHYKEYNQTHDRLQSRLNESYSKLVNNPDEKITINRLKAEM FT LTLQRKLTQNFIKENNNYIAGEPMSTFHLFKRRKNRNFIKKLRLNNRTLEN FT EKEILDAIQEQFKLLFSGNDNCLNSTPIIPRKIPEACVLNNDLTKEIETVE FT IFHTIRTATKNRSPGLDGLPYEFYETFFDIIHKELNLIINEALDNDIPPEF FT VEGIIVLIPKDNNTFEIENWRPITLLNADYKIFSRILKFRMKNILDTHNIL FT TSSQKCGNGKNNIFQALLSIKDRIMNLKEKKIAGKLISFDMEKAFDRVDHT FT YLYKTLTAIGFKENFITLIKKIYSVSTSRLYINGSLQEKLHIKRSVRQGDP FT MAMHLFTLYIHPLLCKLENICNSQNDLVNCYADDISIITTDLNKIFNIKKA FT FEEYETESGAKLNTNKSLAMDIGVVNNQNRINTEWIENREKIKILGILFTN FT SFRLTMKLNWEATTHKFAQLVRQHRMRELNIQQRVILANTFLTSKLWYIGS FT VMTISKYHLGRLTFYLGIFIWSNGGPRVKMQQLALNKAKGGLNLLIPELKL FT KTLLINRHLIEKQHMPFTKSFLNQIENPPNIKDLPSFYPCLVPIRTELAYL FT NDRYKNNPSAKEIYKKVTDSLPLPHIQSKTPSLHWNKIWRNINNNKMNSKQ FT RSYYYLLVNEKIMTEEIKFKFGMKTNPNCEKCGNIEDLEHKFKDCKNIKNI FT WILFTQIVKDELHIKPNFHELIKPDMNNIKKEKRAKIMSLFINYLIFIYKE FT NFTNSIGKNQLKNIIC" XX SQ Sequence 4468 BP; 1803 A; 808 C; 734 G; 1123 T; 0 other; tgtacagttc ggttctagag ctcgagcagt acagacgtag atgcagagtg ctcagcaaaa 60 caaaaaaaaa actaaagaca agtgtgcgtt aaaaatacta gcgtgtggca gaccggcacc 120 
atgtcggcaa cgggacaatc cagcatcaag ttggactttt ctcatgtccc aaacaagcca 180 acaactaatg aagtgattga cttcataacc gggaaacttg gaatttcgat gatacaggtg 240 aacagaattc acctccggac gtcgtccgtt tcatgtcacg tagacatcaa ggagcaaagt 300 attgccttgg acatcgtaga aaagaacgat ggcaaacaca ctattcagtg taaaggtgtg 360 gaatatccca taccaattac tatggacgat gggtcaacga tcgtgaagat tcacgacgct 420 tcagcgcaag ttacaaacac acacataaaa aaccacatgc agagatacgg agacgttatc 480 tgcgttaccg aaggtgtgtg gagcgacgcc ttcgcatgtg ccggtattcc gaacggatat 540 cgctacgtta ggatggtaat caagaaacaa ataccatctt atatccatat caacggtgaa 600 acaactctcg ctacccatag aggacagcag cacacatgtc gcaaatgcga tcatcccgta 660 caccacggaa tgacctgttt gagcaacaga aagctcaaca gccaaaaaac aaacctaatg 720 gataggctaa aatcgtacgc agacgtaacg acgagtggga tccaacagaa acaaaataca 780 ctaaatacaa caactacgac gaatacaatc ccagtggaac aaccaccaca acctcaaccg 840 gggacatcgg gactgaaatt tataccaata aaacaaactg atgcaatgac gaacgataac 900 agcaccgtca cagagacagc ggttaatttg gagaccaata aggggtacag gtcacgatcg 960 ccttcggtga aaagacaagc tgaaaaaaaa ctgactgact ttataacggt agaatcgaaa 1020 cgatggcgaa cgaggtatac gaaatcggct gaagacaaaa ttacaaaagg taggagacaa 1080 tctcgataaa ttataattac taacacagct gaaaaaaaaa acgattcgac acatcgcaaa 1140 gtaaaagata atttcgaaat ggccaacacc actacacatt caaacataac agaattcaaa 1200 aaatcctatc gtataggaag tatcaatatc tgcaatatat ctaacacaac caaaacagac 1260 gctcttatag ccacagtaaa aaagctagat ttagatataa tatgcatgca agaggtcaat 1320 gattcgtcat tggaaataat tcaaggctat gaattaataa caaacataga tgaacgacaa 1380 agagggactg caatagcttt gaagcaagac atttcattta cccatgtaga aaaaagttta 1440 gacgggagaa ttatatccct atgctttggc agaacaaaac taattaatat ctacgcacca 1500 tctggtagcc aaaacagaat acaaagagaa cgtttctttc aacagtcgtt agcatatcac 1560 ttaagacatc ccgttgaaaa tataattatt gtaggagatt tcaactgcat tattgataaa 1620 aaagatgcac ttggaacctc gaactatagc ccagcattac gtaatgcaat tgataatttg 1680 aatttagtag atacttggaa tttgttaaaa aacgatccag aatttactta cattacttca 1740 aactctgctt cacgtataga taggatctat atatcacgta atctaaaaga aaatctccgc 1800 tcggttgaca ttcacagcgt 
ttcgtttagc gatcaccgtg cttatgtcat acgattagca 1860 ctgccaaata ataatcaaca aataataaca aggcaaaagc tatggtcata taggaaatat 1920 attacaacac aagaaattca aaacgaattc aaagaaaact ggataagatg gaccaggcaa 1980 aaaaaacact atggaaactg gttaacatgg tggataaaat atgcaaaacc aaaaatcaaa 2040 agttttttcc tatggaaaac taaaatgcat tacaaagaat ataatcaaac gcatgatagg 2100 ttacaatcta gattaaatga atcatattcg aagcttgtaa ataatcctga tgaaaaaata 2160 actataaatc gtttaaaagc tgaaatgctt acactacaaa ggaaactgac acaaaacttt 2220 atcaaagaaa acaataacta cattgcagga gaaccaatgt ctacttttca tctatttaaa 2280 cgaaggaaaa atcgtaattt cattaaaaaa ctccgtttaa ataatcgaac attagaaaac 2340 gaaaaagaga tactcgacgc cattcaggaa caattcaagc tactcttttc aggcaacgat 2400 aattgtttaa attcaactcc aattattccc agaaaaatac ctgaggcatg tgtgctcaat 2460 aatgatctaa caaaagaaat tgaaactgta gaaatatttc ataccattcg aacagcaact 2520 aaaaatcgta gtccaggatt agatggttta ccgtatgagt tctatgaaac attttttgat 2580 attattcata aggaacttaa tcttataatc aatgaagctc ttgataatga tattccaccc 2640 gaatttgttg aaggaataat tgttttaata ccaaaagata ataatacctt cgaaatcgaa 2700 aattggaggc ctataacact gctcaacgca gattataaga tttttagtcg aattttgaaa 2760 ttcagaatga aaaatatcct tgacacacac aatatcctaa catcttctca gaaatgtggc 2820 aacggaaaaa acaatatctt tcaggcactt ctttcgatta aagatagaat tatgaacctc 2880 aaggaaaaga aaatagcagg aaaactcatc tcttttgaca tggaaaaggc atttgatcga 2940 gtcgaccata catatttgta taaaaccttg acggcaattg gatttaaaga aaactttata 3000 acgctcatca aaaaaattta ctcagtatct acttcaagat tgtacataaa tggatccctc 3060 caagaaaaac ttcatataaa aagatcggta aggcaaggag accccatggc aatgcattta 3120 tttaccttgt atatacaccc gcttctatgc aaattagaaa acatatgcaa cagtcagaat 3180 gatctagtca actgttatgc agatgacatt tctattatca caacggatct taataaaata 3240 ttcaacataa aaaaagcatt cgaggaatac gaaactgagt caggagcaaa gcttaacaca 3300 aacaaatccc tggcaatgga cataggagta gtcaataatc aaaatagaat caacacagaa 3360 tggattgaaa acagagaaaa aatcaaaatc ttagggattc tattcacaaa ctcatttcgt 3420 ttgacgatga aattgaattg ggaagcgaca acacataagt ttgctcaatt agtaagacag 3480 cacagaatga gggagctcaa cattcagcaa 
cgggttatac tagccaatac tttcctaaca 3540 tcgaaattgt ggtacattgg atcagttatg acaataagca aataccatct tgggaggctt 3600 acattctatc tgggaatatt catatggtca aatggagggc caagagtaaa aatgcaacaa 3660 cttgctctaa acaaagccaa gggaggtctt aacttgctaa taccagaact caaacttaaa 3720 acattattga tcaaccgaca cctaattgaa aaacaacaca tgccatttac aaaaagcttc 3780 ttgaatcaaa tagaaaatcc acctaacatt aaagatcttc catcattcta tccttgttta 3840 gtgccaataa gaactgaact agcctattta aatgatagat ataaaaataa tccttcagca 3900 aaagaaattt acaaaaaagt tacggattcg cttcctctac cacatattca atctaaaaca 3960 ccatcactgc attggaacaa gatatggcgt aatattaata ataacaaaat gaattcaaaa 4020 caacgaagct actattatct tttggtaaat gaaaaaatta tgacagaaga aattaaattt 4080 aaattcggaa tgaaaacaaa cccgaactgt gagaaatgtg gaaatatcga agatctcgag 4140 cataaattta aagattgtaa aaatattaaa aacatatgga tattgtttac acaaatagtg 4200 aaggacgagt tgcatataaa accaaatttt catgaactta ttaaaccaga tatgaataat 4260 ataaaaaaag aaaaaagagc taaaattatg tcattattta tcaattattt aatatttatt 4320 tacaaagaga atttcaccaa tagcataggc aaaaatcaat tgaaaaacat aatatgttaa 4380 acaagaatta ttatagattt aagctagcga ataagaattt tttgttataa gtgtgaataa 4440 actgttcttt gttttataaa aaaaaaaa 4468 // ID GYPSY57-LTR_AG repbase; DNA; ANG; 296 BP. XX AC . XX DT 09-JUL-2004 (Rel. 9.06, Created) DT 09-JUL-2004 (Rel. 9.06, Last updated, Version 1) XX DE GYPSY57-LTR_AG is an LTR of retrotransposon GYPSY57_AG - a DE consensus sequence. XX KW Gypsy; LTR Retrotransposon; Transposable Element; KW 5-bp TSD GYPSY57_AG; GYPSY57-I_AG; GYPSY57-LTR_AG; Gypsy clade; KW mag lineage. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-296 RA Tubio M.J., Costas C.J. and Naveira F.H.; RT "GYPSY57_AG, a member of the mag lineage of the Ty3/gypsy group RT of LTR retrotransposons in Anopheles gambiae."; RL Repbase Reports 4(6), 154-154 (2004). 
XX DR [1] (Consensus) XX CC GYPSY57-LTR is a long terminal repeat of GYPSY57_AG (its CC internal portion is deposited as GYPSY57-I_AG). XX SQ Sequence 296 BP; 85 A; 58 C; 84 G; 69 T; 0 other; tgttatgggc caacctgatg atcccgtaga tcgctaccga ggacactcat cgggcgtaga 60 tcgaagtgtg tgtgcgatct atcgggaggt cagcgaacgc gcgctcggca cgatgcgtga 120 gtgccgatcg aggggaaaga tatggcgcgc gcttgagaaa ggtcgtggaa caaagtggtg 180 gcaaaattgt ttacgagtta ttaataataa acaaggactt tccaatccca tcgttttgaa 240 ggaagtaacg cgttaacgtt tcttagcgtt cttcgaaaga aaacgaaaat ataaca 296 // ID GYPSY52-I_AG repbase; DNA; ANG; 5341 BP. XX AC . XX DT 16-APR-2004 (Rel. 9.03, Created) DT 16-APR-2004 (Rel. 9.03, Last updated, Version 1) XX DE GYPSY52-I_AG is an internal portion of retrotransposon GYPSY52_AG DE - a consensus sequence. XX KW Gypsy; LTR Retrotransposon; Transposable Element; 4-bp TSD gag; KW AP protease; CsRn1 lineage; GYPSY52-I_AG; GYPSY52-LTR_AG; KW Gypsy clade; RNase-H; reverse transcriptase; KW integrase GYPSY52_AG. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-5341 RA Tubio M.J., Costas C.J. and Naveira F.H.; RT "GYPSY52_AG, a member of the CsRn1 lineage of the Ty3/gypsy group RT of LTR retrotransposons in Anopheles gambiae."; RL Repbase Reports 4(3), 96-96 (2004). XX DR [1] (Consensus) XX CC GYPSY52_AG is a family of Gypsy-like LTR retrotransposons CC that, according to the amino acid sequence of its Reverse CC Transcriptase, CC RNase and Integrase is CC phylogenetically grouped with representatives of the CsRn1 CC lineage of other organisms. CC GYPSY48_AG, GYPSY49_AG, GYPSY50_AG, GYPSY51_AG and GYPSY53_AG CC are other members of this same lineage in Anopheles gambiae. CC The GYPSY52-I_AG consensus was reconstructed after multiple CC alignment of 7-8 copies. CC The consensus encodes the 327-aa GYPSY52_AG1p gag-like CC polyprotein (pos. 
922-1902) and the 1137-aa GYPSY52_AG2p CC pol-like polyprotein (pos. 1906-5316). CC The sequence of the LTRs flanking GYPSY52-I_AG is deposited as CC GYPSY52-LTR_AG. XX FH Key Location/Qualifiers FT CDS 1906..5316 FT /product="GYPSY52_AG2p" FT /translation="RNAGDVQLLVDSISSASHRLIIKDYKTNQPFLIDTGA FT DVSVIPRQHSSVPCKPSTMKLFAANSTPIQVYGESLYTLDLGLRRAFLWNF FT VIADVGTAIIGADFLQHFHLLVDLRDKCLVDALTNLRTPGVPDPHPGEPTV FT KVCDSTSPIATLLREFPGLTARTAPGTLLQSDVTHRIETTGQPTFARPRRL FT SPEKYAAARAEFESLVQLGVCRPSNSSWASPLHMTKKADGTWRPCGDYRAL FT NAKTIPDRYPLPFLQDFTMHLQGKTIFSKVDLHKAYHQIPIHPEDIPKTAI FT TTPFGLFEFTTMPFGLRNAAQTFQRLIHDVLRGLDFVFPYIDDMIVASSSE FT EEHHEHLRQLFQRLEQHHLAINPAKCEFNRSEIAFLGHLVNAEGIRPLPER FT VRAISELSKPATIMELKKFLAMINYYRRFLPHALTTQSILLEMTPGNKKKD FT KTPLKWTPESSEAFDRCKEQLQQAALLAHPALNAELSLWTDASDFAAGAVL FT HQRIDGQLQPLGFFSKKFEKAQLNYSTYDRELTAIYLAVRHFRYQLEGREF FT CIYTDHKPLTFAFRQTLDSSSPRRARQLDFIGQFSTDIRHVSGEENITADL FT LSRIEIVNASPAIDFERLAEEQTNDPELADILSGKTRTDLFLQKTPIPGST FT QSLYADCPGGIIRPYITRSFRNQLLHAVHDLSHPGARATAKLMTERFVWLD FT IKRDAQEFARNCLACQRAKIGRHTKSPLVPYPATQDRFSHINIDIIGPFPI FT SNGNRYCLTIIDRYTRWPEAIPIPDITATTVVSALLYHWIARFGVPSHVTT FT DQGRQFESALFKELTRALGTKHIRTTAYHPQANGLIERWHRTLKAAICCKD FT TSKWSEHLPLILLGLRTTFKNDINASPAELVYGTTLTIPAEFLIEKPQPAM FT VNQSDFAKTLREAMSKIRPSNTAWHTNRTSFVHSDLNKCSHVFVRNDTVRP FT ALTTPYHGPYQVLTRNSKSFQILLNEHPSLVSVDRIKPACTTEGIISSAPQ FT QPSPDQLLTSQGYATPVAQPPSTQQSTMSQRLTTPMPGPSTLDQLPSYNQS FT SLPDQQPTATQSPSTIQLPSTNQPPMSRNRATRTSPMPSLQPARATTSFAP FT PPPILRKDQNVSTGVTRSQRRVVIPLRYR" FT CDS 922..1902 FT /product="GYPSY52_AG1p" FT /translation="MQRSPQQGAAPVASPAVAISPAVAASPAVVASPAAGL FT PTPSPYLIDITPLARDASTDLPEFQVETVNAMRLKPPELDTADIHTFFYAL FT ENWFDAWNISPHHHVRRFNILKTQIPTRIPPELRPILDSVPSTDRYESAKK FT AIIQHFEESQRSRLHRLLSEMSLGDRKPSQLLAEMRRTANGAMTDSMLIDL FT WIGRLPPYVQSAVIASSQNADEKVKVADSVVDSFALYNRSGPYQTIAEVRN FT EEVNRLSRQVAELSQRLETLMNQNQARERSRARSRSRNRNTNQATNPNTNG FT YCFYHDRYGQQARNCRAPCSFNSRPQNNGTPTTSA" XX SQ Sequence 5341 BP; 1383 A; 1681 C; 1197 G; 1080 T; 0 other; actggtgacc ccgacgtgat ccgcaaccga
cgccgcgaat tcgtacgtga gtgtgtgtgt 60 gtactacgca tcgcaaactt gtaccgcgtt gaacgtgtta aaacgtgcca ccggcatcgg 120 agatcgacgc agagaagaca acgcgtgtgt gtgtgtgcct acgaccggga gtgctaccag 180 cgttagaggt accggcaacg cggctggcct cctcccccct ttccctgtaa agtgcgcaac 240 gagcgccctt tcatcattca caggacgttg agtgcgggcc gacgaaaaaa cgcagcaacg 300 ccgtgttagt gcgcccgtag ctacacggag cgttcctgca ttcagcgaac aaaagaaccc 360 cccgcacaca cgcgcgccgc ctcgaagcga gacaagcgca cacacaccat catcggcgca 420 gcagaaaatt tccatcgagc tttcgacggc tgacgcaatc gacgcaattc ctggaagcgg 480 atcgacgaca atctccgctg gtggtctcct cgacgccgat caccgcctcg caagctgttt 540 ctacacctcg tgctactggg aattttccgt ttgtgtgtgc ttccatctac gggtgcgatc 600 aagaaggaag acaccaacgt cgagttgcgc acgcaaatta tccagccgta gcagtagtga 660 cacatcgccg ccatcacgat tttttcgctc gaaggcgaac ccgcttggat ccagctgaca 720 agaagaacga cgccgccatt ttctgtgcac catttcatcg caagtgagga aacgccattt 780 tctccacccg agaaggaaga cgacgcacgc tgcttctgaa gcgttttcat accgatcgac 840 cgcaacgacg acaccctggt tacgtgaaca attaaggtaa gatccaccct ttttattcac 900 tactaacccc gacgcggaaa gatgcagcgt agcccgcaac aaggtgcagc gccggtcgcc 960 agccctgcag ttgccatcag ccctgcagtg gccgccagtc ctgcggtggt cgccagcccc 1020 gcagctggac tccccacacc gagtccatac ttaatcgata taacccctct cgctcgtgac 1080 gcctctaccg accttccgga gttccaagtc gagaccgtga acgccatgcg tctcaagccg 1140 cccgaattgg atactgctga catccatacg ttcttctacg ccttggagaa ctggttcgac 1200 gcctggaata tttctccgca ccatcatgtt agacgtttta acatcctgaa aacacagatc 1260 cctacgcgaa ttcctcccga gttgcgtccc attctcgaca gcgtaccaag taccgaccgc 1320 tacgaatccg cgaagaaagc catcatacaa catttcgaag agtctcagcg aagccgccta 1380 caccgtttgc tatccgaaat gagcctcggc gaccgtaagc cttctcaact gctagccgag 1440 atgcggcgaa cagcaaacgg tgcaatgacc gactccatgt tgatcgatct ttggatcggt 1500 cggctgccgc cctacgttca gtccgccgta atcgcttcct ctcagaacgc cgacgagaaa 1560 gttaaggtgg ccgattcagt ggtcgactct ttcgccttgt acaatcgctc cggtccgtac 1620 caaaccatcg ccgaggtacg gaatgaggag gtcaaccgcc tttcacgaca ggtagccgag 1680 ctgagtcaac gtttagaaac tctaatgaac cagaaccaag ctcgtgaacg 
ctcacgggcc 1740 cgctcacgct ctcgcaatcg caacacgaac caggctacta atcctaacac taacggctat 1800 tgtttctacc acgaccgata cggacagcaa gcgcgtaact gccgcgctcc gtgctcgttt 1860 aacagccgtc ctcaaaataa cggtactcct actacttctg catgacggaa cgcgggagac 1920 gtccagcttc tagtcgattc gatatcatcg gccagtcatc gccttattat caaagactat 1980 aaaactaacc aacccttcct tatcgacacc ggcgccgacg tctccgtaat acctcgacag 2040 cacagctccg ttccttgcaa accatcgacc atgaaactat ttgccgctaa cagtacgcct 2100 atccaagtgt atggagagtc actctacacg ttggatcttg gtcttcgacg cgcctttctg 2160 tggaacttcg tgatcgcaga cgtgggtacc gcaataatag gagccgattt cctccagcat 2220 tttcaccttt tagtggacct gcgcgacaaa tgcctcgtag acgccttaac caacttacga 2280 acgccgggtg ttcctgatcc acatcctggg gagccaaccg taaaggtgtg cgattcgacc 2340 tcgccaatcg ccaccttgtt gagggagttc cctgggttga ctgcacggac cgcgcccgga 2400 acgctcctgc aatccgacgt cacgcaccgc atagagacta ctggacaacc gaccttcgcc 2460 cgtccgcgta ggttgtcccc cgaaaaatac gctgcagctc gcgccgagtt cgagtctctc 2520 gtacagctag gagtgtgccg accatctaac agcagctggg ccagtcctct gcacatgacg 2580 aaaaaagcag acgggacgtg gcgaccctgt ggcgactaca gggccctcaa cgcgaaaaca 2640 attcccgacc gttacccact acctttcttg caggatttta ctatgcattt gcagggaaaa 2700 accatctttt ctaaggttga cctgcacaag gcatatcacc aaatcccgat acatcccgaa 2760 gatataccga agacggcgat cacaaccccc ttcggattgt tcgagttcac cacgatgccc 2820 ttcggcctac gaaacgctgc gcagactttt cagcgactca ttcacgatgt cctccgaggt 2880 ctcgacttcg tgttcccata catagacgat atgattgtcg cctcgtcgtc agaggaagaa 2940 caccacgaac acctgcgcca gctgtttcag cgtctcgaac aacaccacct ggccatcaat 3000 ccggccaaat gcgaattcaa ccgaagtgaa attgcctttc ttggccactt ggtcaacgca 3060 gaaggaatac gtcctctccc ggaacgggta cgagcgatca gcgagttgag taaaccagcg 3120 acgataatgg agctgaagaa attcctcgcg atgatcaatt attatcgacg cttcttgcca 3180 catgccctga cgacacaaag catcctactt gagatgacac cggggaacaa gaagaaggac 3240 aagacaccgc taaagtggac gcccgaatct agcgaagcat tcgaccgatg taaagagcag 3300 ctacaacaag ccgcgctttt agctcaccct gccttaaacg cagagttgtc actctggaca 3360 gacgcttccg acttcgccgc tggagccgtt ctccaccaac gtatcgacgg ccaactccag 
3420 cccttaggat ttttctccaa aaagtttgag aaggcacagc tcaactactc aacatacgac 3480 cgcgagttga ccgctattta tctggctgta cgacactttc ggtatcagtt agagggtcgt 3540 gaattctgca tctacacgga ccacaaacct ttaactttcg cttttcgcca gactcttgac 3600 agctcctcgc cgcgtcgagc cagacaactc gacttcattg gacaattctc caccgacatt 3660 cgccacgtat cgggagaaga aaacatcacc gccgacctac tctcacggat cgaaattgta 3720 aacgcatcac ctgcaatcga ttttgaacgt ctcgccgaag agcaaacaaa cgaccctgag 3780 ctcgccgata tactcagcgg taaaacacga accgaccttt tccttcaaaa aacgccgata 3840 ccaggaagca cccaatcgct atacgccgac tgccctggtg gaataatccg accgtacatc 3900 acaagatcat tccggaatca gcttctacac gccgtacacg accttagcca tcctggagcg 3960 agagccactg ccaaattaat gacagagcga tttgtttggc tggacatcaa aagggacgct 4020 caggaattcg ccagaaactg cttggcatgc caacgcgcta agataggcag acacacgaaa 4080 agcccgctcg taccgtaccc ggcaacgcaa gacaggttca gccatataaa catcgatatc 4140 atcgggccat ttccaattag caacggcaac cgatactgcc tcacgatcat agaccgatat 4200 acccgctggc cggaagctat accgattcca gacatcaccg cgactacggt tgtatcagca 4260 ctgctgtacc attggattgc acgattcgga gttccgtctc acgtgacaac cgatcaggga 4320 cgacaattcg agtccgccct gttcaaggag ttgacgcgag ctcttggaac aaaacacatt 4380 cgaacgaccg cgtatcaccc gcaggcaaac gggttgatcg agagatggca tcgcactctc 4440 aaagctgcaa tttgctgtaa agatacatcg aagtggagcg agcatctccc actcatactg 4500 ctcggattac gcactacatt caagaacgac attaatgcgt cgcctgcaga actcgtgtac 4560 ggaacgaccc tcaccattcc cgccgaattc ctcatcgaaa aaccgcaacc ggccatggtc 4620 aaccagtccg actttgccaa gacactccga gaagcgatga gcaaaattcg accttctaac 4680 accgcctggc acaccaaccg aacatcgttt gtgcactccg acttgaacaa atgctcgcat 4740 gtgtttgtgc gcaatgacac cgtccgtccc gcgttaacca caccttatca cggcccatac 4800 caagtactta ctcgaaattc aaagtctttt cagatcctac taaacgaaca cccctcgctt 4860 gtgtccgtcg accgcataaa gccagcatgt acaaccgaag gaataatctc gtcagcccca 4920 caacaaccgt cacccgacca actgctgacg agccaaggat acgctactcc ggtggcccaa 4980 ccaccgtcga cgcaacaatc gacgatgagc cagcgactca caactccgat gcccggaccg 5040 tcgacgctcg accagctgcc atcgtacaac cagtcatcgc tgccagatca gcagccaacg 5100 
gccacacagt cgccttcgac catacagctg ccgtcaacaa accaaccgcc gatgtcccgg 5160 aaccgcgcca cacgaacatc acccatgccg tcgctacaac cagccagagc aaccaccagc 5220 ttcgccccac caccgcctat cttacgcaag gatcaaaacg tatcgaccgg tgttaccaga 5280 tctcaacgaa gagtagtaat tcctctgcga taccgataac accgatctag gaggggaatc 5340 c 5341 // ID INVADER1-I_AG repbase; DNA; ANG; 4626 BP. XX AC . XX DT 13-DEC-2002 (Rel. 7.11, Created) DT 19-MAY-2005 (Rel. 10.06, Last updated, Version 2) XX DE INVADER1-I_AG, an internal portion of the INVADER1_AG Gypsy-like DE LTR retrotransposon - a consensus sequence. XX KW Gypsy; LTR Retrotransposon; Transposable Element; KW Gypsy superfamily; INVADER group; INVADER1-I_AG; INVADER1-LTR_AG; KW INVADER1_AG; endogenous retrovirus; gag; integrase; protease; KW reverse transcriptase. XX NM INVADER1-I_AG. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-4626 RA Kapitonov V.V. and Jurka J.; RT "INVADER1_AG: a family of Gypsy-like LTR retrotransposons from RT African malaria mosquito."; RL Repbase Reports 2(11), 15-14 (2002). XX DR [1] (Consensus) XX CC INVADER1_AG is a member of Gypsy-like retroviruses that belong CC to the INVADER group originally identified in Drosophila (see CC description of INVADER1_DM in drorep.ref). CC Members of this group encode one long polyprotein composed of CC the gag, protease, reverse transcriptase and integrase domains. CC A similar 1363-aa polyprotein, called INVADER1_AGp is encoded CC by INVADER1-I_AG (positions 488-4576). CC Positions of domains in INVADER1-I_AGp: CC gag: 1-350; protease: 360-540; reverse transcriptase: 550-1050; CC integrase: 1085-1363. CC INVADER1-I_AG is flanked by long terminal repeats, CC INVADER1-LTR_AG. CC Solo LTRs or proviral copies are flanked by 4-bp target site CC duplications. 
CC INVADER1_AG has been integrated into the mosquito genome during CC the CC last one million years. There is 99% identity of INVADER1-I_AG CC copies CC with the consensus sequence. XX FH Key Location/Qualifiers FT CDS 488..4576 FT /product="INVADER1-I_AGp" FT /translation="MATVKQLCEDFTKIGLVRECERLGLETTGGKVEIANR FT IVKYRATVASGDNAGPSGSGHRAEPANATDDVENYAGNQPYESCDDDSEEK FT ADLDLPEEEHGDSFVAEDDDETEEDPFQTAVRISTPKRPQRVYAFRDVEDS FT IETFGAEDGHDVRIWLAHLDSVSKSAGWNDEQKLIMLRKKMTGIARKFVSS FT LRNVQTYAILKKELIAEFAPFVRSSDVHRILANRKKETAETMREYVYEMQR FT IAAQIDLDEPSLCEYIVNGVTDDDFFKSLLYEAQTIRVLKEKLLNFEKVRM FT ARKKKTTDKEENKRVLSSSSRVDKRAEQRCYNCGNKGHQARACAQTQGGPK FT CFSCREYGHKASECARNKSVVPAKINVTEESVGMVDVVLNKTSVKALFDSG FT SNQNLVTIGCYKRIEGSPLIDTSMWFQGFGGMRTKAIGMFTVDVTVDDNVF FT SGVRFFVVPNESMSYDAVLGRDSLNYFEVTMTTAGVKVRPYGSTDEMFSIV FT CDNEDNLDVSPRFSERVKAVISGYKPAGNVNSRVETKIILHDETPVRSSPR FT RFAPGEKAVLEKTIDEWLAAGIIRESESDFASPVTLARKKDGSLRVCVDYR FT ELNRKMVKDCFPMRNIEDQIDRLKSARVFTTLDLKNSFFHVPVEKSSQRYT FT GFVTHTGQYEFLRTPFGLVNSPASFSRFVADVFREFIKSERVLVYVDDLII FT PSLDEESNFQTLKELLNVASENGVQFNWKKSQFLKDEVEYLGYVIRGGCYR FT IAPSKLRSVQLFPEPKNVKQLQRFLGLTSYFRKFIAGYATISKPLTSLLQK FT GVEFVFGEEERSSFDELKRCLVTDPVLKIYDESAETELHTDASKYGYGAAL FT MQKSDDDKFHPVAFMSQQTSNAEKNYSAYHLEVLAVVRAVEKFRVYLLGIK FT FKIVTDCAAFGHTLKSKELSARIARWALMLEEYEYEVVHRPGSSMKHVDAL FT SRAPVMIVKSDPMIEAIRKMQQSDERAKAIIELLKTQSFEDFVMCDGLLMK FT VVKGREVIVVPSGMQSDLIRRIHEKGHLGARKIEGIIEQEFYIPNASEKIK FT QTIECCVKCILAERKRGKVDGLLYPIAKGDVPLDTYHVDHLGPMDITEKRY FT KYLFVVVDAFSKFTWIYPTKTTNSFEVIQRLTTQSEVFGNPRRIISDKGAA FT FTSNDFKRYCEDQDIERVEVTTGVPRGNGQVERVNQVIIAMLRKMSVNDPA FT KWYKHVANIQRWINSSPHQSIGVTPFEAMFGVPMRHEGDLRLGELMEEIRV FT AQHHDQREQTRAGARVSIEKAQEEQRRSYDLRARSATTYREGDLVVIKRTQ FT FGPGRKYAAEYLGPYKVTSVRPHDRYDVEKINGEGPKVTSTASSHMKPYRF FT " XX SQ Sequence 4626 BP; 1323 A; 804 C; 1339 G; 1159 T; 1 other; tgggggctca accgggatac ttttttccca aacgagtgag gcttttgcga tgaaagcaat 60 cggtgcgata ttgtgtgatg tttcgtgagg ctattttgtg attaaatagc aaaaaactgt 120 gaggcttttt tataagcaaa tagtgagacg ttttgtgaaa 
gagaagaaca gtgaggctta 180 tgtagcaacg cgtgcggatc agtgaggctt tcgtagcaat ccgtgcgaat cagtgaggct 240 tgtgaagcaa ttagtgaggc tttattttta gcaaccagtg agaatccgtg agtgatagcg 300 aggcttttcg aagcacaccg agtgaagaaa ttaagcgagg cttcacagca atctgtgtgt 360 gaaaattctg tgtgaatttt acgggtcaac tagtgtgaca ttttgtgcgt gagtgcggga 420 agtcgcgaat tgcctggtag tgtgtgtcgg ctttgtgagt gaaagtaccg cgagagagag 480 agatagaatg gcaacggtca agcagctttg tgaagatttc acgaagattg ggttggtgcg 540 cgagtgtgag aggcttggcc tagaaaccac tggaggaaaa gtggaaatcg cgaatcgaat 600 tgtgaaatac cgagcgacgg ttgcgtcagg ggacaacgcc ggtccgtccg ggagtggaca 660 cagagcagaa ccagctaacg ccacagacga cgtcgaaaac tacgcgggta accaaccgta 720 cgagagctgc gatgatgatt ccgaggaaaa agcggatctg gatcttccgg aagaagaaca 780 cggtgatagc tttgttgccg aggatgacga tgaaacggaa gaagaccctt ttcaaactgc 840 tgtgcgaatc tcgacgccta aacgaccgca acgcgtttac gcatttcgag atgtggaaga 900 tagcattgag acgtttggag cagaggatgg gcacgatgtg cgtatttggc tggcacacct 960 cgattccgta tcaaagtcag caggatggaa tgacgaacaa aagctaatca tgttacgtaa 1020 aaagatgacg ggaatcgcaa gaaagttcgt gtcgtctttg cgtaatgtgc aaacttatgc 1080 gatattgaaa aaagagctga tcgcggaatt tgctccattt gtgaggtcga gtgatgtgca 1140 tcggattctc gcgaatcgga agaaggaaac ggccgagacg atgcgagagt acgtttacga 1200 aatgcaacga attgctgccc aaatcgattt ggacgaacca agcttgtgtg agtacatcgt 1260 taacggtgtg accgacgatg attttttcaa atcattgctg tacgaggcgc aaacaattcg 1320 agtattaaaa gaaaagttgc tcaactttga aaaagtgcgc atggctcgaa agaagaaaac 1380 gacggataaa gaagaaaaca aacgagtttt gtcatccagt agccgcgttg acaaacgggc 1440 ggagcagcga tgctacaatt gcggaaacaa aggacaccaa gctcgcgcgt gcgcgcagac 1500 acagggtggt ccgaaatgtt tctcgtgtcg tgagtacggt cataaggcga gcgagtgtgc 1560 gcggaacaaa agcgtcgttc ctgcgaaaat caacgtgacg gaagaatcgg tgggaatggt 1620 tgatgtcgtg ttgaacaaaa catcggtcaa ggcattgttt gacagtggaa gcaaccaaaa 1680 cttggtgaca ataggttgtt acaaaagaat cgagggatca ccgctgatcg atacttcgat 1740 gtggttccag ggctttggtg gcatgagaac aaaggcgatc ggcatgttca cggtggacgt 1800 tacggtggat gataacgttt ttagtggtgt gcgatttttt gtggtgccaa atgaaagcat 
1860 gtcttacgat gcagtattgg gcagagattc cttgaactat tttgaagtta cgatgacaac 1920 ggcgggtgtc aaagtcaggc catatggttc aacggatgaa atgttttcta ttgtgtgtga 1980 caatgaagac aatttggatg tgtctcctcg attttcggaa agagtaaagg cggttatttc 2040 ggggtacaaa cctgcgggaa acgtgaatag tcgtgttgag acgaaaatta ttttgcatga 2100 cgagacgcct gtgcgttcgt cgccaaggcg ttttgctccg ggtgaaaagg cggtgctgga 2160 gaaaacaatc gacgagtggt tagccgcggg aataattcga gaaagtgaga gtgattttgc 2220 gagtccggta acgttagcga gaaaaaagga cggttcctta cgcgtttgtg ttgattatcg 2280 cgaacttaat cgaaaaatgg ttaaggattg ttttcccatg aggaacatag aagatcaaat 2340 cgatcgcttg aagtcagcca gagtttttac cacacttgac ttgaaaaatt cgttttttca 2400 tgttcctgtg gaaaagtcga gccagcggta cacaggcttt gttacccaca caggccagta 2460 cgagtttctt agaacgcctt tcgggttggt caatagtcca gcgagtttca gccggtttgt 2520 agcggatgtg tttcgggaat tcatcaagag tgagcgtgtg ttggtgtatg tggatgattt 2580 aataattcct tcattagatg aggaaagtaa ttttcaaacg ttgaaggaat tgttaaatgt 2640 cgcgagtgag aacggtgtgc agttcaattg gaaaaaatcg caatttttaa aggatgaagt 2700 ggagtatctc gggtatgtga ttcgcggcgg gtgttatcgc atagcgccga gtaagttgcg 2760 atcggttcag ctttttccgg aaccgaaaaa tgtgaagcag ctgcaaagat ttttgggact 2820 tacgagttac ttccgaaaat ttattgctgg ttacgcgaca atttcgaagc ctttgacaag 2880 tttgcttcag aaaggtgttg agtttgtgtt tggtgaagag gagcgttcga gttttgatga 2940 gttgaaacgg tgtttggtga ccgatccggt gttaaagatc tacgacgaaa gtgccgaaac 3000 cgagctccat acggacgcgt caaagtacgg ttatggtgct gcgcttatgc agaagagcga 3060 cgacgacaag tttcatcctg ttgccttcat gagtcaacaa acatcaaacg cggagaagaa 3120 ttatagtgcg tatcatttgg aagtgctagc ggtggttcgc gctgttgaga agtttcgtgt 3180 gtatctctta ggcatcaagt ttaagatcgt tacagattgt gcagcgtttg ggcatacttt 3240 aaaatcgaaa gaactgtcgg ctagaatcgc gagatgggct ttgatgctcg aagagtatga 3300 atatgaagtg gtgcataggc caggttcatc gatgaagcat gtggatgcgt tgagcagggc 3360 accggtgatg attgtgaaaa gcgaccctat gatagaagcg atcagaaaaa tgcaacaaag 3420 tgacgagcgt gcgaaggcaa ttattgaatt gttaaaaacr caatcttttg aagattttgt 3480 catgtgcgat gggctgctga tgaaagtagt gaaaggtagg gaagtgattg tggtaccatc 3540 
ggggatgcaa agcgatttga tacgtaggat acacgaaaag ggtcacttag gagctcgtaa 3600 gatagagggt attatcgaac aggagtttta cattccaaac gcgagtgaga aaataaaaca 3660 aacgattgag tgttgtgtga aatgtatcct cgcagagcgt aaaaggggaa aagttgacgg 3720 tttattatac ccaatcgcga aaggtgacgt tccgttagac acgtatcacg tagaccattt 3780 gggtccaatg gacattacag agaaaaggta taaatatttg tttgtagtag ttgatgcgtt 3840 tagtaagttt acttggatat atcctactaa aacgacgaat tcatttgaag taattcagcg 3900 attaacgaca cagagcgagg tatttggtaa tccaaggcgt attataagcg ataaaggggc 3960 tgcgtttacg tcaaacgatt ttaagcggta ttgtgaggat caggatatcg agcgtgtgga 4020 agttacgaca ggtgttccgc gcgggaacgg gcaggtagag agggtaaatc aagtgattat 4080 tgctatgttg cgaaagatga gtgtaaacga tcccgcaaag tggtataagc acgttgccaa 4140 tattcagcgg tggattaatt ctagcccaca tcagagcatc ggtgttaccc cttttgaagc 4200 gatgtttggg gtaccgatga gacacgaagg agatttacga ctaggtgagc tgatggaaga 4260 aattcgagtg gcccagcatc acgatcaacg agagcagact cgagctggtg ccagggtttc 4320 tatcgagaaa gcccaagaag aacaacggag atcgtacgac ttacgagcgc ggtcggctac 4380 aacttaccgc gaaggcgacc tggtggtgat taagcggacg cagttcgggc ctggaaggaa 4440 gtacgcggct gagtaccttg gaccatataa ggtaaccagt gttcgtcctc atgatcgcta 4500 tgacgtggaa aagatcaatg gcgaagggcc aaaagtgacg tcgacggctt cgtcacatat 4560 gaaaccctat cggttttaat gcggtgagga tccttcgggg cgaaaggatc ggtcgaggaa 4620 aggccg 4626 // ID GYPSY16-I_AG repbase; DNA; ANG; 5891 BP. XX AC . XX DT 03-OCT-2003 (Rel. 8.09, Created) DT 03-OCT-2003 (Rel. 8.09, Last updated, Version 1) XX DE GYPSY16-I_AG is an internal portion of retrotransposon GYPSY16_AG DE - a consensus sequence. XX KW Gypsy; LTR Retrotransposon; Transposable Element; 4-bp TSD gag; KW AP protease; GYPSY16-I_AG; GYPSY16-LTR_AG; Gypsy clade; RNase-H; KW integrase GYPSY16_AG; mdg1 lineage; reverse transcriptase. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-5891 RA Tubio M.J., Costas C.J. 
and Naveira F.H.; RT "GYPSY16_AG, a member of the mdg1 lineage of the Ty3/gypsy group RT of LTR retrotransposons."; RL Repbase Reports 3(9), 173-173 (2003). XX DR [1] (Consensus) XX CC GYPSY16_AG is a family of Gypsy-like LTR retrotransposons CC that, according to the amino acid sequence of its ORF2, is CC phylogenetically grouped with Drosophila representatives CC of the mdg1 lineage. CC GYPSY8_AG, GYPSY9_AG, GYPSY10_AG, GYPSY11_AG, GYPSY12_AG, CC GYPSY13_AG, CC GYPSY14_AG, GYPSY15_AG, and GYPSY17_AG are other CC members of this same lineage in Anopheles gambiae. CC The GYPSY16-I_AG consensus was reconstructed after multiple CC alignment of 13 copies. CC The consensus encodes the 397-aa GYPSY16_AG1p gag-like protein CC (pos. 967-2157) and the 1231-aa GYPSY16_AG2p (pos. 2133-5825). CC The sequence of the LTRs flanking GYPSY16-I_AG is deposited as CC GYPSY16-LTR_AG. XX FH Key Location/Qualifiers FT CDS 967..2157 FT /product="GYPSY16_AG1p" FT /translation="LSLLFNGSNLMSKLKDKLFQLRLIQHKLEYLPKNFRV FT CTVSKYKILARELLTSIQSDLIELGEAISEEYLNKINSETNECFSAIEFRI FT NNYQMTNDSQTEKMAHFDIKIATSLVQTYDGSKENLDIFIDSVNLLKEITP FT NDQKVMLVKFIKTRITGKARIGLPRDLDNIDEILENIKVRCEEKTNPEIVI FT NQLKTIQARQTTKICEEVEILTSKLKGLYLEQKIPEEVANKMAVKVGVDTL FT KEKISNSETKILLKVGHYPTITAATQKVLENEIENNNSNQVLAFNRNQNFS FT TNNFKRFPNSKTNNYNRYNSNPNFQNRTKRRYYESPNSNRTNPNTYTPNRN FT RDRHFYRNPRQYNQFGRQNTQPNNFSNQQARRIFTTEAEEEEEQNNFLGQA FT RQE" FT CDS 2133..5825 FT /product="GYPSY16_AG2p" FT /translation="FFRPSPAGIEQLPVMTLNLSANNSTTIKVEKAANKYS FT SLFIDTGAAVSLFKIGKIRSQHEIDSNNKVRLTGITNNSLITYGTISSNLI FT FNNATVSHKFHLIPDIIEIEADGILGRDFFTKYKCKIDYEYYLLNFDINGI FT SISHPIEDNSLLIPARSEIIHKIDLKNLKKDSVVHAQEITTGIFCANTIVS FT KNNPCIKFINTTEKDTIINTKFFKPTIEPINNFEIYKVAETKTNMHRVKEI FT LSKINLKNYPDIISQEISKLVNKFSDLFCLNEDPITTNNFYKQTITPKDNI FT PTYIPNYKQIHSQSKEIDQQVNKMLENNIIENSVSPYNSPILLVPKKTGND FT KKWRLVVDFRQLNKKILPDKFPLPRIDTILDQLGNAKYFSTLDLMSGFHQI FT PLEKESRKYTAFSTPKGHYQFTRLPFGLNISPNSFQRMMAIAMAGLTPEIA FT FIYIDDIIVIGSSMKNHIENLTQVFNRLRYYNLKLNPEKCKFFKSDVTYLG FT 
HKITDQGIYPDETKFETIRKFPIPTNADEVRRFVAFCNYYRKFIQNFAKIA FT KPLNDLLKKGTLFEWKKEQQTAFDSLKEHLLSPTILKYPDFSKDFIITTDA FT SNFACGAVLSQITNGQDFPIAFASKTFTKGEKNKSIIEKELVAIHWAVKHF FT KPYVYGRKFTIRTDHRPLVYLFGMKNPTSKLTRIRLDLEEFDFTIQFVAGK FT SNVAADALSRIVYNSDELKELIPKTKEEDIKSIKNNSIMVVNTRAMVKAQK FT KVKDTPTAGRNKEEVTNQVFWKTDRPSEVNKILKIKSEVKRNGIEIKVCSH FT NYNKQLGKRVIPLNNSKGQNGSQEIELALLDICQIAKKFNRDKLAISVEDK FT LFEYYSQLTIKEIADTAIFGYEIILFDPPKWITSEKEKLKIMEEYHAIPTG FT GHIGQYRLYLKIREKFKWNNMKQDIISYVQKCEICKVSKVNKHTKEKSIIT FT STPAKPFTIISIDTVGPLPKTINNNRYAITIQCDLSKYIVIIPIKNKEANT FT IAKALVEHFILTYGSFLELRSDQGLEYNNEVLSKIAQILKIKQTFSTPYHP FT QTIGALERNHRSLNEYLRAFTNEHHDDWDDWIKFYEFVYNTTPHTETKYTP FT FELVFGRTANLPQEIYKHKIDPVYNIEQYYNEMKFKLQKSNEIARRNIIIE FT KEKREKELNQHINPIDVSIGDLVYLKNENRKKLDPLYLGPFIITNIKDPNC FT TIKNKHTQKTTTVHKNRLIKN" XX SQ Sequence 5891 BP; 2534 A; 943 C; 898 G; 1516 T; 0 other; tggcgaccgt gacacagagc caaagcaaag gaatacgaac aaaacactga agtaagcata 60 acgaaaggct gcaggttata aaaatcccgt tggacacggt atgtaaacac tatggtaaaa 120 tttatatttt gaaaacaaaa ataccatagc gcgaaaagcg gacgaaggta tgtgtaaaaa 180 gtatcgcaac aatcgcgagt gcaaatccgt cgtaaacact attatgcaca atacatatac 240 actaagctca caaacactaa agtattcggc atagatgtac gtttagatca cattacactt 300 ataaacacaa aagccgatac gcgtcgtaaa gtgagatgca ataatgtgca gtgacaccaa 360 tacaaattag ttccgtgata caacaactgt gaaaaaactc acccgaaact gcatgaatat 420 cccgaaacat caccaataca gtgctaaaaa tgaaatcaaa tgtgaattaa aagcaacaac 480 aattacacat gcagtgacac ataaataccg aaccgaaata ctaatcagaa tagtgcacaa 540 gaaagtgtga aactaagtgc gctaagtgaa accgaactga acaatcccat aatcgcagca 600 gcagcagcac ttccacaacg aacaacacga ggaaagtgtt aagaacgata gaacgcgagc 660 aacaatacgc gaggaagcag tcaacgacgc catcgcagta tacacatcga gcagctgtta 720 aactacacat cgtgcatgcc gtaggacacg caaacgataa gtttcgttaa gtcaggtatt 780 gtaaaattaa cactatggga tatatatgaa tattcagtaa aaaaaatata tatatataaa 840 aaaaaattaa taaaaaaaaa gtaaataaaa aaaaataata ataataaaaa aaataggaac 900 aattatagtt tttgcgaaat aattttattg gacttaaacg ttaattaatt atgttattta 960 ttatgattgt cgttattatt 
taatggatcg aatttaatga gtaagttaaa ggataaatta 1020 ttccagttaa ggttaataca acataaatta gagtatttac caaagaattt tagagtttgt 1080 actgtaagta aatataaaat tttggctaga gaactattaa catccataca atcggatttg 1140 atagagctcg gagaagccat aagtgaagag tatttaaaca aaattaacag tgaaaccaac 1200 gaatgctttt cagcgattga gtttcgcata aacaattacc aaatgacaaa cgattcccaa 1260 acagaaaaaa tggcgcattt cgatattaaa attgcaacat ctcttgttca aacatatgat 1320 ggatcaaaag agaatcttga cattttcata gattctgtaa acttacttaa agagattact 1380 ccaaatgatc aaaaggtaat gctggtaaaa tttatcaaaa cacgaataac agggaaagcc 1440 agaataggat taccacgaga tctggacaac atcgatgaaa ttctagaaaa tataaaagta 1500 agatgtgaag aaaaaacaaa tcccgaaata gtcatcaacc aattaaaaac tatccaggca 1560 agacaaacca ctaaaatttg tgaagaggtc gaaatcttaa ctagcaaatt aaaaggactt 1620 tatctagaac aaaaaattcc cgaagaagta gcaaacaaaa tggctgttaa agttggagtt 1680 gacacactta aagaaaaaat ttccaactct gaaacaaaaa ttctacttaa agtcggacat 1740 tacccaacaa ttacagcagc tacacaaaag gtattagaaa atgaaatcga aaacaataat 1800 tcaaatcaag ttttagcctt taatagaaac caaaattttt ccacaaataa tttcaaaaga 1860 tttcctaaca gcaaaacaaa taattataat agatataact cgaacccaaa cttccaaaac 1920 agaactaaga ggaggtatta tgaaagtcca aattcaaata gaacgaaccc aaatacatat 1980 accccaaata gaaacagaga tagacatttc tataggaacc ctagacaata taatcaattt 2040 ggaagacaaa atacacagcc caataacttc tcaaatcaac aagctcgtag aatatttact 2100 actgaagctg aagaagaaga ggaacagaat aattttttag gccaagcccg gcaggaatag 2160 aacagctgcc ggtaatgaca ttgaatttat ctgccaataa ttcaactaca attaaagtag 2220 aaaaagctgc aaataagtat agttcattat ttatagacac aggagcagca gtttcattat 2280 ttaaaatagg aaaaattaga tctcagcatg agatagatag caataacaaa gttagactta 2340 caggaattac taataactct ttaattacct atggtacaat tagttcaaat ttaatattta 2400 acaacgcaac agttagtcat aaatttcact tgattccaga tattatagaa atagaagccg 2460 atggaatttt aggaagagat ttctttacaa aatataaatg taagatagat tatgaatatt 2520 atttgttaaa ttttgatatc aatggaatct caatatctca tcctatagaa gacaattcac 2580 tcttaattcc agcaagaagt gaaattattc ataaaataga tttaaagaat ttaaaaaaag 2640 attcagtagt tcatgcacag gaaataacaa 
caggaatatt ttgcgctaat acaatagttt 2700 ccaagaataa tccatgtata aaatttatta atactactga aaaagatact attattaaca 2760 cgaaattttt caaacctaca atagagccaa ttaataactt tgaaatttat aaagtagcag 2820 aaactaaaac aaatatgcat agagttaaag aaatattaag caaaatcaat ttaaaaaatt 2880 atccagatat tattagtcaa gaaataagta aactagtaaa taaattttct gatttatttt 2940 gtttaaatga agacccaatt acaactaata atttttataa gcaaacaatt actcctaaag 3000 acaatattcc tacatatata cccaactata aacaaattca ttcgcaaagc aaagaaatag 3060 atcagcaagt aaataagatg ttagaaaata atataataga aaactccgtt tccccttaca 3120 attcaccaat tctgcttgtc cccaagaaaa caggaaacga taagaaatgg agattagtag 3180 tcgattttag gcaattgaac aaaaagatat taccagataa gtttccactt ccgagaatag 3240 acacaatatt agaccaatta ggaaatgcaa aatatttcag cactctagat ctgatgtcag 3300 gttttcatca gatccctttg gaaaaagaat cgcgaaaata tacagcattt tctacaccaa 3360 aaggacatta tcaattcaca agactgcctt ttggactaaa cataagtcct aacagttttc 3420 agcgtatgat ggcaatagca atggctggcc tcacacctga aatagcattc atttatatcg 3480 atgatattat agtaatagga agctcaatga aaaatcatat agaaaatctt acacaagtat 3540 ttaatagatt acgatactac aatcttaaat tgaacccaga gaagtgcaaa ttctttaaat 3600 cagatgttac atatcttgga cacaaaataa cggaccaggg aatataccct gatgaaacca 3660 aattcgaaac aataagaaag tttcctatac caactaacgc agacgaagtt agaagattcg 3720 tagctttctg taattactac agaaaattta tacaaaactt tgcaaaaatt gcaaagccac 3780 taaatgattt attaaagaaa ggcacacttt ttgaatggaa aaaggaacaa caaacagcat 3840 ttgattcatt aaaggagcat ctgctatctc caaccattct taaataccca gactttagca 3900 aagattttat tataactaca gacgcatcaa actttgcgtg cggagcagtt ctctcccaaa 3960 tcacaaatgg gcaagatttt cctattgcat tcgcaagtaa aacttttact aaaggcgaaa 4020 agaataagag tataatagaa aaagaattag tagctataca ctgggcagta aaacatttta 4080 aaccatatgt ttatggacgg aaatttacaa tcagaacaga ccatagacca ctggtttatc 4140 tctttggtat gaaaaatcca acgtcaaaat taaccagaat tagactagat ctggaagaat 4200 ttgatttcac aatacaattc gtagctggaa aaagtaatgt tgcagcagac gcattatcac 4260 gaatcgtgta caattcagat gagttgaaag aattgatccc aaaaactaaa gaggaggata 4320 ttaaaagcat taaaaacaat tcaattatgg tagttaacac 
aagagcaatg gtaaaggcgc 4380 aaaagaaagt aaaggataca ccaacagctg gaagaaataa agaagaagta actaatcaag 4440 tattttggaa gacagataga ccgtcagaag ttaataagat tttgaaaatt aagtcagaag 4500 ttaaaagaaa tggaatagaa atcaaagtat gtagtcacaa ctataataag caacttggaa 4560 aaagggttat ccctcttaac aattctaaag gccaaaatgg aagtcaagaa atagaacttg 4620 ctcttctaga tatatgccaa attgctaaga aatttaaccg agacaagtta gcaatatcag 4680 ttgaggacaa attattcgaa tattactctc aattaaccat aaaggaaata gcagacacag 4740 caattttcgg ttacgagata attttgtttg atccacctaa atggataaca agtgagaaag 4800 aaaagttgaa aataatggaa gaatatcatg ccatacccac aggaggtcat atagggcaat 4860 acagattata tttaaaaatt cgtgaaaaat ttaaatggaa taatatgaag caagatataa 4920 taagttacgt gcagaaatgt gaaatatgca aagtcagcaa agtaaataag cacacaaaag 4980 aaaaatctat aatcacatca actccagcaa aaccatttac catcatatca attgacacag 5040 ttggtccatt acccaaaaca ataaataaca acagatatgc aattacgatt caatgcgacc 5100 tatcaaaata tatagtgatc attccaatta aaaacaaaga agcaaacacg attgcgaaag 5160 cattggttga acattttatc ttgacttatg gaagcttttt ggagcttaga tcagaccaag 5220 gattggaata taacaatgag gtgttaagta aaatagccca aattttaaaa ataaaacaaa 5280 cattttcaac gccatatcat ccacaaacaa ttggagctct agaacgtaat catagaagct 5340 taaatgaata ccttagagca tttactaatg aacatcacga tgattgggat gattggataa 5400 aattttatga gtttgtttat aacacaacac cacatacaga aacaaaatat actccgtttg 5460 aattagtttt cggtagaaca gcaaatctac cacaagaaat ttataaacat aaaatcgatc 5520 cagtttataa tatagaacaa tattacaatg aaatgaaatt taaactacaa aagtccaatg 5580 aaattgcacg taggaacata attatagaaa aagaaaaaag agaaaaagaa ttaaatcagc 5640 acataaatcc catagatgta agcataggag acttggtata cttaaaaaat gaaaatagaa 5700 aaaaattaga tccattgtac ttaggtccat ttataattac gaatattaaa gatccaaatt 5760 gtacaatcaa gaataagcat acgcaaaaga ctacaacagt acataagaac agattaatca 5820 agaactaaat gaataacgca attctttcgc tcattcactc aatcttacgt tattcacaaa 5880 aacgggggag g 5891 // ID GYPSY40-I_AG repbase; DNA; ANG; 4904 BP. XX AC . XX DT 16-APR-2004 (Rel. 9.03, Created) DT 16-APR-2004 (Rel. 
9.03, Last updated, Version 1) XX DE GYPSY40-I_AG is an internal portion of retrotransposon GYPSY40_AG DE - a consensus sequence. XX KW Gypsy; LTR Retrotransposon; Transposable Element; 4-bp TSD gag; KW AP protease; GYPSY lineage; GYPSY40-I_AG; GYPSY40-LTR_AG; KW Gypsy clade; RNase-H; reverse transcriptase; KW integrase GYPSY40_AG. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-4904 RA Tubio M.J., Costas C.J. and Naveira F.H.; RT "GYPSY40_AG, a member of the Gypsy lineage of the Ty3/gypsy group RT of LTR retrotransposons in Anopheles gambiae."; RL Repbase Reports 4(3), 72-72 (2004). XX DR [1] (Consensus) XX CC GYPSY40_AG is a family of Gypsy-like LTR retrotransposons CC that, according to the amino acid sequence of its Reverse CC Transcriptase, CC RNase and Integrase is CC phylogenetically grouped with representatives of the GYPSY CC lineage of other organisms. CC GYPSY39_AG, GYPSY41_AG, GYPSY42_AG, GYPSY43_AG, GYPSY44_AG, CC GYPSY45_AG, CC GYPSY46_AG and GYPSY47_AG are other CC members of this same lineage in Anopheles gambiae. CC The GYPSY40-I_AG consensus was reconstructed after multiple CC alignment of 5-7 copies. CC The consensus encodes the 346-aa GYPSY40_AG1p gag-like CC polyprotein (pos. 222-1259) and the 1115-aa GYPSY40_AG2p CC pol-like polyprotein (pos. 1217-4561). CC The sequence of the LTRs flanking GYPSY40-I_AG is deposited as CC GYPSY40-LTR_AG. 
XX FH Key Location/Qualifiers FT CDS 222..1259 FT /product="GYPSY40_AG1p" FT /translation="MPRFQPFPKVKALSFVKKMEPADLHTFMANFSAMEQR FT LNALQLKNGELFRQLEQQTALASPGPSGGYQNDLFKVPDPLKSIPLFDGNK FT MHLASWIATAKKTLSMYQPIVSTAVYAMYEQIVINKVEGKARDSLCVNGNP FT TSFDEVIEVLNATFGDKKDMATYQTMLWSMKMDSSIHFYYKRTKEIAHDMK FT CLARKKDLYCNHWQAVNDFIDQECLAAFIKGLNKEYFGYVQAAKPEDLESA FT YSFLCKFQNLEHTNKILAPRAETKISYQKQFDPSSKNRNTNAFTNTIRKPN FT SSEKLNSSRQKHTPMEVDTTRTGVKLFTHTSEDTEQHYDNGVNFQEVFAPL FT PQR" FT CDS 1217..4561 FT /product="GYPSY40_AG2p" FT /translation="RSKFSRGLCTSASEVNAVNGMPYILLHIDNKCAKILV FT DSGSNKNFISPNVFPKSMEKKCKPISIANSNGTHYANKKVDTKMFNSDVTF FT FLFKFHNYFDGILGYESLSELQAVLDTGKHKLIFPNTSIDLSIRQLNPETH FT TVEANSVTQIKVPVSHESGNVFLPDDVRINDIVIPSGLYVAKNSNITALAS FT NFGRKMVFFFTKPVVTFKVDKIAEVNNTIQTPDTISKTEIENMIKTDNLNQ FT EEKEHLLKVLYENISIIPRNNEKLSCVTSVKHIINTVDEIPVHCKSYRYPH FT IHKAEIQKQINEMLADGIIQHSISPWTSPIWIVPKKPDSNGAKQWRIVVDY FT RKLNEKTIDDKYPIPNIEEILDKLGRSMYFTTLDLKSGFHQIEVELKDRPK FT TAFSTEKGHFEFIRMPFGLKNAPSTFQRAMNNILNELIGTCCLVYLDDIIV FT FGSSLQQHIENLQKVLHRLKQANLKIKIDKCEFLQKECEFLGHIVTQEGIK FT PNPNKIEKIVSWPLPKTVTQIKAFLGILGYYRKFIKDFAKLTKPLTKCLKK FT DSKIVHDEIFINCFRDCKQMLVMDPILKYPDFSKKFILETDASDFALGAVL FT SQRFEDSKEHPIAYASRTLNDTECRYSATEKELLAIIYAIKHFRPYIYGNK FT FEIRTDHKPLLWLRQKNDLNRKLLRWKLELEEFEFDIKYKKGTTNSNADSL FT SRIEPAINANVSESLMTQHSSNTDDNDYIPSTERPINEFRNQIILEQTNES FT STETLKQFPKYSKIIIKKPSFTTDNLVQIIQKYAAPNCLNGIYCDKNILKM FT LHEVYKKYYSRAKSLKFRWTTKMLVDVTDELEQDQIIQTYHDTNHRGVTES FT TKHLQRKYFFPQMKSKITKYINLCALCKKSKYERHPYKIKYKQTDTPKRPL FT EIVHTDIFIMKDKHYLTFCDKFSRLALAVPIKTRYTVHILKGISTFIATVG FT KPLLLIMDQECSFKSIAVQSFLNDNLIKYHYTSVAQSSSNGTVEIVHRTIR FT EIHNILSQKESTKDLSESTKINLAVATYNDSNHSETNLTPNELFRGFRNDQ FT PISSILDEHIQAKEKLYAVVHHKMLEQKEKRIAKLNEKREEPIGLSEGETV FT FLRKKTI" XX SQ Sequence 4904 BP; 1810 A; 919 C; 870 G; 1305 T; 0 other; ggcgcagccg ttcgggcctt tcctaagtgt gttttagttt accgagtgat aagtgcagaa 60 acctgcgatt tactttgtgg tatcatccgg tgtaaaccat ctcccgaccc gtggacgact 120 tgcggtgata ttcaggtaaa ctaaaacacc ctggaggatt tcaatccgga 
gaattggcac 180 cgtacctact tccactgaaa taagcattgt gttaaaagct tatgcctaga ttccaaccct 240 ttccgaaagt gaaagcatta tcttttgtga aaaaaatgga accagccgat ctccatacgt 300 tcatggctaa tttttctgcc atggagcagc gtttaaatgc tcttcaatta aaaaatggcg 360 aattattcag acagcttgag caacaaaccg cgttggcttc ccccggacct tcgggtggtt 420 atcaaaacga tcttttcaaa gttcctgatc ctttgaaaag cattccatta ttcgatggaa 480 ataaaatgca tttggcatca tggattgcca cagcgaaaaa aacactgagc atgtatcagc 540 cgattgtttc taccgcagtt tacgctatgt acgagcaaat agttataaat aaagtagaag 600 gcaaagctag agatagttta tgtgtgaacg gtaatcctac atcatttgat gaggttatcg 660 aagtgttgaa tgctacgttc ggtgacaaaa aagatatggc tacctatcaa acgatgctgt 720 ggtctatgaa aatggatagt tcgatccatt tttactacaa gcgtacaaaa gaaattgcac 780 acgatatgaa gtgtttggca agaaaaaaag acttgtattg taaccattgg caggctgtaa 840 atgattttat agaccaagaa tgtcttgccg cttttataaa agggctaaat aaagaatatt 900 ttggctacgt gcaagctgct aaacccgaag atttggaatc tgcatattcg ttcctttgca 960 aatttcaaaa tctggaacac acaaataaaa tactagcacc tcgtgccgaa acgaagattt 1020 cttatcagaa acagtttgac ccttcatcaa aaaatcggaa tacgaacgca ttcactaaca 1080 ccattagaaa gccgaatagt tccgagaagc taaacagcag taggcaaaag cacacaccga 1140 tggaagtcga tacgacaaga acaggtgtaa aactattcac gcacactagc gaggatactg 1200 aacagcacta tgataacgga gtaaattttc aagaggtctt tgcacctctg cctcagaggt 1260 aaacgcagta aacggaatgc cttacatatt gttacatatc gataataaat gcgcaaaaat 1320 tttggtcgat agtggatcga ataaaaattt catttcccct aacgtatttc ccaaatcgat 1380 ggaaaaaaaa tgcaaaccaa tatcaatagc aaatagtaac ggcacgcatt acgcaaataa 1440 aaaggtcgat acaaaaatgt tcaattcgga tgttacattt tttttgttca aatttcataa 1500 ttattttgac ggcattctag ggtacgaaag cctttctgaa ctccaagctg ttttggatac 1560 gggtaaacat aaacttatct ttcccaatac aagcatagac ctgagtatac gacaactcaa 1620 cccagaaacg cacacagtgg aagccaattc agtaacacaa atcaaggtac cggtttcaca 1680 cgaaagtggc aatgtgtttt tacccgatga tgttcgcatt aacgatatcg ttattccgtc 1740 aggactttat gttgcaaaaa atagcaatat tacagcactt gcaagcaatt ttggaagaaa 1800 aatggttttc ttttttacca aaccagtagt aactttcaaa gtcgataaaa ttgccgaagt 1860 
taacaatacg atacaaacac ctgatactat ttcgaaaact gaaatagaaa atatgattaa 1920 aacggataat ttaaaccaag aagaaaaaga acatctgtta aaagtgcttt atgaaaacat 1980 ttccatcatt cccaggaaca acgaaaaact atcctgcgtt acatccgtta aacacattat 2040 taacacggta gatgagatac ctgtgcattg caaatcatac agatacccac acatccacaa 2100 agctgaaatt caaaagcaga taaacgagat gcttgctgat ggcattatac agcattcgat 2160 atctccgtgg acttcaccaa tttggatagt gccgaaaaag cccgattcaa atggtgcaaa 2220 acaatggcga atcgtcgttg attatcgcaa acttaacgaa aagacgattg acgataagta 2280 tccgatcccg aatattgagg aaatcttaga taagctaggt cgcagcatgt acttcacaac 2340 actagaccta aaatctggtt ttcaccaaat cgaggtagaa ctaaaagaca gaccgaaaac 2400 tgctttcagc accgagaaag gtcatttcga attcattcgt atgccttttg gtttgaagaa 2460 cgccccttct acgtttcagc gagcgatgaa taacatacta aacgaattga tcggtacatg 2520 ttgtttagtc tatttggatg acattatcgt tttcggaagc tcactgcagc agcatatcga 2580 aaatttacag aaagttcttc acagattgaa acaggccaat ttaaaaatta aaatagataa 2640 atgcgagttt ttgcaaaagg aatgcgaatt tttgggacat attgtgacgc aggagggtat 2700 caagcctaat ccgaataaaa ttgagaaaat tgtatcgtgg cctcttccta aaactgttac 2760 acaaattaag gcatttcttg gaattttagg gtattaccgt aaatttatta aagattttgc 2820 taagttaacc aagccattaa caaaatgcct aaaaaaagat tctaagatag tacacgatga 2880 aattttcata aactgcttta gggattgtaa acaaatgtta gttatggatc ctattctcaa 2940 atatcctgat ttttctaaaa aattcattct tgaaacagac gcaagtgatt ttgcgttggg 3000 tgcagttctc tcgcaacgat tcgaagattc aaaagaacac cctatcgcgt atgcttccag 3060 aactctcaac gatacagagt gtagatattc tgccacagaa aaagaattgc tcgccattat 3120 ttacgctata aaacatttca ggccatacat ttatggcaat aagtttgaaa tccgtacaga 3180 ccataaaccc ttattatggc ttaggcagaa aaacgattta aataggaaat tattacgctg 3240 gaaattagag ctggaagaat tcgaattcga tataaaatac aaaaaaggaa ctacaaatag 3300 taacgctgac tcactctcac gaatagaacc agcaataaat gctaatgtat ctgaaagctt 3360 aatgactcaa cattcatcta acacagatga taatgattac attccaagta cagaaagacc 3420 gatcaacgaa ttcagaaatc aaattatttt agaacaaact aatgaaagta gtacagaaac 3480 gttaaaacaa tttccaaaat attctaaaat aataattaag aaaccctctt ttactacaga 3540 taatttagtt 
cagataattc aaaaatatgc cgcacccaat tgtttaaatg gaatatattg 3600 tgacaagaac attttgaaaa tgctgcacga agtatacaaa aaatattatt ctagggctaa 3660 atcattgaaa tttcgctgga ctacaaaaat gcttgtagat gtaacagatg aactagagca 3720 agatcaaata attcaaacgt atcacgatac caaccataga ggagtaacag aaagtacaaa 3780 acatcttcag agaaagtatt ttttccccca aatgaaatcc aagattacta aatacataaa 3840 cctgtgcgcg ttgtgtaaaa aatctaaata tgaaagacac ccgtataaga tcaaatacaa 3900 acaaacggat acacctaaaa ggcctttaga aatcgttcat accgacatat ttatcatgaa 3960 ggataaacat tacctcacat tttgtgacaa gttctcacgg ttagctctgg cagtccctat 4020 aaaaacacga tatactgtgc atatactgaa aggaatatcc acatttatcg ccacagtagg 4080 aaaacctttg ctactaataa tggaccaaga atgcagcttc aaatcaattg cagttcaatc 4140 gtttttgaac gacaatttga taaaatacca ctacacctct gtagcgcaat catcatctaa 4200 tggcacggtt gaaattgtgc atagaacaat aagagaaata cacaacatat tatctcagaa 4260 ggaaagcaca aaagacctat cggaatcaac aaaaattaat ttagcagtag ctacttataa 4320 cgattctaat cattccgaaa ccaacctcac gccgaatgaa ttgttccgtg gttttagaaa 4380 tgatcaaccg atatcatcga ttctcgatga gcacatacag gcgaaggaga aactttacgc 4440 agtagtccat cataaaatgc tagagcaaaa agaaaagagg attgctaagt tgaacgaaaa 4500 aagagaagaa cctataggcc tttcagaggg agaaaccgtg tttctgagga aaaaaacaat 4560 ttgaaacatc aagagcgtta taaagaaatc caggtattag aagataaaga cgttaacttt 4620 atagattatt gcggtagaaa atatcataaa gaaaaactca aaagaagatg tatgaactaa 4680 aatattttgt aaaaataacc attagtaata cgcagcaata tagttagggt tagataggtc 4740 tccaatatta tgttatagtc ctaagatttg atttataacg ttttaagaaa aaaagatgag 4800 aaagaaccac aagcgggacg aagatagcaa acctacatgc cgaactaatt gcatacggtg 4860 aaactccaaa gtgccgagga cgacactttt ctaccccccc gaga 4904 // ID GYPSY53-LTR_AG repbase; DNA; ANG; 300 BP. XX AC . XX DT 16-APR-2004 (Rel. 9.03, Created) DT 16-APR-2004 (Rel. 9.03, Last updated, Version 1) XX DE GYPSY53-LTR_AG is an LTR of retrotransposon GYPSY53_AG - a DE consensus sequence. XX KW Gypsy; LTR Retrotransposon; Transposable Element; KW 4-bp TSD GYPSY53_AG; CsRn1 lineage; GYPSY53-I_AG; GYPSY53-LTR_AG; KW Gypsy clade. 
XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-300 RA Tubio M.J., Costas C.J. and Naveira F.H.; RT "GYPSY53_AG, a member of the CsRn1 lineage of the Ty3/gypsy group RT of LTR retrotransposons in Anopheles gambiae."; RL Repbase Reports 4(3), 99-99 (2004). XX DR [1] (Consensus) XX CC GYPSY53-LTR is a long terminal repeat of GYPSY53_AG (its internal CC portion is deposited as GYPSY53-I_AG). XX SQ Sequence 300 BP; 92 A; 75 C; 41 G; 92 T; 0 other; tgtggcaacc ttgcgtatta ctgcaacgca ctgtaaagtg ccgaaagctt ttaacctttg 60 taaggttccg tacacacctt tacgcttccg tgtctattgc gaaactaata cgcgttcctc 120 tattaacact aatactacaa acttatacta cctacgctat gtaaggatca ttacactact 180 accacgaaat cttccagttg aatttgtaca gaataaaatt aagttagtct tatctcgcaa 240 tctatctcga tcacatctct ttccccgtta aattcctaca aatcgaaaga acgcgctaaa 300 // ID GYPSY28-I_AG repbase; DNA; ANG; 4216 BP. XX AC . XX DT 05-FEB-2004 (Rel. 9.01, Created) DT 05-FEB-2004 (Rel. 9.01, Last updated, Version 1) XX DE GYPSY28-I_AG is an internal portion of retrotransposon GYPSY28_AG DE - a consensus sequence. XX KW Gypsy; LTR Retrotransposon; Transposable Element; 5-bp TSD; KW AP protease; GYPSY28-I_AG; GYPSY28-LTR_AG; GYPSY28_AG; KW Gypsy clade; RNase-H; gag; integrase; mag lineage; KW reverse transcriptase. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-4216 RA Tubio M.J., Costas C.J. and Naveira F.H.; RT "GYPSY28_AG, a member of the mag lineage of the Ty3/gypsy group RT of LTR retrotransposons in Anopheles gambiae."; RL Repbase Reports 4(1), 21-21 (2004). 
XX DR [1] (Consensus) XX CC GYPSY28_AG is a family of Gypsy-like LTR retrotransposons CC that, according to the amino acid sequence of its reverse CC transcriptase, is CC phylogenetically grouped with representatives of the mag CC lineage of other organisms. CC GYPSY18_AG, GYPSY19_AG, GYPSY20_AG, GYPSY21_AG, GYPSY22_AG, CC GYPSY23_AG, CC GYPSY24_AG, GYPSY25_AG, GYPSY26_AG and GYPSY27_AG are other CC members of this same lineage in Anopheles gambiae. CC The GYPSY28-I_AG consensus was reconstructed after multiple CC alignment of 4 copies. CC The consensus encodes the 1387-aa GYPSY28_AGp gag-pol like CC protein CC (pos. 16-4176). CC The sequence of the LTRs flanking GYPSY28-I is deposited as CC GYPSY28-LTR_AG. XX FH Key Location/Qualifiers FT CDS 16..4176 FT /product="GYPSY28_AGp" FT /translation="MVDQASNAGNVPSTSSSIIPNAAAAAGTVPPPNFAME FT PFDKRKSKWMRWVDRLETAFDIYRVTEERDKKNYLLHYMGTETYDVICDKV FT APAAPRDCTFQQIVDTLKDYFSPQPLEIAENYKFNSRRQGDKDAVTADESV FT DEYLVALRRLAASCNFGDYLEKALRNQLVFGIKRGDIRDRLLEKRTLTLQE FT ARDIAVSMELSRKGRTEIEGCSAKQEMHAVYQSKGKGWKPNKGVDKPSTST FT EKSNCYRCGDKSHLANACKYKNIICSFCKTKGHLAKVCFKKKKNEAHSTGS FT AAAQTNYVEQQGNDATIESTNIRDVCTVGSSTHCKKLWLNMKVNKKTIKFE FT IDTGSPVTIISEEDQKRLFPDAQLRACTTNLVSYCNTPIDVCGILDVDVQL FT ARHTLKLPLYVAKTTKHPLVGREWLSEIPLDWNKVVADSKSVNKIESDSMC FT SVADCNALLERYPRVFEASIGRISNVQANLRLKENARPVFIKARKLPFNMI FT KVVENELDKLVQEGVLVKVDSSDWATPIVPIKKSQNRVRICGDYKQTVNPN FT LVVDRHPLPSVDELFASLAGGKKFSKIDLVQAYLQLEVAPADREILTLSTH FT RGLYRPNRLMYGIASAPAIWQRQIESILQGIEGVSVFLDDIKITGETDEIH FT LFRLEEVLRRLNEREIRVNKEKCEFFVDQIEYCGYVIDAKGIHKIQKKVDA FT IQEMPKPRNKDEVRSFVGLVNYYGRFLQNLSTILYPLNNLLKTDVPFQWNK FT QCEDSFKKVKEQMQSDNCLVHYSTELPLLLATDASPYGVGAVLSHIYPDGS FT ERPIQFASQTLNRVQQKYMQVDKEAYAIVFGVRKFFQYLYGRTFTLLTDNQ FT AISKIFGEHKGLPVMSALRMQHYATYLQSFDYKIRFRKSTDHANADAMSRI FT PLHITDPDNEIEESDSIELSQIETLPLTAVELEQAVAKDQQVQKLIQGIKH FT GRVVEAKDRFGVDQQEFAIQKGCLLRGIRVYVPEILRRKVLDELHSAHFGI FT TRTKSLARGYCWWPGMDQDIERMVANCTECQLIRAEPAKMKLHCWETPSAP FT FQRVHVDFAGPYKDLYFFIFVDAYSKWPMVKVCKSITAEQTVNMCRELFST 
FT FGIPSVLVSDHGVQFTSTLFQQFLKMNGVVHKMGAPYHPATNGQAERYVQT FT IKNKLKALKCSTSKINLELCNILLTYRKTVHPATGKSPSMLVFGRQIRSRL FT DLLIPTQESSRKAEIPVRSFQIGARVRVRDFLSHDKWKFGKISARVGKLRY FT EVKLDDGRVWERHIDHIAEVGADLRGTSAVNQESDLREQGDTTYFEEVLPS FT TSNANTAVTTSDSEITVENTPGSSDRDADRPVERNNDPGMERDLTPAPVRP FT INLEEESVLRRSTRARKPPQRLNL" XX SQ Sequence 4216 BP; 1331 A; 850 C; 1015 G; 1020 T; 0 other; gttttggcga cgaggatggt tgatcaagca agtaacgcgg gcaacgtgcc atctacgagt 60 tcttcaatta taccgaacgc agctgccgct gctggtacag ttccacctcc gaattttgct 120 atggaaccat ttgacaaacg caaatcaaaa tggatgcggt gggtggatcg actggaaact 180 gccttcgaca tctacagagt gaccgaagag cgggataaga agaactactt gctacattac 240 atgggaaccg aaacttacga cgtcatttgc gacaaggttg ccccggctgc acctcgtgat 300 tgtaccttcc agcaaattgt ggatacgctg aaagattatt tcagcccgca acccttagaa 360 attgcagaaa actacaagtt taacagtcgc cgccagggtg acaaggatgc agttacagcc 420 gatgagtcgg tagatgaata tttggtagcg ctacgcaggt tggccgcatc ttgtaacttc 480 ggtgattact tagagaaggc gttgcgtaac cagctggttt tcgggataaa aagaggagac 540 attcgcgatc gacttctaga aaagcgaacg ttgacgttgc aggaggctcg ggacattgca 600 gtcagcatgg aattgtcccg gaaaggtaga accgagatcg aaggttgctc agcgaaacaa 660 gaaatgcatg cggtgtacca gtctaaaggt aaaggttgga agccaaataa gggtgtggac 720 aaacccagca cgagtacgga gaaatctaat tgctaccgtt gcggagacaa atcacatttg 780 gcaaatgcat gtaagtacaa aaacataatt tgttcgtttt gtaaaacaaa gggacatttg 840 gcgaaagtgt gttttaaaaa gaagaaaaat gaagcacatt cgacaggcag cgcggcggct 900 caaacaaatt acgtggaaca gcaaggcaat gacgcaacaa tagaaagcac aaacatacga 960 gatgtgtgta cagttggttc atctacacat tgcaagaaac tgtggttaaa tatgaaggtg 1020 aacaaaaaga caattaaatt cgaaatagat acaggttcac cagttacgat cattagtgaa 1080 gaagatcaaa aaaggctttt tcctgacgca cagctacgtg cgtgcacaac caaccttgta 1140 agctattgta acactccgat cgacgtatgt ggcatacttg acgtagatgt acagttagca 1200 agacatacat taaagctgcc gctgtatgta gcgaaaacta cgaaacatcc cttggtaggc 1260 cgtgaatggt tgtcagaaat accactagac tggaacaagg ttgtggcgga ttcgaaatcg 1320 gttaacaaga ttgaatcgga ttctatgtgc agtgtcgcag attgcaacgc attgcttgag 1380 cgatatccaa 
gagtttttga agcttcaatc ggtcgcatct ccaatgtgca ggctaacttg 1440 cgactaaagg aaaatgcacg accagttttc atcaaggcac ggaaactgcc atttaacatg 1500 ataaaagtgg tagaaaatga attggacaag ttggttcagg aaggtgtact ggtaaaggta 1560 gattcaagtg attgggcaac accaatcgta ccgataaaga aatcacagaa tcgggtgaga 1620 atttgtggcg attataaaca gaccgttaat ccgaatctgg ttgtagacag acacccccta 1680 ccgtcagtag atgaactttt tgcatcactt gcaggaggaa agaaatttag taagattgat 1740 ctggtacaag catatttaca gctggaagtg gcaccggcag atcgggaaat actgacatta 1800 tctacacatc gtggtctgta tcgacctaat agattgatgt atggcattgc atcagctccg 1860 gcaatctggc aaagacaaat cgaatcaatt ttacaaggta ttgaaggagt cagtgtcttt 1920 ctagatgaca tcaaaatcac aggggaaacc gatgaaatac acctttttag actcgaagaa 1980 gtgctacgtc gactcaatga acgggaaata cgcgtcaaca aagaaaagtg tgaattcttc 2040 gttgatcaaa tagagtactg tggctacgta attgatgcga agggcataca taaaattcag 2100 aaaaaagtcg acgccataca agaaatgccg aaaccacgaa acaaggacga ggtacgctcg 2160 tttgtcgggt tagtaaatta ttatggtaga ttcttgcaga atctgagtac catactttat 2220 ccattgaaca atctactcaa aaccgacgtg ccatttcaat ggaataaaca atgtgaagat 2280 tccttcaaga aggtaaagga acaaatgcag tcagataatt gcttagtaca ctactctacg 2340 gagctaccgt tgctactggc cactgatgct tctccctacg gggtcggagc agtactgagt 2400 cacatttacc ccgatggttc cgagcgtccc atacagtttg cttcacaaac attaaatcga 2460 gtacagcaga aatacatgca agttgacaag gaggcctatg ccatcgtctt tggtgttagg 2520 aaatttttcc aataccttta cgggcgcaca ttcactttac ttacggacaa ccaagctata 2580 tcgaaaatat ttggagagca caaagggttg ccggtgatgt ctgcattacg gatgcaacat 2640 tacgctacgt acttgcagag tttcgattac aagatccgat ttaggaagtc taccgaccat 2700 gcaaacgccg atgccatgtc acgcattccg ttacatatta ctgatccgga taacgagatt 2760 gaagaatcgg attcgattga gctaagtcaa attgaaacgt taccgttgac agctgttgaa 2820 ttagagcaag cggttgctaa agatcaacag gttcaaaaac tgattcaagg aatcaaacat 2880 ggccgggtag tagaagcgaa agatcgtttt ggagttgatc agcaagaatt tgccattcaa 2940 aaaggatgtt tgctacgggg tatccgagtc tatgtgccag aaatcttacg cagaaaagtg 3000 ttggacgaat tacattcagc acattttgga attactcgaa ctaaatcgct tgcaagaggt 3060 tactgttggt ggccgggaat 
ggaccaggat atagaacgaa tggtagcaaa ctgtactgaa 3120 tgccaattga ttcgagcgga accagctaaa atgaagttac attgttggga gacaccaagt 3180 gctccatttc aacgagtaca cgttgatttc gccggtccat ataaggactt atattttttc 3240 atcttcgtcg atgcttacag caaatggccg atggtaaagg tttgcaaatc aataacggca 3300 gaacaaaccg taaacatgtg tcgcgagttg tttagcacat tcggaattcc atcagtactg 3360 gtaagcgacc atggcgtaca atttacttct acgctgttcc aacaatttct taaaatgaat 3420 ggcgtggtgc ataagatggg tgcaccttat caccctgcta cgaatggcca ggcggagcga 3480 tatgtgcaaa caattaagaa caagctgaaa gcattaaaat gctccacatc gaagatcaat 3540 ctggaattgt gcaacatttt acttacctac cgaaaaacag tacacccagc gacgggaaaa 3600 tcaccatcca tgcttgtttt cgggaggcaa attcgatctc ggttagacct attgattcct 3660 acgcaagagt catcaaggaa ggcagagata ccagtacgta gtttccagat cggagcaaga 3720 gtgcgagtac gagatttcct gtcccacgat aaatggaaat ttggaaaaat ttctgcgcga 3780 gttggaaagc tgcgatacga agtgaaacta gatgacggtc gtgtctggga acgtcacatc 3840 gatcacattg cggaggtggg tgctgatcta cgaggtactt cggcggtcaa ccaggagagc 3900 gatttacggg aacaaggcga tacaacgtac tttgaagagg tacttccttc cacttcgaat 3960 gcgaatacgg cggttactac cagcgacagt gagataaccg tggaaaatac accagggtct 4020 tctgatcgag atgccgatag acctgtagaa cggaataacg atccgggcat ggaacgggat 4080 ctgacaccag cgcctgtccg accgatcaac ttggaagagg agtcggtcct acgccgttct 4140 actcgtgcta gaaaacctcc ccaaagattg aatttgtaaa tggaacgatt tttttttcaa 4200 ttacgaaggg gagaag 4216 // ID BEL6-LTR_AG repbase; DNA; ANG; 487 BP. XX AC . XX DT 08-MAY-2003 (Rel. 8.04, Created) DT 08-MAY-2003 (Rel. 8.04, Last updated, Version 1) XX DE BEL6-LTR_AG is a long terminal repeat of the BEL6_AG LTR DE retrotransposon - a consensus sequence. XX KW BEL; LTR Retrotransposon; Transposable Element; 5-bp TSD; KW BEL6-I_AG; BEL6-LTR_AG; BEL6_AG; Bel clade; KW reverse transcriptase. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-487 RA Kapitonov V.V., Pavlicek A. 
and Jurka J.; RT "BEL6_AG, a nonautonomous family of Bel/Pao-like LTR RT retrotransposons from African malaria mosquito."; RL Repbase Reports 3(4), 72-72 (2003). XX DR [1] (Consensus) XX CC BEL6-LTR_AG flanks an internal portion of BEL6_AG (deposited as CC BEL6-I_AG). XX SQ Sequence 487 BP; 133 A; 90 C; 167 G; 97 T; 0 other; tgttccattt cgatggaacg agcgagatgc gcggtctaag gaacgggcga gaatggcggt 60 ctgaggaagg agagagaggt gcggggtgag gagcgaacgg aaagctgtcc gagtggcatc 120 tgttttggcg acaattgtct ggaacggcgg cacccgcaat tgttgttcga tctcatcgac 180 accggccaga attccgcgaa acccaccgta gaatgcaagc gggcggcttg tgtggtcgga 240 cggaccagag aagcgaggca agcgagttgc ggctgttcgc cggaccacgt taagggggaa 300 cagagaggga atgaagagat aaggagagag tgagagaaga gagggaatta tgaaatttat 360 ttgtgaccaa gtggagtggg tcagtctgat gaaaaccgcg aatgagttaa gtcgcaacca 420 ataaagttgt gcgtacagga ttaccgtgtt cagtacattg cgttaggtta tccgctgaga 480 tatcaca 487 // ID GYPSY35-LTR_AG repbase; DNA; ANG; 232 BP. XX AC . XX DT 16-APR-2004 (Rel. 9.03, Created) DT 16-APR-2004 (Rel. 9.03, Last updated, Version 1) XX DE GYPSY35-LTR_AG is an LTR of retrotransposon GYPSY35_AG - a DE consensus sequence. XX KW Gypsy; LTR Retrotransposon; Transposable Element; KW 4-bp TSD GYPSY35_AG; GYPSY35-I_AG; GYPSY35-LTR_AG; Gypsy clade; KW MDG3 lineage. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-232 RA Tubio M.J., Costas C.J. and Naveira F.H.; RT "GYPSY35_AG, a member of the MDG3 lineage of the Ty3/gypsy group RT of LTR retrotransposons in Anopheles gambiae."; RL Repbase Reports 4(3), 63-63 (2004). XX DR [1] (Consensus) XX CC GYPSY35-LTR is a long terminal repeat of GYPSY35_AG (its internal CC portion is deposited as GYPSY35-I_AG). 
XX SQ Sequence 232 BP; 85 A; 35 C; 69 G; 43 T; 0 other; tgtaagaaag gaaaagcgct agtggtagtg agagatagaa ataacggtag cgagatcagg 60 agaacgagag cgtactagga acgagttgcg agagagagaa cgcgcagcgt taaattcggg 120 agcgcgggtt aagcgcgagg agttgaattc ggaccgcgaa gcgaataaca tgttgtgaac 180 taaagtaatc aataaagcgt tatttgtctt aaaccgaata aaaaactcca ca 232 // ID TSESSEBEII repbase; DNA; ANG; 2026 BP. XX AC U89802; XX DT 10-SEP-2005 (Rel. 10.09, Created) DT 10-SEP-2005 (Rel. 10.09, Last updated, Version 1) XX DE Anopheles gambiae transposon Tsessebe II. XX KW Mariner/Tc1; DNA transposon; Transposable Element; TSESSEBEII. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-2026 RA Grossman G.L., Cornel A.J., Salazar-Rafferty C., Robertson H.M. RA and Collins F.H.; RT "Tsessebe, Topi and Tiang: Three distinct subfamilies of Tc1-like RT transposable elements in the malaria vector, Anopheles gambiae."; RL Unpublished (2005). XX DR EMBL/GenBank/DDBJ; U89802; Positions 1 2026. XX CC The coordinates are based on imperfect TIRs determined during CC annotation for RU (by JJ). 
XX FH Key Location/Qualifiers FT CDS 1089..1652 FT /product="TSESSEBEII_1p" FT /translation="MFSDEEDWRRVLWSDESKFNRQGSDGRRLVRRPVNCA FT LDPKYCFKTFKHGGGSLMVWGCFSYYGMGPLVRIHGNLNRFGYRDILDTHL FT LSHARKNLPRSWMFMQDNDSKHTSGTVQTWLADNNVKTMKWPALSPDLNPI FT ENLWAIFKKRLGKNIPEDLDHLFDHMQEVWSKIPPETIQNLVMSMRQRD" XX SQ Sequence 2026 BP; 566 A; 461 C; 481 G; 518 T; 0 other; tacagtatcg acattaagat agaaccctgg tgttttcaac gcaccgtgca ctttgttttc 60 aacgcaccgt gcacttgttt tgccacgtta cttgtgtttg tgtgtgtgtg tgtgtgtgtg 120 tgtgtgtgtg tgcgttcggg tttttgtgtg aagaaacgta gcttgacatg gtttcatatt 180 gcgtggcaag cattctgttt atgtgtgtta tcatgtgaaa cagcgtgtgt ttacacttgg 240 ataaaagcta aagtttgcat cgacagcagc gcttgttcct cggacgagaa cgcaggtgtg 300 ctgtttcatg aatgtgaatc gacaccggtg cgaaatgaaa acgcgcacaa taacaatgaa 360 cttggcattc cctcacagtt gtttcaccag tgcacgccat tgaaaggcat gcgtgcatct 420 ctgcgttaaa cgttgtgtgc gcatgggaaa ctttgtgtgt gcattgagca agagcgcagg 480 tgttcttttc aatgggcgtt tttcgttggc ctgttctggt tttgaatggg tagatgctca 540 ctcgcttact cattgatagt ttttatcaat acgctttttt gctgctcgac aacgatgtga 600 ttcatcgcgt ataaaagcag aacgcaggca gtgcacggca tcagtcgtgt ccaagtttta 660 accagtgtgt tcttgacggt ctaagcaagc aattttagta acaatgggtc gaggtaaaca 720 ttgtaccgat tatcagcgcc atatcattaa gcgaatggcg gctgctggca ttaagcgcag 780 aacgatcgag ttcgtgatgg aacgatcgcg cacctttgtg gcaaacgcgc ttcggacaac 840 cgaaacgcgc aagtcacccg gacgtcctcg taagaccacg gaggaggaag accgtaagat 900 cataaacatc gaagaaaaat ccgtttagct cggcaccaga gatccgaggc agactgaaat 960 tgacggtgac gccgagaaca gttcagcgaa gattggtcag tgccgggcta ctcgcgcgag 1020 tgccacgcaa agtgcccaat ttaacacctg cgcataaaaa cgcacgggtg ttatttgcaa 1080 ttgagcacat gttctccgat gaggaggatt ggcgcagggt tttgtggtca gacgagtcga 1140 aattcaatcg acaggggtcg gacggacgca ggttggttcg gcgccctgtt aattgtgctc 1200 ttgatccgaa gtactgcttc aaaactttca agcatggcgg tggtagtctc atggtgtggg 1260 gatgcttctc ctactatggg atgggtccgt tggttcgaat ccacggtaat ttaaatcggt 1320 ttgggtatcg agatattctt gatactcatt tgctatcaca cgctaggaaa aaccttcctc 1380 ggtcttggat gtttatgcaa 
gacaacgatt ctaagcatac ttccgggact gtccaaactt 1440 ggcttgctga caacaatgtc aaaacaatga agtggccagc ccttagcccg gatctcaatc 1500 ccattgagaa cctctgggca attttcaaga agaggctggg taaaaacata ccggaagatc 1560 tggatcatct cttcgatcac atgcaagagg tgtggagcaa aattccacct gaaacgatcc 1620 agaatctggt gatgtcgatg cgtcaacgcg attaatgtca tcaagggcga cggcaataag 1680 atcaaaaact gagcttattg catttttgaa tagggatcac tgctcgcgag ggcgtggtcc 1740 tgcaaattac aacaaaaacc taacccaaaa cacaaaaaaa ctactgacaa atctatccga 1800 aacgacctac ttgtgaaaca aacacacaca tcaatacaca ttcaaacata catacacaca 1860 cacacacaca cacacagatg catacacaca cacacattta cacacacaca tacacacaca 1920 tgcacacaca cacacaaaca caaacacaca aaccacacat aaacctgatc caacggtgca 1980 tgctgcgtta aaacactagg gtcctatcat tctgttcgat actgta 2026 // ID GYPSY10-I_AG repbase; DNA; ANG; 6242 BP. XX AC . XX DT 03-OCT-2003 (Rel. 8.09, Created) DT 03-OCT-2003 (Rel. 8.09, Last updated, Version 1) XX DE GYPSY10-I_AG is an internal portion of retrotransposon GYPSY10_AG DE - a consensus sequence. XX KW Gypsy; LTR Retrotransposon; Transposable Element; 4-bp TSD gag; KW AP protease; GYPSY10-I_AG; GYPSY10-LTR_AG; Gypsy clade; RNase-H; KW integrase GYPSY10_AG; mdg1 lineage; reverse transcriptase. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-6242 RA Tubio M.J., Costas C.J. and Naveira F.H.; RT "GYPSY10_AG, a member of the mdg1 lineage of the Ty3/gypsy group RT of LTR retrotransposons."; RL Repbase Reports 3(9), 162-162 (2003). XX DR [1] (Consensus) XX CC GYPSY10_AG is a family of Gypsy-like LTR retrotransposons CC that, according to the amino acid sequence of its ORF2, is CC phylogenetically grouped with Drosophila representatives CC of the mdg1 lineage. CC GYPSY8_AG, GYPSY9_AG, GYPSY11_AG, GYPSY12_AG, GYPSY13_AG, CC GYPSY14_AG, CC GYPSY15_AG, GYPSY16_AG and GYPSY17_AG are other CC members of this same lineage in Anopheles gambiae. 
CC The GYPSY10-I_AG consensus was reconstructed after multiple CC alignment of 7 copies. CC The consensus encodes the 438-aa GYPSY10_AG1p gag-like protein CC (pos. 1258-2571) and the 1250-aa GYPSY10_AG2p (pos. 2445-6194). CC The sequence of the LTRs flanking GYPSY10-I_AG is deposited as CC GYPSY10-LTR_AG. XX FH Key Location/Qualifiers FT CDS 1258..2571 FT /product="GYPSY10_AG1p" FT /translation="IIYNSFYMTKLKGIVRRLELINKTLQQNQGRVRQCAL FT TTYRLQVDEIYSVFRKEVGTNYDKYDETEIKFYNNIIQNLITNIIERINKE FT TTNNSDDLNETLKTSKKLTLKTTTHIIISLLAAYKRQKQFVPIAHEAIKTN FT TSVDANNLPNMDLLKIINTATNAIPTFSGRDDEARAMLAALEMLKESVDEQ FT HHRIIVQAVESKLKGKGRKIIGTTVNRVDEIINKIKTHLKKTESPEDIASA FT IHATKQKTTPKDFGEEIQALTEELERAYLNEKMDPELASEKTKKIAMTAFG FT KGLKKELHQGLVLSGTIPSLEAAIKAIDIMDKLSQNSQERKTNKWQNGQDN FT RYNSSNQRQPNNPRNNEQRSNNNWRAQQPQTGRQGNAQNRQSTQGTGYNTQ FT NRQSASPNFLGQREPQRAVHCTQVETQNPQVDQPSTSQYGQHTQ" FT CDS 2445..6194 FT /product="GYPSY10_AG2p" FT /translation="AIRQPKFFRATRATKSSALHSGGDPEPTGGPALHKPI FT RATYTINVQKSNFIKTRLGLADSICNLFVDSGSDISIIKGNKVRPTQTYKP FT KDIVDIISVGEGTITTHGSTITDVIVEGKKIQQLFHIVPDNFKIPADGILG FT RDFFMDHRCIINYDTWIFSVKHDGEFLETPIEDTINDKTLIPPRCEVIRKL FT DKLKELDTDAVVCAEQLQEDVLVGNCIVNKNYPFIKIINTSNKAKLVNISH FT IKTIPLNEFEIVKTNNHKNENRLAIIKELIRKENISEDTDTSFEQLLLSYN FT DIFHLPNDHLTTNNFYKQDIKLEDKRPVYIPNYKQNHSQGPEIKKQIEKML FT QDDVIEHSVSHYNSPILLVPKKSSDEKKWRLVVDFRQLNKKLLPDKFPLPR FT IDSILDQLGRAKFFSTLDLMSGFHQIPLEESSKKYTAFSSTDGHYQFKRLP FT FGLNISPNSFQRMMTIAMTGLTPECAFVYVDDIVVVGASENHHLKNLEKVF FT DRLRHYNLKLNPEKSCFFKKEVTYLGHKITDKGILPDDSKYDSIKNYPIPQ FT NADDVRRYVAFCNYYRKFIPNFALKAKPLNSLLKKNTKFEWTQECQEAFEY FT LKNTLISPQVLQYPDFSKPFILTTDASTMACGAVLAQEHGGKDMPICFASR FT TFTKGEANKAIIEKELAAIHWAIMHFKHYLYGKKFTVKTDHRPIVYLFGMK FT NPSSKLTRMRLDLEEFDFTVEFVKGKQNVVADALSRIKITSDEIKSINVIT FT KSMSKPVTSDNVLGNTSESDQLKMFHALAYDEVKDLPKLESSVKKNEDTIE FT LIGKILNKRKSKELLSVRDIHLKTDIGLQEPLLVKDFQRRKEKSAIVQFIK FT NIEKKLVMKSITQLAISETDEIFKEVHPNELKQIANNHLKNIQILIYTKPQ FT KITNEKTINDILDKVHNTPTGGHIGQYKMYKKIRREYSWNKMKKTIKEFLD FT 
KCLTCKLNKHQTKTAEPFVKTDTPNTPFEAVSIDTVGPFQKTNNNNRYAVT FT IQCNLTKHVTVIAIPNKEANTVARAVIEKIMLIYGTNIKKFRTDMGTEYKN FT EIFKNISEILKIEHKFSTPYHPQTIGALERNHRCLNEYLRIFTNEHKDDWD FT DWINYYSFAYNTTPNLDHGYTPFELVFGRNERITPNVKDTYSPLYNYDDYS FT KEFKYRLKIAHNRTRKHIEQVKFKLLKEQQNINQVNFEIGDQIALTNENRT FT KLDPVYKGPYKVKEINGPNMIIENTEGVVQKIHKNRAIKL" XX SQ Sequence 6242 BP; 2560 A; 1140 C; 1104 G; 1438 T; 0 other; tggcgaccgt gacagcgtag caagaacaaa gctacaagaa cattaaagtg gagtaaaagc 60 taaagtgcag tgctgtgtaa aaagcaagtt acgcgcaaag tgcggtgcgg aaaaatcgag 120 tgaaaaagtt aaagtacaat actgtaaaaa gcaagttatg cgcaaagtgc tgtgcaggaa 180 aattgacagt gcccgaaaaa aaaaataaac cctttaagca tatacattac aagtgaataa 240 taagcaagac tgaacaagtg gataaaaaga actaaaagtg cagtgagaat aagtgtatat 300 aaagatcgat taacgatcat taggccatca gaagtcacct gcccaatgta catccaacac 360 tcggacgctg cgataacagt aaagcacccg agcgacggtt ggatcagcga cgaagacaac 420 acggattttg acggaccgtc agagccagag gacccagcga tcgtagaggc acggcgtctt 480 acgtggaaag ccatcgactc cgtggagcaa accacaaacc taaccccaga ggaggagttc 540 gaggcgaagc agccctaccg gccaacacac gaagacatct gcagttgcat ccgggacaca 600 aacgcgacgc tgaggatcct ggtcaaccta ggcgtggagg taccgtacga tcttttagaa 660 gaacttacag cggcggtcag ttgctatatt gagggagtag tcgacacgca attgggtcta 720 gaactcattt cagacgtcga ctacgatctc aacctagtag cttgcggtaa gtcttaccag 780 cacgcctgga gaacaacggt gttcgacaga ctagaagcgc agctaaggcg ctaagaagaa 840 tgtgcttacc tcgggaaccg aaacctaaag ctgtagtcct cacggaggag gaatgatttc 900 ctgtaatcca agagataagg caactcaatg atttcctgga tcattatgag gattaggaaa 960 ataaaataac attatgcatt gaaaaaaaaa taatattctt cttctgaaag caaatttgaa 1020 agaaaatatt tttttccact ttgaagattt ttcaccttga aaaaaaaaaa aaaaaaaaaa 1080 aaaatatata tatatatata tatatatata tatatatata cgtatatata tttaataata 1140 atatttttcg tctaaaagta atttggaaag aaagaaaaga tttttacttc aacggttttt 1200 tcaccttgga aaaaaaatat atatatatga tagtaatata tatttttttt taactgaata 1260 atatataatt cattttacat gactaaactg aagggtattg taagaagatt agaattaatc 1320 aacaaaacat tgcagcaaaa tcagggaaga gtaaggcagt gtgctctaac cacatacaga 1380 
ctacaggtgg acgaaatata ctccgtattt aggaaagaag tagggacaaa ctacgacaaa 1440 tacgacgaga cagaaattaa attttacaat aatatcattc aaaatttaat cactaacata 1500 atcgaaagaa ttaacaaaga gacaactaac aactcagacg atttaaacga aactttgaaa 1560 acaagtaaga aactaacatt aaagaccaca actcacataa tcatatcact tttagctgca 1620 tacaaaagac aaaaacaatt cgttcccatt gctcacgaag caataaaaac aaatacaagc 1680 gtagacgcaa ataacctacc caacatggat ctattaaaaa tcatcaatac ggcaaccaat 1740 gccataccaa cattcagcgg tagagatgac gaagccaggg caatgctagc tgcattagaa 1800 atgctaaaag aatcagttga tgaacaacat catcgcataa tagtgcaagc tgttgaatca 1860 aaactaaagg ggaagggaag aaaaatcata ggtacaacgg ttaatagagt cgacgaaata 1920 ataaacaaaa tcaagactca tttgaaaaaa accgaatcac cagaagatat agcctcagca 1980 atacatgcca caaagcaaaa aaccacaccg aaggattttg gtgaggaaat tcaagcactg 2040 acagaagaac tagaacgggc atatctcaat gaaaaaatgg atccagaatt agcttctgaa 2100 aaaaccaaaa aaatcgctat gacagctttt ggaaaaggac taaaaaagga actacatcaa 2160 gggctagttt tgtcaggtac aatccccagc ctagaagctg ctatcaaagc aattgatatc 2220 atggacaaac taagtcaaaa ctcacaggag agaaagacga ataaatggca gaatggtcag 2280 gacaacagat ataatagttc caaccaacga cagcccaata atccaagaaa caatgaacaa 2340 cgctcaaata acaactggag ggcacaacag ccacaaaccg ggagacaagg gaatgcccaa 2400 aatagacagt ctacacaggg aacagggtat aatactcaaa ataggcaatc cgccagccca 2460 aattttttag ggcaacgcga gccacaaaga gcagtgcatt gcactcaggt ggagacccag 2520 aacccacagg tggaccagcc ctccacaagc caatacgggc aacatacaca ataaatgtgc 2580 aaaaatcaaa ttttataaag acaagattag ggctagcaga ttcaatatgc aacctatttg 2640 tagattcagg ttccgacatt tctatcatca aaggcaacaa agtaagacct acacaaactt 2700 acaaaccaaa agatatagtg gatatcataa gcgtaggaga aggaacaata accactcatg 2760 ggtccacaat tacggatgta atagtggagg gaaagaaaat ccaacaatta tttcacatcg 2820 taccagacaa cttcaagata ccggcagatg gtatacttgg tagagatttt tttatggatc 2880 accgatgtat aataaattat gatacttgga ttttctctgt aaaacacgat ggagagtttt 2940 tggaaacacc cattgaagat actatcaatg acaaaacact catacctccc agatgtgaag 3000 taattagaaa actagataag ttaaaagaat tagatacgga tgcggtagta tgcgcagagc 3060 aactgcaaga 
agacgttctt gtaggaaact gcattgtaaa taaaaactac ccatttatta 3120 aaataatcaa tacttctaat aaagctaaat tagtaaacat tagccatatc aaaacaatac 3180 ctttaaatga atttgaaatc gtaaaaacta ataatcataa gaatgaaaat aggttagcaa 3240 tcataaagga attaatccga aaggaaaaca tttccgaaga tacagataca tcttttgaac 3300 aattactgtt aagctacaat gatatttttc acctacctaa tgatcattta actacaaata 3360 atttttataa acaagatata aaattagaag ataaaagacc cgtgtacata ccgaattata 3420 aacaaaatca ttcccaagga ccagaaatca aaaagcaaat tgaaaaaatg cttcaagatg 3480 atgtaataga acactcggtg tcacattaca attcacccat cttactagta ccgaaaaagt 3540 cctcagatga gaaaaaatgg agattagtag tcgattttag acagcttaac aaaaagctgc 3600 tccctgataa atttccacta cctagaatag actccatatt agatcagcta gggcgagcaa 3660 aattttttag cacattagat ctcatgtcag gatttcatca aataccactg gaagaatcat 3720 ctaaaaagta tacagctttt tcaagcacgg atggtcacta tcaatttaaa cgattacctt 3780 ttggattgaa catttctcca aatagttttc aacgaatgat gaccatagcc atgacaggcc 3840 tcacgccaga atgcgctttc gtatacgttg atgatattgt agttgtagga gcttcggaaa 3900 atcaccatct aaaaaattta gaaaaagttt tcgacagact aagacactac aacctcaaac 3960 taaacccaga gaaaagttgc tttttcaaaa aggaagttac ttatcttgga cataagataa 4020 ccgacaaggg tattcttcca gatgactcga aatacgatag cataaagaat tacccgatac 4080 cacaaaacgc agacgacgtg agaagatacg tagcattctg caattattac agaaagttca 4140 tcccaaattt tgctttgaaa gcaaaaccgc taaacagtct tttaaagaaa aatacaaaat 4200 ttgaatggac acaggagtgt caagaagcat tcgaatattt aaaaaacaca ctgattagtc 4260 cacaggtttt acaatatcct gatttcagta aaccatttat actaaccaca gatgcttcaa 4320 ctatggcatg tggagctgtt ttagcacagg aacacggcgg caaagatatg ccaatatgct 4380 tcgcgagtag aacctttacg aaaggggaag cgaataaagc aataatcgaa aaagaactag 4440 ccgcaataca ttgggctata atgcatttca agcattacct atacggtaaa aagtttaccg 4500 tcaaaacgga ccatagacca atagtctatc tgttcggtat gaaaaatccg tcatcaaagc 4560 tgacgagaat gagattagat ttggaagagt tcgattttac agtcgaattt gtaaaaggga 4620 aacagaacgt tgtagcagac gctctatcgc gaattaaaat cacctcagat gaaataaaat 4680 ctatcaatgt gattacgaaa agcatgagca agcctgttac ttccgataat gttttaggaa 4740 acacgtcaga gtctgatcaa 
ctcaaaatgt tccatgcctt agcatacgac gaagtaaaag 4800 acttaccaaa actagaatca tcagtaaaaa agaatgaaga cactatcgag ttgataggaa 4860 aaatcctaaa caaaagaaag tccaaggagc tcttatcagt aagagacatc catctgaaaa 4920 cagatatagg actgcaggag cctttattag taaaggattt ccaacgaaga aaggaaaaat 4980 ctgccatagt gcaatttatc aaaaacatag aaaagaagct cgtaatgaaa agcattaccc 5040 agctagcaat ctctgaaaca gacgagatat tcaaagaggt gcacccaaac gaattaaagc 5100 aaatcgctaa caatcatttg aaaaatattc agatactaat atatactaaa ccacaaaaga 5160 ttactaacga aaagacgata aacgacatac tcgacaaagt gcacaacacg ccgacaggag 5220 gacacattgg acaatataaa atgtacaaga agattagaag agaatattca tggaacaaaa 5280 tgaaaaagac aatcaaagaa tttttagaca aatgcttaac atgtaagctt aacaaacatc 5340 aaacaaaaac tgcagaacct tttgtcaaaa cagacacacc taacactccg tttgaagcag 5400 tatcaattga tacagtaggc ccatttcaaa aaacaaacaa caataaccga tatgcagtaa 5460 caattcaatg taatttaacc aaacatgtaa cagtcatagc aattcccaac aaagaagcga 5520 atacggtagc tagagcggta atagaaaaaa ttatgttaat atacggcaca aatataaaga 5580 aattcagaac cgatatgggt acagaataca agaatgaaat atttaaaaac atatcagaaa 5640 tcctcaaaat agaacacaaa ttttcaactc catatcaccc acaaactata ggagcattag 5700 aacgaaatca tagatgcctc aacgaatacc ttagaatatt tacaaacgaa cacaaggacg 5760 attgggatga ttggatcaat tactattcat tcgcatataa cacaacacca aatttagacc 5820 atggttatac accatttgag ttagttttcg gaagaaatga aagaataaca ccgaatgtga 5880 aagatacata ttcaccttta tataattatg atgattattc taaagaattc aaatacagat 5940 taaaaatagc tcataatagg actagaaaac atatagaaca agtaaaattc aaactattaa 6000 aagaacaaca aaacattaat caagttaatt tcgaaatagg agatcaaata gccctgacaa 6060 atgaaaatag aaccaagtta gatccggtat acaaaggacc atataaagta aaagagataa 6120 atggacctaa catgataatt gaaaacacgg aaggtgtagt acagaaaata cataaaaata 6180 gagcgattaa attatgacag aataacttca cttcattacg ttattcttcc gaagggtgga 6240 gg 6242 // ID RETRO6_AG_LTR repbase; DNA; ANG; 410 BP. XX AC . XX DT 06-FEB-2003 (Rel. 8.01, Created) DT 06-FEB-2003 (Rel. 
8.01, Last updated, Version 1) XX DE Anopheles gambiae long terminal repeat from RETRO6_AG DE retrotransposon - a consensus. XX KW BEL; LTR Retrotransposon; Transposable Element; DIVER2; KW Long terminal repeat; RETRO6_AG_I; RETRO6_AG_LTR; KW retrotransposon. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-410 RA Jurka J. and Drazkiewicz A.; RT "RETRO6_AG_LTR: LTR retrotransposon from Anopheles gambiae."; RL Repbase Reports 2(12), 15-15 (2002). XX DR [1] (Consensus) XX CC Related to DIVER2 from Drosophila melanogaster. 5 bp target site CC duplication. XX SQ Sequence 410 BP; 126 A; 73 C; 119 G; 92 T; 0 other; tgttccgttg ataacggaag atgaagagga gagagagaga gaaggaaggc atcgtgccac 60 acggaacggg cgacatttgt agttgacggt aggcacgtac gtatcgatat aataagtcga 120 gcgaaaacgc cattcggtgt attggtgtac gtgtagtctt gtttggcaag caagtgtggg 180 agcaagagca cgaggctaga aacaggcgag caggtagaaa cgcgtatgcg atgaatttat 240 actggtagtg gtaagagcga gcggtgaatt tcgaattgaa acggcattct gtttgtagca 300 agcggcttag ccagacgcaa acgcaagtag cggtgttacg ttagtcaata aagcgacaca 360 aaaaacaaac ttttctttcg tatccactgc ttcccagagc aaattatcca 410 // ID NotoAg1 repbase; DNA; ANG; 5540 BP. XX AC AB090823; XX DT 14-SEP-2005 (Rel. 10.09, Created) DT 24-SEP-2010 (Rel. 15.1, Last updated, Version 2) XX DE Anopheles gambiae retrotransposon NotoAg1 DNA, complete sequence. XX KW R1; Non-LTR Retrotransposon; Transposable Element; NotoAg1. XX NM NotoAg1. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-5540 RA Kojima K.K. and Fujiwara H.; RT "Evolution of target specificity in R1 clade non-LTR RT retrotransposons."; RL Mol Biol Evol 20(3), 351-361 (2003). XX DR EMBL/GenBank/DDBJ; AB090823; Positions 1 5540. 
XX FH Key Location/Qualifiers FT CDS 552..1838 FT /product="NotoAg1_1p" FT /translation="MREAIRQLTNSLAEANARNERINEELTQMRILMTKQQ FT EYTERRELIAREEMEKMRAAHERDRTALNKLLMQGAGTSSHRAAATPTTPT FT PQPRRMQQHQEKQRQPPQQQHQQIGPSTSAAPPQLLVSGASFDPEGDDGQG FT SFAEVVRHKWGRNTGKPRGYQQQQSQSHRQVVIGTQQECLQPEQQHQRQQQ FT HTVRRHNVDKVEVIPGENQTWETVYQMVKDAIKFDPAHKDLADHVVIGRRT FT HAARLRIQLSCTADSTLMLQEVQQIIGNAGIARVITEMGEILITHIDPLAS FT EEDLKEAVDRKLQASAGVTKVSMWQLSDGTKRARVRLPAKAAKQLVGQKLT FT VSCCISNIKEAPAINLQQQRCYRCLERGHIARECRSPVDRQKACIRCGAEG FT HLAKDCNAEVKCAVCSGPHRVGHSDCVRPMLRCPH" FT CDS 1829..5290 FT /product="NotoAg1_2p" FT /translation="MSSLRVLQLNVDHCREGQGLALQSAREHRADVLILSD FT MFTPPNNNGRWEFDASRKVAIVATGSYPIQRVWGSTVPGLVAAKVAGIDFI FT SVYAPPSLSPQEYERLLEAVELEASSHSHVVIAGDFNAWHTEWGSRRNNLR FT GEELLQMVEVLGLSILNNGSAPTFIGRGAARPSVIDVTFATPSLVLHDTWE FT VLDFARSDHQLIRFETNSPALAARRVKLSQRNRSQQRSPRRDPPINRQHTP FT CAGRRWKTKQFRENSFLLALKDVNFAEQAVTDADIVEMMTRACDEVMQRAN FT HLSSNPYRDLYSWSPELERLRGICLAARERLRLITDLQERSFVAADHRTAK FT RNLEKAIRASKRQQIDALIDTAEDNEFGGGYRVVMSTLRGSRVPQEKDPIE FT LGRIVSDLFPNHPPVPWPYVSDVTQGEASVDDVTPRELQDIAHQMATRKAP FT GLDGIPNAAVKAAIGMYPDVFCRMYQDCLTRGTFPSEWKRQRLVLLSKTGK FT PPGESSSYRPLSMLDALGKVLEQLILNRLNKHLEDPDSPRLSDAQYGFRRG FT RSTFSAIQRVVDAGRRAKSFRRTNHRDKRCLMVVALDICNAFNTASWQSIA FT DALRNKGVPSALLNIIGSYFEERKLIYNTSAGPVERHISAGVPQESILGPT FT LWNVMYDGVLGVELPPGAELIGYADDLVLLAPGTTPAAAAVVAEEAVSAVD FT RWLREHHLELAHAKTEMTVISSLQQPPEDITITVGGTEVPFSRTLKYLGVR FT LHYNLSWVPHVKAVIQKATQIVQAVTQLMPNHRGPKTSRCRLLAAVADSTM FT RYAAPVWHGALTNRECRSLLKRVQRKAAIRVARTFRTVRYETAVLPAGLVP FT ICRAVAEDTRVHSRRGTGVSSSELRKEERQRTIEEWQTTWDADAAADNASR FT YVRWAHLVIPDVGAWQLRNHGEVTFHLSQVLSGHGFLREYLNKMRFTSSPA FT CPRCPGVVEGVEHVMFECPRFAEVRSELLDGVLPETLEAHMLQSPTNWSNV FT CEAAKRITSKLQRCWDDECAILAAQAMLEEPANRLDPEAVRCTRNDLRNVA FT RRTQTVRQREEQCGERPSMPSSSPRTSERRANIRARMARLRQRHRQHQQDE FT RRGVEGGDIERGESVYPELASSPNNRQGGLTSAEKAAAVEADVASR" XX SQ Sequence 5540 BP; 1448 A; 1446 C; 1621 G; 1025 T; 0 other; gggtagggct tgagctgtca aaatagtgct gacgtccaaa agtgcgctcc gtgttaaact 60 
aggaaattat agttaaaaac accagtatag tcctaaaaag aaacattggt agatgtgtgc 120 agtgatatag tatgtgcgca gcacgagtac cgagtggaaa gtggagtatt tttcattaaa 180 aagataaaag tgcgagggcc caagtctgga acaacaatac aacaatatgg cgtaaaccgg 240 gtgacattgg aaagacgtca cgcctaactg acaaaaaaaa aacaatccca aagggcaaaa 300 ctacatttaa agggggagct actagtaaaa gtgtcagaca agagtcggat aacgctcgga 360 taagaccaag aggcggacaa atagcaaggt gtgtaaattg aagtagcgtg atttcgacct 420 tcgggacggc gttacgttcg aatgccagca acgcaacagt gccgaagctt ggcaccggtc 480 ccgtggctgc aggacagcgg ctaagtagag gagctacgca gaagcaagca gcttcgccag 540 tgcttacgga gatgagagag gcaatcaggc agctcactaa ctcgttggcc gaggcgaatg 600 caagaaacga acgcatcaat gaggagctaa cacagatgcg catcctcatg actaaacagc 660 aggagtatac ggagcgccgg gaattgattg cccgggaaga aatggagaag atgcgtgcag 720 cacatgagcg cgaccgtact gcgctcaaca agttgcttat gcagggggcc ggtactagca 780 gccatcgggc agcagcaaca ccaacaacac caacacctca gccacgccga atgcagcagc 840 atcaggagaa acagcggcag ccgccacagc agcagcacca gcaaataggc ccatctacct 900 cggctgcacc gcctcaactc ctagtgtcgg gagcatcgtt cgacccggag ggggacgatg 960 gtcaaggcag tttcgctgag gtggtgaggc acaaatgggg gcggaatacc ggcaaaccgc 1020 gcggctacca gcagcagcaa agccagtctc accgacaggt ggttatcggc acgcagcagg 1080 agtgcctgca accggaacaa caacatcaac gccagcaaca gcatacggtg cgtcggcata 1140 atgtcgataa ggtagaagtc atcccaggcg agaaccagac ctgggaaact gtttaccaaa 1200 tggttaagga tgccattaag ttcgacccgg cgcataagga tctggcggat cacgtagtaa 1260 tcggccgccg cactcatgct gcccgtcttc ggatacagct cagttgcacg gcagactcaa 1320 cgctgatgct acaggaagtg caacaaatta ttggaaatgc tggtattgcg cgagtgataa 1380 cagagatggg cgagattctc atcactcaca tcgatcctct cgcaagcgaa gaagatttaa 1440 aggaggccgt agacagaaag ttgcaagcta gtgccggcgt tactaaggtc agcatgtggc 1500 aactgtccga tggcacaaaa cgagcccgcg ttagactacc ggcaaaggca gccaaacaac 1560 tcgtggggca aaagttgacg gtaagttgct gtataagcaa tatcaaggaa gccccggcca 1620 tcaatctcca acagcagcgc tgttaccgct gcctggagcg tggccatatt gctcgcgaat 1680 gtcgttctcc ggtcgaccga cagaaagcat gcattcggtg tggagcagaa ggccacttgg 1740 ctaaagactg caacgccgag 
gtgaagtgcg ccgtgtgcag tggtcctcat cgcgtcggtc 1800 acagtgattg tgtacgcccc atgctgcgat gtcctcacta agggtacttc aactcaatgt 1860 ggatcattgt cgggaaggac agggcctagc actgcaatcc gcgcgggaac atcgtgctga 1920 tgtcctgatc ttgtcggaca tgtttacgcc tcccaacaat aacgggcgat gggaattcga 1980 cgcatcgagg aaagtagcta tagtagccac cggctcgtac ccaatacaac gggtatgggg 2040 cagtacagtg ccgggactgg tggctgctaa agtggccggg atcgacttta tcagcgtcta 2100 cgctcctccg agcctatctc cacaggaata cgagcggctt cttgaggccg ttgagctgga 2160 ggcctcatcc cactcccacg tcgtgatcgc tggtgatttc aatgcttggc acacggaatg 2220 gggtagcaga cgcaataacc tgcgtggcga ggaattactg cagatggtgg aggtgctggg 2280 actctccatt ctcaataatg gcagcgcacc gacgttcatc ggcagaggag cagcaaggcc 2340 cagtgtcatt gacgtgacct tcgcaactcc gtcgctagta ctgcatgaca cctgggaggt 2400 actagatttc gccagatccg accaccagct gatccggttc gagaccaaca gccctgcact 2460 ggccgcaagg agagttaagc tttcccagcg gaatcggtcg cagcaacggt ctccccgccg 2520 tgatccacca atcaaccggc agcacactcc atgtgccggt aggaggtgga aaactaaaca 2580 attcagggaa aattctttcc tcctagcact caaagacgtg aacttcgccg agcaagctgt 2640 gactgatgcg gatatagtcg agatgatgac gagggcatgt gatgaagtga tgcaacgagc 2700 caaccacttg tccagcaacc cttatcgtga cctttactcg tggtctcctg agctggagcg 2760 gctacgtgga atatgtctag ccgcgcgcga gcggcttaga ctcatcaccg atctacaaga 2820 gaggagtttt gttgcagcag accatcgcac ggcgaaacgc aacctggaga aggcgattcg 2880 tgccagcaaa cgtcagcaga ttgacgcact gatcgataca gccgaggata atgagtttgg 2940 tggcgggtac agggtggtga tgtccacgct gcgcggcagt cgagtgccgc aggagaaaga 3000 cccaatcgag ctggggcgga tcgtgtctga cctgtttccc aaccacccgc cggtcccatg 3060 gccgtatgta agtgatgtca cccagggaga ggcatccgtc gacgacgtga ctcccaggga 3120 gctgcaggat atagcccacc agatggcaac aaggaaggca ccaggactag atggaattcc 3180 caacgccgca gtgaaggccg cgatcgggat gtacccggat gttttttgca gaatgtacca 3240 ggactgctta actcgtggca cgttcccgtc cgagtggaag cgccagcgcc tggtactgct 3300 ttcgaagacg ggcaaaccac ccggggaaag cagctcatat cggccgctga gcatgctcga 3360 cgcactcggc aaggtattgg agcaactaat cctgaaccgc ctcaacaagc atctcgagga 3420 cccggattca ccgcggttgt ccgatgccca 
atacggtttc cgccgaggac gctccacctt 3480 tagtgcgatc cagcgtgttg tagacgcagg gagaagggcc aagtcgttcc gtcgtaccaa 3540 ccatcgcgac aagcgctgtc tgatggtggt cgcattggat atttgcaacg cgtttaacac 3600 cgctagttgg cagtctatag ctgatgcgtt gcggaataag ggggtcccat cagcgcttct 3660 aaatataata ggaagctact tcgaggaacg caagctgata tacaacacca gcgcgggccc 3720 ggtcgagcgt cacatcagcg cgggagttcc acaggagtcc atcttgggcc cgaccctgtg 3780 gaacgtgatg tacgacggag tccttggcgt tgagctacca cctggggcgg aacttatcgg 3840 ctatgccgat gacctcgttt tgctggctcc aggcacaacg ccggcagcag cagcagtagt 3900 agctgaggaa gctgtgtcag cggtagaccg gtggctgcgc gagcatcact tggagctcgc 3960 acatgcgaaa acggagatga cggtgatctc tagcctgcag cagcctccgg aggacatcac 4020 catcaccgta ggaggtacag aggtgccgtt ctcgcgtacc ctcaaatacc tcggggtacg 4080 cttacactac aacctgtcgt gggttcctca tgtgaaggcg gttattcaga aggcaacgca 4140 gatagtacag gcggtcacac aattgatgcc gaaccaccga ggaccaaaga cgtcacgatg 4200 ccgcttgctt gcagcggtcg ccgactcgac aatgcgatac gctgcacccg tctggcacgg 4260 agccttgact aaccgagagt gccgcagtct gctaaagcgc gtgcagcgaa aggcagcgat 4320 ccgcgtggct cgaacgttcc ggacggtaag gtatgagacc gccgtgctgc ccgcgggact 4380 ggtgccaatc tgcagagccg tagcggagga cacccgagtt cacagcagac gcgggaccgg 4440 tgtaagtagc agcgagctcc ggaaagagga gcgacagcgg actattgaag agtggcagac 4500 gacttgggac gcagacgccg cagcagacaa cgccagcaga tatgtcaggt gggcacacct 4560 cgtaattccg gacgtgggag cctggcagtt gcggaatcac ggagaggtga cgtttcattt 4620 gtctcaggtg ttgtcaggac acggattttt acgcgaatac ctgaacaaaa tgagattcac 4680 ctcgtctccg gcctgccctc gttgccctgg tgtagtcgag ggagtagaac atgtaatgtt 4740 cgaatgccct cgctttgctg aggtgaggag tgagctattg gatggagttt tgccagaaac 4800 gttggaggcg cacatgcttc aatcacccac caactggagc aacgtgtgcg aggccgccaa 4860 gcgcataacc tcaaaactcc aacgctgctg ggacgacgaa tgcgccattc tcgccgcaca 4920 ggccatgctg gaggaacccg ccaatcggct cgatccagaa gcagtccggt gtacccggaa 4980 tgaccttcga aatgtagcga gaaggacgca gacggtgcgt caaagggagg agcagtgtgg 5040 cgaacggccg tctatgcctt catcgtcacc acgaacgtcg gagcgccggg cgaatatccg 5100 cgctcggatg gcaagacttc ggcagcgaca tcgccagcat 
caacaggatg aacgacgagg 5160 agttgaagga ggtgatatag agagagggga gtctgtctat cctgagctgg ccagctctcc 5220 caacaaccgg caaggcggtt tgacctcggc agagaaggcg gcggcggtgg aggcggacgt 5280 ggcctcccgc tagtcagcca aaaaatcaca cgaaagcacg cgaaggaatg ccaaaattgg 5340 tgcgaaattg ccccacacaa gggcagagag tgcaaattac cagcgagacc aaggaggata 5400 aaggatgcgc tcctaagagc tataacatcc cctcccccta gacccctcgc ggggcacagg 5460 ggaaggggca ggaagagggt taggtaattt ttaaatatta taaatttact gaaataaact 5520 aacccgattg ttaaaaaaaa 5540 // ID QUETZAL repbase; DNA; ANG; 1680 BP. XX AC L76231; XX DT 21-AUG-1997 (Rel. 2.07, Created) DT 25-SEP-2007 (Rel. 12.1, Last updated, Version 3) XX DE Quetzal, a DNA transposon of the Tc1 superfamily. XX KW Mariner/Tc1; DNA transposon; Transposable Element; TIR; KW Tc1 superfamily; Quetzal. XX NM QUETZAL. XX OS Anopheles albimanus OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-1680 RA Ke Z., Grossman G.L., Cornel A.J. and Collins F.H.; RT "Quetzal: a transposon of the Tc1 family in the mosquito RT Anopheles albimanus."; RL Genetica 98(2), 141-147 (1996). XX DR GenBank; L76231; Positions 1 1680. XX CC TA target-site duplication. The closest known element to the CC Quetzal is the Uhu transposon from D.heteroneura. It has 236 bp CC TIRs (position 1-236 and 1445-1680). The transposase is encoded CC by the sequence 373-1398. 
XX FH Key Location/Qualifiers FT CDS 373..1395 FT /product="QUETZAL_1p" FT /translation="MTREELSVSKRQDIIRLHGAQGKSYTEIAMLTNINRN FT TVARVIQRYKYEGRVSNLPRKGRPSVCTDRMRRAIKRLVDAEPEISAQSVA FT IVLNERHGIAISCETVRRYIHKFGYKAYNRRKKPQISPINRKRRLEFAKKY FT VNHPPEFWKKVLFTDESKFNIFGWDGTIKVWRPPGEGLNPKYTAKTVKHNG FT GGVLVWGCMAANGVGNLQVIDGIMDQYVYINILKQNLGPSLEKLGMSQDYW FT FQQDNDPKHTAFNSRLFLLYNTPHQLKSPPQSPDLNPIEHAWELLERKIRQ FT TRIKNRVDLENKLKEAWITISEDYTQNLVNSMPRRLAEVIKMKGYATRY" XX SQ Sequence 1680 BP; 561 A; 320 C; 355 G; 444 T; 0 other; cacttctcca caaaagtgaa tacacagcaa acagttttag gaataaatgc ctctagttgt 60 gcatagaccg aattaaaaat atgggaaaat tatcaattat gctttgacct atttgcaatc 120 gattgacgct tggtacgatg tccgtccgat gttgaagttc cgagaaattc tcggaaaact 180 gctgggatgc ctcaaaatcg ttccacaaaa gtaagtacac agcacaggtg ttcgttttgt 240 tttgaatcgt agataaactt ttaattgatg cgttaatgat cgattgggtg caatatgctc 300 agtcgttcag atagtttcga ggtgaacagt ttttagcttc caaatcgact gcagtttacc 360 ataattccaa aaatgaccag agaagaactt tctgtctcta aaagacaaga tattataaga 420 ttgcacggcg ctcagggcaa aagctacaca gaaattgcaa tgttaacaaa cattaataga 480 aatactgtcg ctagggtcat ccagcggtac aaatacgagg gccgtgtatc taatttacct 540 agaaagggtc ggccctcggt gtgcactgat cgtatgcgac gggcgataaa acgattggtg 600 gatgctgaac cagaaatcag tgctcaatct gtagctatag tacttaacga aaggcacggt 660 attgccattt catgtgagac agtgcggcgg tacattcata aatttggcta caaggcttac 720 aacaggcgca aaaaacctca gatcagccct atcaatcgga aacggcgatt agaatttgcg 780 aaaaaatacg ttaaccaccc acccgagttt tggaaaaaag ttttatttac agacgagagt 840 aaatttaaca ttttcgggtg ggatggcaca ataaaggttt ggcggccacc cggagaaggc 900 ctgaacccta aatacacagc caagacggta aaacataacg gagggggtgt gctagtttgg 960 gggtgtatgg cggcaaatgg tgttggaaat ttgcaagtta tagatggaat tatggaccaa 1020 tatgtttata tcaacatttt aaagcaaaat ttaggaccaa gtttggaaaa attagggatg 1080 tctcaagatt attggttcca acaagacaat gatccaaaac acacggcatt caattcacgg 1140 ctatttttgt tgtacaacac tccccaccag ctaaaatcac cgccccaaag tcccgacttg 1200 aacccaatag aacatgcttg ggaattactt gaacgaaaaa ttcgtcaaac acgaattaaa 1260 
aaccgtgtcg atctagaaaa caaattaaaa gaagcgtgga tcacaatttc tgaagattat 1320 acgcaaaatt tggtaaattc aatgccacga aggttggcag aagttataaa aatgaaaggg 1380 tatgctaccc gatattgaaa acgttacaaa atgtcgaagg acacgaaata aaaacgaaat 1440 acccaacgaa cacctgcgct gtgtacttac ttttgtggaa cgattttgag gcatctcagc 1500 agttttccaa gaatttctcg gaacttcaac atcggacggc atcgtaccaa gcgtcaatcg 1560 attgcaaata ggtcaaagca taattgataa ttttcccata tttttaattc ggttctatgc 1620 acaactagag gcatttattc ctaaaactgt ttgctgtgta ttcacttttg tggagaagtg 1680 // ID CR1-1_AG repbase; DNA; ANG; 5401 BP. XX AC . XX DT 13-DEC-2002 (Rel. 7.11, Created) DT 21-JUL-2009 (Rel. 14.08, Last updated, Version 3) XX DE CR1-1_AG is a CR1-like non-LTR retrotransposon - a consensus DE sequence. XX KW L2B; Non-LTR Retrotransposon; Transposable Element; KW reverse transcriptase; endonuclease; CR1 clade; DNA/RNA-binding; KW CR1-1_AG. XX NM CR1-1_AG. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-5401 RA Kapitonov V.V. and Jurka J.; RT "CR1-1_AG, a family of CR1-like non-LTR retrotransposons from RT African malaria mosquito."; RL Repbase Reports 2(11), 1-1 (2002). XX DR [1] (Consensus) XX CC CR1-1_AG is a family of CR1-like non-LTR retrotransposons. The CC CR1-1_AG consensus sequence was reconstructed based on multiple CC alignment of ~20 copies identified in the sequenced portion of CC the genome. Given the ~2% divergence of these copies from the CC consensus sequence, transposition of CR1-1_AG occurred less than CC 1 million years ago. Integrations of CR1-1_AG have not produced CC target site duplications. The consensus sequence encodes two CC proteins: a 440-aa CR1-1_AG-ORF1p (positions 425-1745) and a 941-aa CC CR1-1_AG-ORF2p (positions 1746-4568). CR1-1_AG-ORF1p is a DNA/RNA CC binding protein composed of the PHD domain (aa positions 3-57) CC and gag-like zinc knuckle regions (aa positions 334-442).
CC CR1-1_AG-ORF2p is composed of the AP endonuclease and reverse CC transcriptase domains. The 3' terminus is composed of the CAT CC microsatellite. XX FH Key Location/Qualifiers FT CDS 425..1768 FT /product="CR1-1_AG-ORF1p" FT /translation="MECLKCSAVVGTSDDPIICSGSCGFIFHRRCITPTLN FT KPAVKLINENRNVVYMCDICLDQSAGLVHMDTDATKSNDLLAQTLRDLEAN FT VSVWISSALERGIETLKTELCAQVERKLETTLRETLSAIEASKMSKAALRA FT TSDTPQTSKTVQDVNLETWATVTKKRKRTNSGDSNVQTIINRFDEGNNKVT FT PKIKKINDVKEPMGKNKENNKTLVIVPKVVQSCDRPRADLSARLDPRKQQL FT SEFRNGRDGQVYAQCPALANLDSIRKEVEDILGDDYSTSLPMARVKIIGMS FT EKYSSSDLVDLLKSQNEGIPWKQENVIGMFESKIYKYQIHNVVLEIDHETD FT KCLAKLDKINIGFDRCKISRSIHVMRCFKCGQFSHKSTDCQNKEACSKCSG FT EHRTSDCTSSILKCVNCVLANTSRNLKLQVQHAANSYECPLFKKQVERRMQ FT LSQ" FT CDS 1746..4568 FT /product="CR1-1_AG-ORF2p" FT /translation="RDECNFLNSRGGVELGRFREILYFNVAGLSSNYAMFR FT ETVEKVQPLLVLISETHVTEKEAFEQFYLKGYRVVSCLSHSRHTGGVAAYA FT RSDVVLKVILNESLEGNWFLGVAVSRGMTVGNYSILYHSPSASDSRFVDIL FT EEWLDRFLDLSKLNIIVGDFNIDWLNVEKSAKLKSLMDSVNMNQKVNEFTR FT IARQSRTLIDQVYSSIDSIKVTTDPLLKISDHETLVLNINDERCKTIQRKV FT KCWNRYSKHALCNNVSQGLQCGASDFDEAADLLWNTLKHAMSTLVEEKTIV FT SRETSRWYTLDLARAKRKRDKVYKKFIRTNRDNDWSEYTKLRNSYSRDLKN FT RRSDFFSNEINKHKKNSKELWKVLKSMLQPDESCVSVVKFNGVIEADDSII FT CNKFNSFFVNSVLDINQNIASVSEPSYYVDSATPRCHFRFQKITLEQLKTI FT CFNLTKTAGIGNVSSTTIQDCYHVIGEDLLMVINQSLERGCFPKSWKESLI FT IPIPKVNGAANAEDFRPINMLHVLEKVLETVVKEQLVQFLNRNELLIREQS FT GYRQGHSCETALNLVLARWRVLMDRRESIVAVFLDLKRAFETISRPLLLST FT LRRFGIVGRELSWFESYLKERTQRTLFGSSVSEPIENTLGVPQGSVLGPIL FT FIMYINDMKQVLKACEINLFADDTVLFISHKEIKQAESLMNIDLNALDGWL FT KYKKLALNINKTCYMVMSAGVLEEPPSIVINSELIERVRQAKYLGVILDDR FT LKFHAHIDWVIAKVAKKCGVISRLAKDLDFFGKVHLYKSLISPHFDFCSSI FT LFLGNKGQIKRLQRLQNRIMRLILGCGRRTPSAVMLNILQWMSVEQRIVYQ FT TMTFIYKMLKGLLPGYLGESIVRGSDIHRHHTRRANEPRVPNLHSQSARNS FT LFFKGIQRYNSLPDEIKNARNLPDFKRKCVIYVEQTV" XX SQ Sequence 5401 BP; 1641 A; 926 C; 1288 G; 1546 T; 0 other; tgatgagtga ctgttgacaa gtgcagttcg ttgacaagtg tgactgttta gctctgtgtt 60 gtgacgtgaa ataagttgtg taaaatagag ctaagtgctc aaatttttag
atacgtgtgt 120 agataaatgt tcgtggatgc tctcggtgtc agtgtgtttt gcaaaggtaa aagtgacctt 180 gttaaaaaaa caaaagattg gtgcggtcaa cagcgttccg ccgttggtgg cagtaaccga 240 tccttgttgt gtagataagt gtttgtaccg tttttttatc catgtatgtg tgacaccggg 300 agatacgtag agtagcatat tagtatagta gtgtttgtgt cgcttttggc cggcgtttga 360 aaccaggtaa gcgaaaaaaa acacactttt ataagacttc gcccgtattt gttttttaca 420 cggcatggag tgtttaaaat gctccgccgt ggtgggaacc agcgatgacc cgataatttg 480 ttcagggagt tgtgggttta tttttcaccg tcggtgtatt acacccacac tcaacaagcc 540 tgcggtcaaa ctaattaatg agaaccgcaa tgtcgtatat atgtgtgaca tttgtttaga 600 tcaaagcgcg ggcttggttc atatggatac tgatgcaact aaatcaaatg atttgcttgc 660 acaaacactg agggatttgg aagccaatgt gagcgtgtgg atttctagcg ctttagagag 720 aggaatcgag actctcaaaa ctgagctttg cgcgcaagtg gagcgtaagt tggaaacaac 780 tttgcgcgaa acattaagcg ctatagaagc ctcgaaaatg tcgaaggcgg ccttgcgtgc 840 aacttctgac actccgcaaa ccagcaaaac agtgcaggat gtaaatttag aaacatgggc 900 tacagtaacg aaaaaaagaa aaaggacaaa tagtggagac agcaatgttc aaactattat 960 taatagattt gacgagggaa acaataaagt tactcccaaa attaagaaaa ttaacgatgt 1020 gaaggagcct atgggaaaaa ataaagaaaa taataaaaca ctggttattg ttcctaaggt 1080 ggtgcagtct tgcgatagac caagagctga ccttagcgcc agattggatc cgaggaagca 1140 gcaattgtcg gaattccgca acggcagaga cggacaagta tatgcacaat gtcctgctct 1200 ggcgaattta gatagcatta gaaaagaagt agaagacatt ttaggagacg attattcgac 1260 atccttacct atggcacgcg ttaaaataat tggaatgagt gaaaaatatt cttcttctga 1320 cttagtagat cttttgaaat ctcaaaatga gggaattccc tggaaacagg agaatgtaat 1380 tggaatgttt gagagtaaga tctacaagta ccagatacat aatgtggttt tggaaatcga 1440 ccatgaaact gataagtgtc tggcaaaact tgataaaatc aatattggat ttgatcggtg 1500 taaaatttct agatccattc acgttatgcg ctgctttaaa tgtggtcaat ttagccataa 1560 aagcactgac tgccaaaata aggaagcgtg ttcaaagtgc agtggcgagc accgaacgtc 1620 ggattgcacc tcgtccatcc tcaaatgtgt aaattgtgtt ttggctaaca catccaggaa 1680 cctgaaacta caggtacaac atgcggccaa tagctatgaa tgcccgctgt ttaaaaaaca 1740 ggtagagaga cgaatgcaac tttctcaata gcaggggagg ggtggaatta gggcggttca 1800 gagagatttt 
atatttcaat gttgccggtc tttcatctaa ctatgctatg tttcgtgaga 1860 cagtagaaaa agttcaaccc ttgttggtct tgatctctga aacccacgta accgagaagg 1920 aggcattcga gcaattttat ttaaaaggat atagggtagt gtcgtgttta tctcattcac 1980 gtcacacagg aggtgttgca gcttatgcca gaagtgacgt tgtccttaaa gtgattttaa 2040 acgagtcatt ggaaggcaat tggtttctcg gtgtagcggt ttctcggggt atgacggtag 2100 gcaattatag catattgtat cactcaccta gtgcgagtga ttcgaggttc gtagatattt 2160 tggaagaatg gttagacagg tttttggatc ttagtaagtt gaacattatc gtcggtgact 2220 ttaatattga ctggttaaat gttgaaaaat ctgcgaaact gaaaagttta atggattcag 2280 taaacatgaa ccaaaaagtc aatgaattca cacgaattgc taggcagagc aggacattga 2340 ttgatcaggt ttacagtagt attgactcaa tcaaagtcac tactgatccg ttattgaaaa 2400 tatcggatca cgaaacactt gttttgaaca taaacgatga acgttgtaaa acgattcaac 2460 ggaaagttaa atgctggaat aggtattcga aacatgctct ttgcaataat gtgtcacaag 2520 gcttgcagtg tggtgcatct gattttgatg aggctgctga cttgttatgg aacacattga 2580 aacatgcaat gagcaccttg gtggaagaaa aaacaattgt ttctagagaa actagtaggt 2640 ggtatacttt ggatctcgca cgtgctaaac ggaaaagaga caaagtgtat aaaaaattta 2700 ttagaacgaa tagagataat gattggtctg agtatactaa acttagaaac agttatagta 2760 gggatctcaa aaatagacga agcgatttct ttagcaatga aataaacaag cacaagaaaa 2820 atagcaaaga gttatggaaa gtcctcaaaa gcatgttaca acctgatgaa tcatgcgttt 2880 cagttgtaaa atttaacggt gtgattgagg ctgacgactc catcatttgc aacaagttta 2940 actcgttctt tgtgaacagt gttttagata ttaatcaaaa cattgcttct gtcagtgaac 3000 ctagctatta cgtagatagt gctactccac gatgccattt cagatttcag aaaattactc 3060 ttgaacaact aaaaaccatt tgtttcaacc tgacaaaaac ggcaggtata gggaatgtaa 3120 gttcaacaac catacaggat tgctatcatg tgatcggaga ggaccttctt atggtgatta 3180 atcaatcact agagagggga tgttttccga aatcatggaa agaatcattg attataccta 3240 ttcctaaagt gaacggagct gccaatgcgg aagattttcg ccccataaac atgttgcatg 3300 tgctcgaaaa ggtgctggag acagtagtta aggagcaatt ggttcagttt ctgaacagaa 3360 acgagctgtt gatccgagag caatcaggat atcggcaagg acactcttgt gagactgctt 3420 tgaatcttgt actggcgagg tggagggtgt tgatggatcg gagggaatcg atagttgctg 3480 ttttcttgga tctaaaacgg 
gcatttgaaa caatatctag gccattgttg ctttctacct 3540 taaggcgttt tggtattgtg gggagggagc tcagttggtt cgaaagttat ttaaaagaaa 3600 gaactcagag aactttattt ggtagctctg tatcagagcc tatagaaaac acccttggtg 3660 ttccgcaagg tagtgttctt ggaccaattt tgtttatcat gtacatcaat gacatgaaac 3720 aggttttgaa ggcttgtgag atcaatcttt ttgccgacga tactgttttg ttcatctcgc 3780 acaaagaaat caagcaagca gagtctctga tgaatatcga tttaaacgct ctggatggat 3840 ggctgaagta caaaaagctg gcattaaaca ttaacaagac ttgttacatg gtgatgtctg 3900 cgggtgtatt ggaagaacct ccatctatcg taataaattc ggaactaatc gaaagagtta 3960 gacaggctaa atacctggga gttatcctag acgacaggtt gaagttccac gctcacattg 4020 actgggtcat cgctaaagtg gcaaagaagt gtggagtgat aagtagattg gcgaaggatc 4080 tcgatttttt tgggaaagtt catctctaca aatcattgat ctcgccacac tttgacttct 4140 gctcatccat tttgtttctt ggcaacaaag gtcaaattaa aagacttcaa aggttgcaaa 4200 accggattat gaggttaatt ctggggtgcg gtcgacgtac gccgtccgcg gttatgctga 4260 atattcttca atggatgtca gtagagcagc ggattgtgta ccagaccatg acttttatat 4320 ataaaatgtt aaagggcctg ttgcctgggt acctggggga gagcatagtt cgggggtccg 4380 atatccatcg gcaccacaca cgcagggcaa atgagccgag ggtacctaac ttgcattccc 4440 aaagtgccag aaactctttg tttttcaaag ggattcaacg gtacaacagt ctaccagatg 4500 aaattaagaa tgcgagaaac ttgccggatt tcaaacgtaa gtgcgtcata tatgttgaac 4560 aaactgtata atgtgaaata tgtgtagatg tcccatgtca ttatgtaatg tgtaactgca 4620 gttgtcatca cgatctttat gatgatgata agatttttct ttatatacta taattaaaat 4680 tagaaaaaat ataagaaaga gtcaacatag gtttgagaca cgcgcgcgta caagtggata 4740 ttcgggatct atttggggga atttacggtt tgcacacggt ctgggaggtc acaacaggat 4800 tcactcattg atgattgtaa gtggccaatt ccagatacat taggttcacc tgcttcggaa 4860 ggtagtgccc tagagccaat cattggcact agtggaactg gccatatgca tgttgcagtt 4920 gtgttctgat aagtagtgcg ccggatccat tagttggttc ttggcgagct acgtaaggga 4980 tacattagat tctcctgctt cggaaggtag tgccctggag cctgccattg gtgccagaga 5040 ccctggggcc agcggttggc actagtggaa ctggccatat gcatgtcgca gttgtgtcct 5100 gataagtggt gcgccagatc catttattgg ttcttggcgg gctacgtaaa ggatgctagg 5160 gaataccatt gtattcgatg gagtatgacc 
cgttttttct tgatgaaact atgttggtca 5220 tcttgggtgt gtgtatgact cggacagttc ctctgagagt tttccgatgc caccacttgc 5280 tcgtacaaaa tattttaata tacctatcag agtaatatta tcgtaaagat acttccgtcc 5340 ttctcaaacc tatgttgggg aaagaggtgg gacttatcat catcatcatc atcatcatca 5400 t 5401 // ID GYPSY28-LTR_AG repbase; DNA; ANG; 219 BP. XX AC . XX DT 05-FEB-2004 (Rel. 9.01, Created) DT 05-FEB-2004 (Rel. 9.01, Last updated, Version 1) XX DE GYPSY28-LTR_AG is an LTR of retrotransposon GYPSY28_AG - a DE consensus sequence. XX KW Gypsy; LTR Retrotransposon; Transposable Element; 5-bp TSD; KW GYPSY28-I_AG; GYPSY28-LTR_AG; GYPSY28_AG; Gypsy clade; KW mag lineage. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-219 RA Tubio M.J., Costas C.J. and Naveira F.H.; RT "GYPSY28_AG, a member of the mag lineage of the Ty3/gypsy group RT of LTR retrotransposons in Anopheles gambiae."; RL Repbase Reports 4(1), 22-22 (2004). XX DR [1] (Consensus) XX CC GYPSY28-LTR_AG is a long terminal repeat of GYPSY28_AG (its CC internal CC portion is deposited as GYPSY28-I_AG). XX SQ Sequence 219 BP; 81 A; 34 C; 59 G; 45 T; 0 other; tgttgtgtac aacggggatg ggaatgtgac acttcacggc caaggtatat gtttacctta 60 cagtgtggga accttgggaa gaaaaggggg ttccggaaga agaactgtca gagggagaac 120 aaaaagggaa gcctacgagc taggaagcaa ttacgaggca gaataaataa tacagtgaac 180 tacgataaat aactgttcct taattacaaa gatacaaca 219 // ID MARINERN8_AG repbase; DNA; ANG; 1068 BP. XX AC . XX DT 12-MAR-2003 (Rel. 8.02, Created) DT 12-MAR-2003 (Rel. 8.02, Last updated, Version 1) XX DE MARINERN8_AG is a nonautonomous DNA transposon - a consensus DE sequence. XX KW Mariner/Tc1; DNA transposon; Transposable Element; Nonautonomous; KW MARINERN8_AG; nonautonomous DNA transposon; KW mariner/Tc1 superfamily. 
XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-1068 RA Kapitonov V.V. and Jurka J.; RT "MARINERN8_AG: a family of nonautonomous mariner/Tc1-like DNA RT transposons from African malaria mosquito."; RL Repbase Reports 3(2), 25-25 (2003). XX DR [1] (Consensus) XX CC There are ~50 copies of MARINERN8_AG in the genome (multiple CC subfamilies), they are ~99% identical to the consensus sequence. CC MARINERN8_AG copies are flanked by 2-bp target site duplications. CC This element has 35-bp terminal inverted repeats. CC Classification: a nonautonomous Mariner/Tc1-like DNA transposon. XX SQ Sequence 1068 BP; 299 A; 241 C; 203 G; 324 T; 1 other; cccaaccaac caaaccgtgc tctttcgaag taaatttgct tcgatatagc ttcgatagaa 60 gttctctacc gaaggtgtat caactttgat aggcagttcc taccgaagtt aggctcgggt 120 gtttagccga aatgctacaa aagagaaaga aaaaaatctc tcgtatgctg ccgcgttctc 180 gctcacacgc gcgccctgcc cccttttatt tacgtacgtg tttatagccc ctcctcctcc 240 tcctcctcct cstcctcctc tcatctatct tcctacctta tgtgcagtgg ttatgtgttt 300 ttaattgaaa ttggaaggag agatattcga tttatttatt acgtgcgatt taatcagaga 360 attgaaacat ggtagcaggt ggatctactg ttcatcacaa gcgggcctaa cactttaact 420 atttgtgaat tcgcgcgtga ggacgtaaaa ggcgcaataa taatggagcg gttctaacgt 480 aaacagacaa aatcgctcag ctgaaaattt gatagagcgg aatagcattt cattgtaata 540 gcattgctag tatatcatcc gcgcacccta tgaaccgtac ttttccaaat atcttaaaaa 600 gtacgcattg tatgaaaaaa ccgtctttac aagatcgatt cagaatttca aaacacattt 660 ggaaaaagtt ttttgctaaa agtgactttt gaaattttac tcagattaat gtttattcct 720 ttaaattaat tcataaatgg ggtaagggaa ataatttata ctaatactaa tagaccacga 780 ctccattgcc tgaaatattt agtgcctcaa agtagtatat cctctttgcc aagcgtttga 840 gttaaactga ttcttcgcaa aatgtgaaga gagcgttcgg ccccccttca gatgcttcga 900 cagagtgaag ctcctcccct tcctccattc gtcatccatc aatttttctc cgcgcactca 960 cacatctaca cgcgcagctg ttcgcatggc aacacgtgag atgcgtgaca tgctttctcg 1020 
ctcatcacga aggatttgct tcgaaagagc tcggtttggt tggttggg 1068 // ID GYPSY23-LTR_AG repbase; DNA; ANG; 162 BP. XX AC . XX DT 05-FEB-2004 (Rel. 9.01, Created) DT 05-FEB-2004 (Rel. 9.01, Last updated, Version 1) XX DE GYPSY23-LTR_AG is an LTR of retrotransposon GYPSY23_AG - a DE consensus sequence. XX KW Gypsy; LTR Retrotransposon; Transposable Element; 5-bp TSD; KW GYPSY23-I_AG; GYPSY23-LTR_AG; GYPSY23_AG; Gypsy clade; KW mag lineage. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-162 RA Tubio M.J., Costas C.J. and Naveira F.H.; RT "GYPSY23_AG, a member of the mag lineage of the Ty3/gypsy group RT of LTR retrotransposons in Anopheles gambiae."; RL Repbase Reports 4(1), 12-12 (2004). XX DR [1] (Consensus) XX CC GYPSY23-LTR_AG is a long terminal repeat of GYPSY23_AG (its CC internal CC portion is deposited as GYPSY23-I_AG). XX SQ Sequence 162 BP; 59 A; 31 C; 37 G; 35 T; 0 other; tgttatatac aacattgaca gctatggcat gttggcagct ctgcgactgc tcgaacgagc 60 agcactgctc gaaacaggaa gggagcgaat gaaatgtcat gcatacaagg aactgaataa 120 agaaaacgcg tgtaacattt gtactgcatc aaaaatacaa ta 162 // ID GYPSY62-LTR_AG repbase; DNA; ANG; 178 BP. XX AC . XX DT 09-JUL-2004 (Rel. 9.06, Created) DT 09-JUL-2004 (Rel. 9.06, Last updated, Version 1) XX DE GYPSY62-LTR_AG is an LTR of retrotransposon GYPSY62_AG - a DE consensus sequence. XX KW Gypsy; LTR Retrotransposon; Transposable Element; KW 5-bp TSD; GYPSY62_AG; GYPSY62-I_AG; GYPSY62-LTR_AG; Gypsy clade; KW mag lineage. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-178 RA Tubio M.J., Costas C.J.
and Naveira F.H.; RT "GYPSY62_AG, a member of the mag lineage of the Ty3/gypsy group RT of LTR retrotransposons in Anopheles gambiae."; RL Repbase Reports 4(6), 164-164 (2004). XX DR [1] (Consensus) XX CC GYPSY62-LTR is a long terminal repeat of GYPSY62_AG (its CC internal portion is deposited as GYPSY62-I_AG). XX SQ Sequence 178 BP; 60 A; 37 C; 35 G; 46 T; 0 other; tgttggatcc cagttaccac agggataatg gaacacatag caacacttag aaagcatagc 60 aaccatgcgg tataaaagga gccgatagct catcaccgct ttactctgaa gttgattttc 120 aaatacaaaa ctttaacttg aacctgtgtg taatagtccg agtagttcgg acatatca 178 // ID AGM1 repbase; DNA; ANG; 5983 BP. XX AC AF060859; XX DT 27-JUL-1999 (Rel. 4.06, Created) DT 25-APR-2010 (Rel. 15.05, Last updated, Version 3) XX DE Anopheles gambiae Moose LTR retrotransposon, complete sequence. XX KW BEL; LTR Retrotransposon; Transposable Element; AGM1. XX NM AGM1. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-5983 RA Biessmann H., Walter F.M., Chuan S., Le D. and Yao G.J.; RT "AGM1."; RL Direct Submission to Genbank (29-JUL-1998)Developmental Biology RL Center, University of California, Irvine, CA 92697, USA. XX RN [2] RP 1-5983 RA Biessmann H., Walter M.F., Le D., Chuan S. and Yao J.G.; RT "Moose, a new family of LTR-retrotransposons in the mosquito RT Anopheles gambiae."; RL Insect Mol Biol 8(2), 201-212 (1999). XX DR GenBank; AF060859; Positions 1 5983. XX FH Key Location/Qualifiers FT CDS 836..1960 FT /product="AGM1_1p" FT /note="ORF1." 
FT /translation="MKLPVIQLPEFGGDFNDWLPFHDTFVSLIDKSDELSG FT VQKLHYLKAALKGEAARLMSQFSLQMRITKCMANVGRPLWHKHLLKKRHIQ FT AILRLPKIINSNLDLLRRTVDDFQRHTLVLEQLGEPIKHLSSFLVELLSEK FT LDSASLAAREEAQADKSYTYSDMVEFLRKRVRLLETLANDTGETSKRQPRV FT KVSVNTAAAAEKKVDMCVVCGKQGHTIVNCRRFNEFDAKKRHEVVRQHKLC FT WNCLQGSHFVTSCTSRYGCQTCGKRHHTLLHAERSSSVIADDSVGSVSTMV FT LANIPMQCNSTDRSSYSNVMLTTVVLFVVDANGTQHPVRALLDNGAQPNAI FT SERLSQLLCLRVCVPMYPLQVWMERRLRRHVK" FT CDS 1960..5607 FT /product="AGM1_2p" FT /note="ORF2." FT /translation="MKVEIRSRFTQFALKLNFLVLSKVTANTPATSFSTSC FT WKLPAGLALADPEFHQSGRVDMLIGASHFYTFLREGRLKLSEHGPLLVETV FT FGWVVTGEVLREEAIIQQQAAQCHVMLSSENISDQLERFWKIEELHVSHFS FT ADEQRCEAYYEQTVSRDETGRYIVKLPKHQQHSTMIGKSETTSLKRFAGLE FT RKLFANSQLRQQYNEFMLEYIQLGHMVPVSPDNLDAATCCYLPHHPVFKET FT SSTTKMRVVFDGSAPTSTGHSLNDALLVGPVIQDDLLSLIIRFRKFQVALV FT PDLEKMYRQVLVHPEDRPLQRYGGAYELRTVTYGLAPSSFLATRTLQQLAE FT DEGDAFPTAKDTLKKQLYMDDLIAGSNSVDGAIQLREELSALAQRGGFTFR FT KWCSNSLAVLSDVPAEQLATKSSLRFDDKETISTLGICWEPEIDTFQFNIS FT ITTKSERDTMRTILSMIAELYDPLGLISPVIITAKVLMQSLWRLKLSWDDT FT VPEELQRNWIRFRAELPELKDFSIPRFAFAHQYRQAEIHCFTDASELAYGA FT CIYIRSEAEDGSIHVNLLASKSRVAPLKALTIPRLELCGALLGARLHEKVM FT AAMEIKFVAHRFWTDSTVVLDWLNAESKTWKTFVANRVAEIQAIRDAVWQH FT VSGQENPADLISRGVLPHQLINNQLWKQGPQWLSERKENWPQQKERTGQIT FT TDEIRSNVVLTTQIQEKNEIFTRYGSYQKLIDVVAYCFRFVHNARRLQSRI FT SNSALTVKELADAKKRLVKLVQAEEFTNDLYKIHKGIPVARNSTLKLLNPF FT IDNEGIIRVGGRLRNSDLNYNIKHQIVLPGFHPFTQLLIMDKHVKAMHGGI FT SSTLNAVRDEIWPINGKRAVRKVIRNCFRCCRANPQPIIQPEGQLPAERVT FT VNEVFSCTGLDYCGPLYLRPTHRKAAPNKCYICVFVCMSTKAVHLELVGDL FT STNSFLMALDRFVYRRGKPKHIYSDNGTNFIGAKNELHQIYKMLFNDSADS FT KIAKHLAKEEIQWHLIPPRAPNFGGLWEAAVKVAKTHLIRQLGSSRLSSEE FT MTTVLVKIEGCMNSRPLVPLSEDPNDLTALTPAHFHITNNLKVILEPDLKE FT VPMNRLGRYQLLHGYTQNFWIHWKQDYLKNLTVLHRSAKQSKQLSVGDIVI FT LKDEQLPAVQWPLARVVEIHPGADGISRVATLRTASGIVKRAVSKICPLQC FT SNQRMD" XX SQ Sequence 5983 BP; 1808 A; 1152 C; 1427 G; 1596 T; 0 other; tcgttgctaa ccaaaaccat ccaacgaaaa acacattcaa attgttcgat ctcaaatgca 60 tgcggttcta ccacagtgcc ggattatggc agctgacgtt ttgtcattct 
tgtgggtcta 120 gggacgtacc gctgtgaaag gggaaccgcc tttttgtggc aattgtcacg agccgaagaa 180 aagcgaagca cgcataaatg gtgtgtgaat aaagaagtga acttacttgc caacaaagaa 240 caaccgactg attatttcgt gtaaaaaaaa agtaaaaata tcacaacgta gggactttca 300 cggaacattt tggtgcaagt gaccaggata agtgcctgat tatcggtgtc aattgtggta 360 atcataatta gtgcaaaagt tctcgcttgt tctttgctgt gtgcgtgtga gtgtgtgctt 420 ggagctggct aggattttgt gtgcgtgtgc gcgtgtgcaa cgtattaaac agtgtgttta 480 acgcaagtgt aacaatggcg acggcgcgag aagacaaaat tcgtggcaaa gagcaaaaaa 540 gaaaaaacat tatagattcg atgcggcgaa tcgatttatt cgtacaaacg tacactgtgg 600 acaagattca tgaggtgacc acaaggctcg agaggctcga gaaagtgtgg tatggttttg 660 aagaagttca agaagagtta gacaagctaa cccttgaggg agacaacgct gaaaatgaaa 720 agacgcgtgc agaaatggag gagctataca tgagtgttag atccaatcgc tacggctgaa 780 gccatcgtct aatccaatag tcagcgacgt taaaccggta cccattgtgt cccaaatgaa 840 attaccagta atacaattgc ctgagtttgg aggtgatttt aatgattggt taccatttca 900 tgacacgttt gtgtccctga ttgataaatc agatgagctt tctggtgtgc aaaagctaca 960 ttatttgaaa gctgcgctca aaggtgaggc agcacggcta atgagccaat tctcactaca 1020 gatgagaatt acaaaatgca tggcaaatgt tggtagaccg ctatggcata aacatttgct 1080 taagaaacgt catatccaag ccattttgag gctaccgaag attatcaata gcaatctaga 1140 tttattgcga cgcactgtag atgatttcca gcgacacaca ttggtgttag aacaactcgg 1200 tgaaccaata aaacatctta gctcattcct tgttgagctt cttagtgaga aattggatag 1260 tgcttctctt gcggcgaggg aagaagcaca agcggataaa agttacacat acagtgacat 1320 ggttgagttt ctgcgtaagc gtgtgcgttt gttggaaacg cttgctaacg atacgggtga 1380 aacgagtaag cgacaaccgc gtgtaaaagt gagtgtgaac actgctgcag cggccgagaa 1440 aaaagtggat atgtgtgttg tgtgtggaaa gcaggggcat acaatagtga attgtcggcg 1500 tttcaatgaa ttcgatgcaa agaagcgcca cgaagttgtg aggcagcaca aattgtgttg 1560 gaattgcttg cagggcagcc attttgtaac cagttgtacg tcaaggtatg ggtgtcaaac 1620 gtgtggcaaa cggcatcata ccttgctgca tgctgaacga agtagtagtg taatagcaga 1680 tgattcggtg ggttctgtta gtacgatggt gcttgctaat attccaatgc agtgcaactc 1740 tactgaccga agctcatatt ctaatgttat gctgacaacg gtggtgttgt ttgtagttga 1800 cgcaaatggt 
acgcaacatc cagtacgtgc gttgttggat aacggtgcgc agcccaatgc 1860 gataagtgag cgattgagtc agcttttgtg cttgcgcgta tgcgtaccca tgtatccatt 1920 acaggtgtgg atggaacgac gactcaggcg tcatgtgaaa tgaaggtgga aatccgttcc 1980 agatttacgc aatttgctct gaaactaaat ttcttggttt tgagcaaggt aacagcaaac 2040 actccagcca catctttctc cacatcatgt tggaaattac ctgctgggtt ggcgctcgca 2100 gacccagaat tccatcagtc tggacgagtg gatatgctaa tcggtgcatc gcatttctac 2160 acgtttctga gggaaggccg gctcaaactt agtgaacatg gtccattgtt agttgaaaca 2220 gtgttcggtt gggtggtaac aggtgaagtg ctccgagaag aagctataat tcagcaacaa 2280 gcagctcagt gtcatgttat gttatcgtcg gaaaacatta gcgatcaact tgagcggttt 2340 tggaagatcg aagagctgca tgtttcgcat ttctcggcgg atgagcagag atgcgaagct 2400 tattatgagc aaacggtatc acgagacgaa acaggcagat atatcgtgaa actgcctaaa 2460 catcagcaac attctaccat gattggaaaa tcagaaacga catcacttaa gaggtttgct 2520 ggattagaac gtaagttatt tgctaactca caacttcgtc agcagtacaa tgagtttatg 2580 ttggagtaca tacaactcgg tcatatggtt cctgtgtcgc ctgacaacct ggatgcagca 2640 acctgctgct accttccaca tcaccctgtg tttaaggaga caagttccac gactaaaatg 2700 agagtggtat ttgatgggtc agcaccaact agcacaggac actctctgaa tgatgcgcta 2760 ttggttggac cagtcatcca agatgatctt ttgagcttaa taatccgttt tcgtaaattt 2820 caggtcgcgc tagttccgga cttggaaaaa atgtatcgtc aggtacttgt gcatccggag 2880 gatcgaccgt tacagagata tggtggggcg tatgagttgc ggacagtaac atacggtttg 2940 gctccttcat ctttcttagc tacaagaaca cttcagcagt tggctgaaga tgagggtgac 3000 gcgtttccta ctgccaagga tacgctgaaa aaacaactgt acatggatga ccttattgct 3060 gggtcaaata gtgttgatgg agctatacag ctacgtgagg aattgagtgc actggcgcag 3120 agaggcggtt ttacttttcg aaaatggtgt tcgaattcat tagccgtttt atctgacgtt 3180 cccgctgaac aattggcaac aaaatcatcg ttaaggttcg acgacaagga gacaattagt 3240 acgcttggta tatgttggga accagaaatt gatacgttcc agttcaacat ttctattact 3300 acgaaatcag agagagacac catgcgtacg atactatcaa tgattgctga actgtatgat 3360 ccattgggat tgatttcacc tgtaattatt acagccaaag ttttgatgca atcgctctgg 3420 cgtttgaagt tgagttggga tgatacggtg cctgaggaac ttcaaaggaa ttggattaga 3480 tttcgagcag aattaccaga 
acttaaagat tttagtatcc ctagattcgc tttcgctcat 3540 caatatcggc aagcagaaat acattgtttc acagacgcgt cagagcttgc atatggcgcg 3600 tgcatctata ttcggtctga agctgaggat ggaagcattc atgttaattt gctagcatca 3660 aaatcgagag tggctccact gaaggcattg acgattccta gacttgaact ttgcggtgca 3720 ttattaggag ctcgattgca tgagaaggta atggctgcaa tggagattaa gttcgttgct 3780 catcgatttt ggaccgattc taccgtggtg ctagattggc taaatgctga atcaaagact 3840 tggaaaacat tcgtagcaaa ccgagttgct gagatccaag caattcgaga tgctgtttgg 3900 caacatgtat ctggacaaga aaatccagct gaccttattt cacgcggagt tttaccgcat 3960 cagctcatca acaatcaatt gtggaaacaa ggtccacaat ggttatcaga gagaaaggag 4020 aattggcccc aacaaaaaga gagaacaggt caaattacaa cagacgaaat aagatcgaat 4080 gtcgttttaa caacgcaaat acaagaaaaa aatgaaatat ttacaagata tgggtcatac 4140 cagaagctaa tcgacgtggt tgcatattgt tttcgttttg ttcataatgc tcgtcgtctg 4200 caatcaagaa taagcaatag tgcattgacg gtaaaggaac ttgctgatgc taaaaagcga 4260 cttgtaaaac ttgtgcaagc tgaagaattt accaacgatc tttataaaat ccacaagggg 4320 attcctgttg ctcgaaattc aacgttgaag cttttaaacc cgtttataga taatgaagga 4380 ataatccgcg taggtggccg gctcagaaat tctgatttga attataatat taagcatcaa 4440 atagttcttc ctggattcca tccatttact caactcctca tcatggacaa acatgtaaaa 4500 gcaatgcatg gaggaatatc atcaactctt aacgcagtta gagatgagat ttggccaatc 4560 aatggaaaaa gagctgtacg taaggtcata cgaaattgtt ttcgatgttg cagagcaaat 4620 ccacaaccta taattcaacc tgaagggcaa ctaccagcag aacgtgttac agttaacgag 4680 gtgttcagtt gtacgggtct agattattgt ggacctttgt atttgaggcc aacacaccgc 4740 aaggccgcac caaataagtg ctacatttgt gtatttgtct gcatgagtac gaaagcagta 4800 catttggaat tagtcggaga tttgagcaca aattcgtttc tgatggcact tgatcgcttt 4860 gtttatcggc ggggtaaacc aaagcacatt tattcagaca atggtaccaa ttttattggt 4920 gccaaaaatg aacttcatca gatctacaaa atgctgttca acgattctgc agatagcaaa 4980 atagcaaaac atttggcaaa agaagagata caatggcatt tgataccccc acgcgcccca 5040 aatttcggag gcctttggga agcggcggtg aaggtagcca aaacccattt gattcgtcaa 5100 ctaggatcat cgcgtttatc atcggaggaa atgactactg ttttagtaaa aatagaaggt 5160 tgcatgaact cgcgaccatt agttccgctt 
tctgaagatc ccaatgattt gacggcatta 5220 actccagcgc actttcacat tacaaacaat ttgaaggtta ttctcgaacc tgacttgaaa 5280 gaggtgccta tgaatcgtct gggaagatac caactccttc acgggtacac gcaaaacttc 5340 tggatacact ggaaacagga ttatctgaaa aatcttactg ttttgcatcg gtcagctaag 5400 caatctaagc aattatctgt cggagatata gttattctga aggatgagca gcttccagca 5460 gttcaatggc cattggcacg tgtcgtcgaa atacaccctg gagctgatgg aatttcccga 5520 gttgcaacac ttcgtacggc atccggtatt gtgaagcgag cagtatctaa aatctgtccg 5580 ttacaatgca gtaatcaaag aatggattga aaactgagtg tttcaaggtg gccggtatgt 5640 tcgatctcaa atgacctgcg gttctaccac agtgccggat tatggcagct gacgttttgt 5700 cattcttgtg ggtctaggga gctaccgctg tgaaagggga accgcctttt tgtggcactt 5760 gtcacgagcc gaagaaaagc gacgacgcat aaatggtgtg tgaataaaga agtgaactta 5820 cttgccaaca aagaacaacc gactgattat ttcgtgtaaa aaaaaagtaa aaatatcaca 5880 acgtagggac tttcacggaa cacaaattaa ccagcaatca aatcaggtga gacagaaaac 5940 ctaccatgca ataaatgctt agcacacaca aaccgtagac agg 5983 // ID BEL11-LTR_AG repbase; DNA; ANG; 261 BP. XX AC . XX DT 03-APR-2003 (Rel. 8.03, Created) DT 03-APR-2003 (Rel. 8.03, Last updated, Version 1) XX DE BEL11-LTR_AG is a long terminal repeat of the BEL11_AG LTR DE retrotransposon - a consensus sequence. XX KW BEL; LTR Retrotransposon; Transposable Element; 5-bp TSD; KW BEL11-I_AG; BEL11-LTR_AG; BEL11_AG; Bel clade; KW reverse transcriptase. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-261 RA Kapitonov V.V., Pavlicek A. and Jurka J.; RT "BEL11_AG, a nonautonomous family of Bel/Pao-like LTR RT retrotransposons from African malaria mosquito."; RL Repbase Reports 3(3), 30-30 (2003). XX DR [1] (Consensus) XX CC BEL11-LTR_AG flank an internal portion of BEL11_AG (deposited as CC BEL11-I_AG). 
XX SQ Sequence 261 BP; 80 A; 49 C; 71 G; 61 T; 0 other; tgtttggtac cgcatacggt aggatgtact ggtgaggaga agtgtagagg aacgagaatt 60 ttcggcgtaa gtgtgtgcgt gtttttgcac acgctgcttg agctgggtgc gcggcgctca 120 gcgatcagtc gcgaatcgat actgaataaa gacacacgtg aactgtacga acaatataaa 180 ttcctccgtg taaggcaaat attgatcaaa taaaaatatc acaacgtagg gcaaacttgc 240 taacgatcac gggacagaac a 261 // ID BEL-20_AG-I repbase; DNA; ANG; 5311 BP. XX AC . XX DT 01-SEP-2010 (Rel. 15.09, Created) DT 01-SEP-2010 (Rel. 15.09, Last updated, Version 1) XX DE BEL-20_AG-I. XX KW BEL; LTR Retrotransposon; Transposable Element; BEL-20_AG-I. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-5311 RA Fernandez Medina R.D., Struchiner C.J. and Ribeiro J.M.C.; RT "Transposable elements from Anopheles gambiae."; RL Repbase Reports 10(9), 1431-1431 (2010). XX DR [1] (Consensus) XX CC Family composed of three sequences presenting high degree of CC identity among them (p-dist= 0.0068; SD=0.0010), two of them CC contain 213 nucleotides long identical LTRs. The consensus CC sequence presents an ORF of 1758 aa with conserved domains for CC RT, PeptA17 and RVE. 
XX FH Key Location/Qualifiers FT CDS 3..5261 FT /product="BEL-20_AG-I_1p" FT /translation="VKVKMTQAKASSEESFRGFSDSETENSTIIDNSAMAL FT ALAKEERDGLIECLVRLEKFVENSGIVELEQVEARLKRLERCWEHFVKISK FT EIRQFDDQHNKKENFAIFADFDERVCVLTGKLKRMQRSSNVPKKEENMSNV FT TNTSTGGVKLPKMALPEFQGKFDEWLQFRDMYEQMVHNNITLSKVEKLYYL FT KSSLKGEALKVIEAFPICAASYDAAWDAVVSRFANPYIQKKRHTNELLNWP FT RMKKPTAANINAVIDGFERHTKLLQQLGETPSTWGIMLTQLLTSKLDEGTQ FT REWERTVEAQDDANYNDLIKFLRGQVRILEALGEDKIEQRNQPLPTSKAPK FT LAIHVTAGEKKLKCNVCGAEHSTAKCDEFISMAVKERIKIARAKELCLNCL FT GKGHFRNQCASKVRCRACKGAHHSLLHMWLPKENTMPSTSAQSNNNVDETF FT VNEAQGANHGNATLTMAVTRSGVSCGVLSTALVNVKAKNGMFIQTRALLDS FT GSQLNVLTEKLCRKLGLQRRASGIKLTGIGKHEVSSDIAVTAEVASQSNNY FT ARRMEFLVMDGITHNLQPVHLPNACIPNEGKIADPGWHQGGEIDMLLGSEH FT FFEFLALDGGRPGVYRMNDSHPYFVNTVFGWVLTGPKQTQQKTSAICSHVS FT IAEQIERFWSIEEVCPENSLTQEEKDCEQSFVTTHSRDETGRYIVKLPIKL FT KGWERIGESLGTAKKRFLQLENRLSRDVALYENYCSTIKQYQELGYLVEVS FT PDMSQGDNVKPCYLPHHPVVKLSSASTKVRPVFDGSAKTSTGFALNDCLLN FT GPVLQDTLFDLIVRFRSYAIALIADIEKMYLQVKVHEEHTPLQRILWRSSQ FT QEEIKHFELQRVTFGLTSSSFLATRVLFQLAADEGERFPLGKRALQESFYV FT DDYIGGANSDAEAVQLVSELRQLLEKGGFHLRKWNSNNKMVLRNLSPEEKD FT KCKLVSIGPEDQVKTLGVYWDPTTDSLGVAADFDDVSNEPPTRRNVFSFIA FT KLFDPLGIIAPIIAWAKIMMQRLWIATKEWDDPIPVDLADQWELFKKQLYL FT VKEMRIPRYVMLLDHTNVQIHSFADASEVAYGACVYLRVTNREGTVKIGLL FT AAKSKVAPLKKLSLPRLELCAAVLGAKLWKTACIALKQQITESYFWSDSTI FT VLNWLRAPSYTWATFVANRVATIQDLTQGNHWQHVKGSENPADILSRGALP FT NQLTKEWFQGPHWLSNNECTWSFPEEKSDIDESQLERKRQVAVMTMGVAEH FT PFLDRYSSYWKCGRVAAYCIRFAQRCRNEPLVSGPITYNEFIRAIRRLVVD FT LQRIEFHVEMRELEEKGELLSTSKLRKLRPFLDDEGVLRVGGRLRQTNIRY FT NTKHPMILPAKGNFTRIVAKAYHEMALHGGPRITLATMRQDFWPLNGRILA FT NYIHRNCLTCFKANSTPVAQPIGQLPVKRVTPARPFLTTGIDFCGPVYLKP FT VHRRATSQKAYIAVFVCFCTKAVHLELVEDMTTSAFLAAFRRFISRRGYPS FT DVYTDNGLNFRGAQRELNDLFRLLSDDSFQATTREETMKCGITWHFIPPRA FT PNFGGLWEAAVKSTKKTLTKLFGSQRLSFADMSTVLTQIEAQLNSRPLTPL FT SEDPEELNVLTPGHFLIGEALLTLPDANHVTTPENRLKHFEQLQQLVQRHW FT QQWSKEYICELHNVSQKGLRSKKIEIGQMVIVKEDLPPNEWCLGRVIGVHP FT GSDQVVRVVTIKTSKGTYKRPVSRVCLLPDSED" XX SQ Sequence 
5311 BP; 1635 A; 1047 C; 1311 G; 1318 T; 0 other; tggtaaaagt gaaaatgacg caagccaaag cctcaagcga agaaagtttc cgtggtttct 60 cggactcaga aacggaaaat agcacgatta tagacaacag tgcaatggct cttgctctcg 120 caaaggagga acgagatggc ttgattgaat gtctagtaag actagaaaaa tttgtggaaa 180 attcgggaat agttgagttg gaacaagtcg aagctcgatt aaaacgtctt gagaggtgct 240 gggagcattt tgtaaagatt tcaaaggaaa taaggcagtt tgacgaccag cacaacaaga 300 aagaaaattt tgctatattt gccgacttcg atgagcgtgt gtgtgtgcta acaggcaaac 360 ttaagagaat gcagcgtagc agcaacgtgc caaagaagga agagaatatg agcaacgtta 420 ctaacacatc cacaggcggt gtgaagctcc ccaaaatggc gctaccagaa tttcaaggta 480 aattcgatga gtggctacag tttagggaca tgtatgagca aatggtgcac aacaacataa 540 ctctttctaa agtagaaaag ttatactatt tgaagagttc actaaagggt gaagcattaa 600 aagtgatcga agcttttccg atatgcgcag caagctatga tgctgcatgg gacgctgtgg 660 taagtcgttt tgctaaccca tacatacaaa agaaaagaca tacgaatgag ttgctaaact 720 ggccgagaat gaaaaaacca acagccgcaa acatcaacgc tgtaatcgat ggttttgaaa 780 ggcacacaaa gttattgcaa cagttaggag aaacgccgtc aacatggggt atcatgctca 840 cgcagctttt aacatcaaag ctcgatgaag gaactcagcg tgagtgggag cgaacagttg 900 aggcacaaga tgatgctaat tacaacgatc taataaaatt cttgcgagga caagtacgaa 960 ttttggaagc tttgggcgaa gataaaatag aacagcgaaa ccaaccgttg ccaacatcaa 1020 aagctccgaa attggcgatt cacgtgacag caggtgagaa aaagcttaaa tgcaatgttt 1080 gcggtgcaga gcattcaaca gcaaaatgtg acgagttcat ttcaatggcg gtgaaagaac 1140 gaatcaagat tgcgcgagca aaagagcttt gtctaaactg tttaggaaaa ggccattttc 1200 gaaaccaatg cgcatcaaag gtacgttgtc gtgcatgtaa gggtgctcat cattctcttt 1260 tacatatgtg gttaccgaaa gaaaacacta tgcctagcac atcagctcaa agcaacaaca 1320 atgttgatga aacattcgtc aatgaagcac aaggtgcaaa tcatggaaat gctacgctca 1380 ctatggctgt tacaagatca ggtgtttctt gtggcgtgct ttctacagct ctagtaaatg 1440 tgaaagcaaa gaatggtatg ttcattcaaa cgcgagcact attagatagt ggctcacagc 1500 tcaacgtctt gactgagaaa ctttgtagaa agcttggttt acaacggcga gctagcggaa 1560 tcaaactaac aggtatagga aagcatgagg tcagtagtga catagctgta acggcggaag 1620 ttgcgtcgca gagcaacaat tatgcgagaa gaatggaatt 
cttagtgatg gacggcatca 1680 ctcacaatct tcagccagta catcttccta acgcatgcat tccgaatgaa ggtaaaatag 1740 cagaccctgg atggcatcag ggtggagaga tcgatatgtt gttggggtcg gagcatttct 1800 ttgaatttct agcgctagac ggcggccggc ctggagtcta tagaatgaac gattctcatc 1860 catattttgt taatactgtg ttcggctggg tattaactgg gccaaagcaa acgcagcaga 1920 aaacatcagc gatttgttct catgtgagta ttgcggagca aattgagcga ttttggtcga 1980 tcgaagaggt gtgtccagag aacagcttga cccaagaaga gaaagattgt gagcaaagtt 2040 tcgttacaac acattctcgt gacgaaacgg gtcgatacat agtaaaactt cccataaagc 2100 tcaaaggttg ggaaaggatt ggagagtcct taggaacggc aaagaaaaga tttctacaac 2160 tggaaaaccg attgtcgagg gatgtggcac tatacgaaaa ttactgctca acgataaagc 2220 agtatcaaga gctgggatat ttagtagagg tatctcctga catgtcacaa ggtgacaacg 2280 tgaagccatg ctatttacca catcacccag ttgtgaaatt atccagtgct agcaccaaag 2340 tgcgcccggt tttcgatggt tccgctaaaa cctcaactgg ttttgcacta aatgattgcc 2400 tcctaaatgg acctgtgttg caagacacac tctttgatct aattgtgcga tttagatcgt 2460 atgccatcgc gctcattgcg gacatcgaga agatgtactt gcaggtaaag gttcatgagg 2520 aacacacacc attgcaacgc atcttatggc gatcttctca acaagaggag ataaagcatt 2580 tcgaacttca gagagtcaca tttggcctga cgtcatcatc ctttttggca actcgagtgc 2640 tctttcaact tgcggcggat gaaggagagc gatttccatt aggtaaaagg gctttacaag 2700 aatcattcta cgtggatgac tacatcggtg gagcaaatag tgacgctgaa gctgtccagc 2760 tagtaagtga gttgcggcaa ctattggaga aaggaggttt ccatttacgg aagtggaact 2820 caaataacaa aatggttttg cgaaacctgt ctccagagga gaaagataaa tgcaagcttg 2880 tgagtattgg accggaggat caagtaaaaa cgcttggcgt atattgggat ccaactactg 2940 attcgttagg agtagcggca gatttcgacg atgtgtcaaa tgaacctcct accaggcgga 3000 acgtgttttc gttcatagcc aaattattcg acccgcttgg cataattgct ccaattatcg 3060 cctgggcaaa aatcatgatg cagcgtttgt ggatagccac aaaggaatgg gatgatccaa 3120 tccctgtgga tttggctgac caatgggaat tgtttaagaa acaattgtac ctggtaaagg 3180 agatgcggat tccaaggtat gtgatgcttc ttgatcatac caatgtgcaa attcattctt 3240 ttgcagatgc ttccgaagta gcctatggcg catgcgttta cctccgagta acaaatcgcg 3300 aaggcacagt aaaaataggt ctccttgcag caaaatctaa agtagcgcca 
cttaagaagt 3360 tgagcttacc acggttggaa ctctgtgctg ctgtgctggg agcaaaactt tggaaaactg 3420 catgcatcgc cttgaagcaa caaataacag aaagttattt ctggagtgac tctaccattg 3480 tgctgaactg gttaagagct ccatcgtata cgtgggcgac tttcgtggca aatagagtcg 3540 ccacgataca ggacttaacc caaggaaatc attggcaaca tgtgaaagga agtgaaaatc 3600 cagcagacat cctgtccaga ggtgctttac ctaatcaact cactaaggaa tggttccagg 3660 ggccacactg gctctcgaat aatgagtgca catggtcatt tcctgaggag aaatctgaca 3720 tcgatgaatc gcaactcgag cgaaagcgac aagtggctgt aatgacgatg ggcgttgcag 3780 aacatccttt cctggatcgc tattcctcct actggaaatg tggtagagtg gcagcttatt 3840 gtatccggtt tgctcaaaga tgcaggaatg agccattagt cagcggtccg attacatata 3900 acgaatttat tagagcgatc cgacgtttag tcgtcgattt acaacgaatt gaattccatg 3960 ttgagatgcg ggaattagaa gaaaagggag aattgcttag cacatcaaaa ctaaggaagt 4020 tgcgaccgtt tctggatgat gaaggtgtcc tacgtgttgg aggacgtctt aggcaaacaa 4080 acatcagata caacacaaaa catcccatga ttcttccggc aaagggcaac tttacgcgaa 4140 ttgttgcaaa ggcataccat gagatggcgc tgcatggagg tcctcgcatc accctggcaa 4200 caatgaggca agatttttgg cctttaaatg gcagaatatt ggcaaactac atacaccgaa 4260 actgtttaac atgtttcaag gctaattcaa caccagttgc gcagccaatt ggccagctac 4320 cagttaagag agtgacacca gccaggccat ttctgacgac tggtatcgat ttctgcggcc 4380 ctgtgtattt aaaaccagtt caccgtcgag caacatctca gaaggcatac atagcagttt 4440 ttgtatgttt ttgtactaaa gcagtgcatc tcgaactggt tgaggatatg acaacatcag 4500 catttttggc ggcatttaga cggttcattt cgcgtcgtgg ttatcccagt gatgtctata 4560 cggataacgg tttgaatttt cgtggagcgc agcgtgaact taatgacctt tttcgtttgt 4620 taagcgacga ctctttccag gcaactacaa gggaagaaac gatgaagtgt ggaatcacct 4680 ggcacttcat acctccgaga gcacccaact ttggagggct atgggaggcg gccgtgaagt 4740 cgactaagaa gacgcttacc aaactgtttg gatcccagcg attgtcattt gctgacatgt 4800 ctaccgtgtt gacacagatc gaggctcagc ttaactcgcg cccactcaca cccttatcgg 4860 aggatccaga ggagctcaac gtattgacac ctggccattt tctcattgga gaagctctat 4920 tgacgttacc agatgctaat cacgtgacaa cacccgagaa tcgtttgaag cactttgagc 4980 agctgcagca gctcgtgcaa cggcattggc agcaatggtc gaaggagtac atttgtgagt 
5040 tgcataatgt cagtcaaaag ggattacgaa gcaagaagat cgagataggt cagatggtaa 5100 ttgtgaagga ggatttacca ccaaacgaat ggtgtttagg cagagtaata ggagtgcatc 5160 caggtagtga tcaggtagta agagtagtga cgattaagac gtctaaggga acttacaaga 5220 gacctgtgtc aagggtgtgt ctgttaccag atagtgaaga ttagattgtt gaaaatgttt 5280 ttattgaaac attacttcaa ggcggccggt a 5311 // ID HELITRON2N_AG repbase; DNA; ANG; 1087 BP. XX AC . XX DT 29-JAN-2002 (Rel. 7, Created) DT 29-JAN-2002 (Rel. 7, Last updated, Version 1) XX DE HELITRON2N_AG, a nonautonomous rolling-circle DNA transposon - a DE consensus sequence. XX KW Helitron; DNA transposon; Transposable Element; HELITRON class; KW HELITRON2N_AG; HELITRON2_AG. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-1087 RA Kapitonov V.V. and Jurka J.; RT "HELITRON2N_AG."; RL Direct Submission to Repbase Update (26-DEC-2002). XX DR [1] (Consensus) XX CC HELITRON2N_AG is a nonautonomous rolling-circle transposon CC derived CC from the autonomous HELITRON2_AG. Some copies are less than 1% CC divergent. It is very likely that the mosquito genome harbors CC some active Helitrons. 
XX SQ Sequence 1087 BP; 334 A; 193 C; 216 G; 344 T; 0 other; tctatatata taaaaatctc gtgtcacggt gtttgtcgcg agcaaactcc gaaacggctg 60 gatcattttc aacgaaactt tgcacacacc ttatggtggt atgagaatag gttttaagac 120 tcacatcatt ttcagaagtt gcactaaact taatttatta gcaattttct aactcccata 180 caaaggacat cattgttgtt gttgtataca gcactttgac agtcggtgtg aagcgcggtt 240 tacactcggt gtaaggtgcg gtttacactc gctgtgaaat gcctagcggt ttacgatgct 300 gtcacaaata tccataatga caaaacagag atggagtact ttagtttggg aataatgaat 360 taaaaaatac atgctatttc gctttgtcct gtttgctgtt tgatgttcca aaacgaacaa 420 atatgcttgc tttaataatt gtaacataaa catgtttggt tactgatgtt aaactcgcaa 480 gtgtaaacaa catttttcgt tatggtgttt tttttttcgg tctgcccggc tgcggtctta 540 cccgaggtca aacgatgtct gtttgcggtt tggacttacg aacatcgtgt ttttcgcatg 600 gacaattata tgtggcatgt tctcgcgtgg gtaaaccttc caatctatac atactagcta 660 aagacagatt aaccaaaaat atcgtccatt cattagcgct tcgagatttg cattagagat 720 tcatagtagt tatatgtgtg tgacaagtat gtaattattt gcatgtaaat aaatcaatct 780 acgtaattct acactgtttt actgttccaa attatgttct taaacatatt atttgagtaa 840 ttaacgacaa tgacatcaat gaaaacaaat aaaatgaata aaaaattaaa tgtgaattta 900 aacacctata tttgcaactt aacggactgg cagacgaagg cgcgagaccg tgagcggttt 960 cggacactcc tgaggcaggc caagaccgca aagcggttgt agtgccggat aagtaatttg 1020 caacttcaaa tacttaatcg gcaactaata aaatgtgggg taaaactagg tttaccgggc 1080 cagctag 1087 // ID MARINERN9_AG repbase; DNA; ANG; 358 BP. XX AC . XX DT 03-APR-2003 (Rel. 8.03, Created) DT 03-APR-2003 (Rel. 8.03, Last updated, Version 1) XX DE MARINERN9_AG is a nonautonomous DNA transposon - a consensus DE sequence. XX KW Mariner/Tc1; DNA transposon; Transposable Element; Nonautonomous; KW MARINERN9_AG; nonautonomous DNA transposon; KW mariner/Tc1 superfamily. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-358 RA Kapitonov V.V. 
and Jurka J.; RT "MARINERN9_AG: a family of nonautonomous mariner/Tc1-like DNA RT transposons from African malaria mosquito."; RL Repbase Reports 3(3), 64-64 (2003). XX DR [1] (Consensus) XX CC There are ~100 copies of MARINERN9_AG in the genome, CC they are ~96% identical to the consensus sequence. CC MARINERN9_AG copies are flanked by the TA target site CC duplications. CC This family is characterized by a target site specificity: CC MARINERN9_AG elements are inserted preferentially in the 10-bp CC acatTAatgt palindrome. CC The consensus sequence has 19-bp terminal inverted repeats CC (3 mismatches) and 21-bp subterminal inverted repeats (pos. 29-49 CC and 329-309). CC Putative classification: a nonautonomous Mariner/Tc1-like DNA CC transposon. The genome harbors several subfamilies of CC MARINERN9_AG. XX SQ Sequence 358 BP; 112 A; 69 C; 65 G; 112 T; 0 other; gccggcatcg cgaataaatg aacacatttt aacagtcgtt taaagttcat tttggtttaa 60 ccgagttcat ttgttgttca ttcctgttaa aataaacgat cgtcaaaaca gttaacgact 120 gctcgatatc gcatataggc aaacacgggg aaaaaattcg agcagtataa ttaaacgact 180 gttaatttgt atcgcatacg ctcttgacag tcgtttgttt aacggcggca taggaccagt 240 cgttaaaatt aacaaatgaa ctcaatgcgt tcgcgatgcc aaattctagc aatttttaac 300 gactctggtt aatttttaac gactgttaat tttaaaccaa ttctttcgcg atgccagc 358 // ID GYPSY14-I_AG repbase; DNA; ANG; 6035 BP. XX AC . XX DT 03-OCT-2003 (Rel. 8.09, Created) DT 03-OCT-2003 (Rel. 8.09, Last updated, Version 1) XX DE GYPSY14-I_AG is an internal portion of retrotransposon GYPSY14_AG DE - a consensus sequence. XX KW Gypsy; LTR Retrotransposon; Transposable Element; 4-bp TSD gag; KW AP protease; GYPSY14-I_AG; GYPSY14_AG_LTR; Gypsy clade; RNase-H; KW integrase GYPSY14_AG; mdg1 lineage; reverse transcriptase. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-6035 RA Tubio M.J., Costas C.J. 
and Naveira F.H.; RT "GYPSY14_AG, a member of the mdg1 lineage of the Ty3/gypsy group RT of LTR retrotransposons."; RL Repbase Reports 3(9), 170-170 (2003). XX DR [1] (Consensus) XX CC GYPSY14_AG is a family of Gypsy-like LTR retrotransposons CC that, according to the amino acid sequence of its ORF2, is CC phylogenetically grouped with Drosophila representatives CC of the mdg1 lineage. CC GYPSY8_AG, GYPSY9_AG, GYPSY10_AG, GYPSY11_AG, GYPSY12_AG, CC GYPSY13_AG, CC GYPSY15_AG, GYPSY16_AG, and GYPSY17_AG are other CC members of this same lineage in Anopheles gambiae. CC The GYPSY14-I_AG consensus was reconstructed after multiple CC alignment of 8 copies. CC The consensus encodes the 382-aa GYPSY14_AG1p gag-like protein CC (pos. 1218-2363) and the 1235-aa GYPSY14_AG2p (pos. 2288-5992). CC The sequence of the LTRs flanking GYPSY14-I_AG is deposited as CC GYPSY14-LTR_AG. XX FH Key Location/Qualifiers FT CDS 1218..2363 FT /product="GYPSY14_AG1p" FT /translation="MLPKIKQYLKRLQDIEEKLKPGDKKFRRCLIEHYGVT FT ANNTFEKIKNLVSKLKIAEAPVELETITKFANISYNNINSYIKFQLQNKVQ FT FKFKTLVKISIALRRWKRKTMEDFDIKTAATLVQLYDGNADSLENFLGSAN FT LLRKLFPKDTKQILVEFLKTRLTGKAQLGLAPNITDFETLIQDVKSRCQEK FT INPDKVIAELKSICHTDTKTLCDEVEILCTKLKVMYLKQNIPEQVANDMAV FT KRGIETLIDKVKLTETRIILQAGQYTSISDATEKALESERMHNSAQQVFAF FT RKNFSAPNRSFDNRYSNQYTRNTYNRSQNRGNFTSNWQTNNRHQHNNFHNT FT NNRRFFRNTNRNNNRIYTTQATEQEHFLEENPQHTDYQS" FT CDS 2288..5992 FT /product="GYPSY14_AG2p" FT /translation="QDLHYTSHRTRTFFRRKPAAYRLPILAINLNANNLIK FT VRVEMSTEEYSHLIIDTGADVSLFKINKIKSNQHISAQNKINLTGITTHTI FT DTLATTFTTIHFGETSVKHQFHLIPKELDIGADGILGRDFFANYRCVIDYE FT HWLLNFTHNGMAVCQPIEDKSNDGFILPTRSEVIRKVKLPNIDEDSIVFAE FT EIRPGLFCGNSIISKENQYVKFINTTEKNIFINHKSFRPQVDQVKNYNVTK FT PSGNKEITRFDSIINSINMKNVPQYIESDLKNLLAKYTDLFCVNEDKISIN FT NFYKQSIQLKDNIPCFVPNYKQIHSQSDEIKQQVDKMLKNDIIEHSVSSYN FT SPILLVPKKSETEKKWRLVVDYRQLNKKIMPDKFPLPRIDEILDQLGRAKY FT FTTLDLMSGFHQIPLDKESRKYTAFSTATGHYQFKRLPFGLNVSPNSFQRM FT ITIAMAGLTPESAFVYIDDIIITGCTTKHHLDNLKKVFDRLRHYNLKLNPQ FT 
KCKFFQTEVTYLGHKITDKGIYPDESKFEAIKNYPKPVNSDEVRRFVAFCN FT YYRKFIRNFAEIAKPLNELLKKGKIFNWSSECQQAFDTLRNNLLSPMVLKY FT PDFTKEFIITTDASDTACGAILSQICEGNDHPVAFASKSFTKGEKNKPTIE FT KELTAIHWAINYFKPYVYGRKFKIRTDHRPLVFLFNMKNPTSKLTRMRLDL FT EEFEFEVEFVAGKTNVGADALSRIVTTSDRLKSLQECTNEANNNQALVVKT FT RAMTRANREETNPVAQNDTSEKIGKPAFWNTENPSEVNKLLKIQSKINKDS FT IKITLFNNNYNKELESTIIRLNQNGSQAFEAALLRLCTYAKKLNRDKLAIS FT MEDEIFTQYSPQTIKEIVNRTIFGCEIIGFVPPKWITARKEIEEILHNYHM FT LPSGGHVGQYRLYLKIRDKYKWKNMKEDIKGFVKNCEMCKINKITRHTIEK FT PVVTTTPLRPFEIISIDTVGPLPKTAKHNRYAVTVQCDLTKYVILIPTGNK FT EANTIARALIENVILTYGKFAEMRSDQGLEYNNEVLKKVAEALGIKQTFAT FT AYHPQTIGALERNHRTLNEYLRAFTNEHGDDWDNWVKFYEFTYNTTAHTDT FT GYTPFELLFGRKANLPQDIVTGKIEPVYNHELYYNELKYKLQKSNEVARKK FT LIEQKEKRHIAPNCSINPLIIEPGDKVYLKNENRRKLDPLYLGPYIVSKIQ FT NPNCTIVSKYSKKESTVHKNRLVKR" XX SQ Sequence 6035 BP; 2348 A; 1124 C; 1133 G; 1430 T; 0 other; tggcgaccgt gatcgtgaaa tccaacgacg acacgaatct gtgataaatc cacgcaatac 60 caaatcggat gtgcacagaa taccaacaac aaaacatcca tcggtataca aaaaaaaata 120 agtctaacgg taacgtgtgt aattcctcga ctacgcaaac ccgaagctag caagtgtgct 180 tgagagtaca ttgaccgacg tttggagcaa cgaccgagga tcacgtcgaa tcaggactag 240 attcccaagc ggtcaggtga aatcgaatgc gtgtcaagag cgagagaaaa aaaaaataag 300 aaccgtacgg gaatagcaca ccacggcaaa tagtgatcgg taatagtcgg agagaacaga 360 ggtgtgagaa aattaatagc gtaaaagcgg gcgtaatgag agtggtggat tgctatcatc 420 atcagcagtc gaaggcatga gaagagtgcg acgaaagtag tgatctgtac ggcgagtgca 480 gtgtacagtg taccacgagg ccagtgaaca gtacgagaag tgtgcgacaa aagtagtgat 540 ctgtacggcg agtgcagtgt acaatatgag aagtgtacca ccaggccagt gaacagtacg 600 agaagtgtgc gacgacggcg gtagacggca ggagtatact gtgtgacgac agcagtgaac 660 agtacgagaa gtgtgcgacg acggcggttg aaggcaggag tatactgtgt gacgacagca 720 gtgaacggaa agagaagtgt acgacgagag aaacgagcag taagagaagt gcatcacatt 780 atcagcgagc ggtacgagaa gtgtgttgtg acgacggcag cggaaggtga gaattaacga 840 gagcagcgga cggcgtttgg cgagagcagt tgacggcgta tgaagtgttc gagaatatcg 900 gacggcaaat gcagcagcga aatacgataa tgacccctcc cccccgtaag gtaggatatg 960 cgtagtgaaa 
tagaattggg ttacaaagaa tatttttggg tttaattata ctaaggagct 1020 aaggattaag ggttaaatta taattgtaat tgctcaatta attttttttc tcgcgcaaaa 1080 aaaaaaaaaa aaaaaaaaaa tatatataaa aaattaaaat atatatgcat acatatatag 1140 aaaaaagtca ctattcaata taccccttca cgattattta ccttcgttgc ctcaaaatag 1200 tcgacatgct cgtttaaatg ttacctaaga tcaaacaata tttaaaaaga ttacaagata 1260 ttgaagaaaa attaaagccg ggtgataaaa agtttcgcag atgtttgatt gaacactatg 1320 gtgtcactgc aaacaacacg ttcgaaaaaa taaaaaattt agtatccaaa ttaaagatag 1380 cagaggcgcc agtagaatta gaaacaatta ctaaatttgc aaatatcagc tacaacaaca 1440 ttaattcgta cataaagttt caattacaaa ataaagtcca atttaaattt aaaacacttg 1500 ttaagattag cattgcatta cgccgttgga aaaggaaaac catggaggat ttcgatataa 1560 aaaccgcagc aacattagta caattatatg acggaaatgc ggacagctta gaaaattttc 1620 tgggttcagc aaatttatta aggaagcttt ttccgaaaga cacgaagcag atactagtag 1680 agttcctcaa aacacgatta acaggtaaag ctcaactagg gttagcacca aacataactg 1740 attttgaaac ccttatccag gatgtaaaat cacgatgcca agaaaaaata aaccccgaca 1800 aggttatagc agaattaaaa tcgatttgcc atacagacac gaagacccta tgtgacgaag 1860 tggagatact atgtacaaaa ttaaaagtga tgtatctcaa acaaaacata ccggagcaag 1920 tggctaacga catggcagtc aagcgaggaa tagaaacact cattgacaag gtcaaactca 1980 cggaaacaag gattatatta caagcagggc agtacacctc tatatcagat gcaacagaaa 2040 aggcactaga aagcgaaaga atgcacaatt cagctcaaca agtgttcgca tttagaaaaa 2100 actttagtgc tcccaatcgc tcctttgaca atagatattc aaatcagtac actcgtaata 2160 cgtacaacag gtcacaaaat agaggaaatt ttacttcaaa ttggcaaaca aataatcggc 2220 atcaacataa taattttcac aacaccaata accgacgatt tttcagaaat accaacagaa 2280 acaataacag gatttacact acacaagcca cagaacaaga acatttttta gaagaaaacc 2340 cgcagcatac cgattaccaa tcttagccat aaatctaaat gccaataacc ttatcaaagt 2400 aagagtagaa atgtcaacag aagaatacag tcacctaatc atagacaccg gagcagatgt 2460 ttctttgttt aaaataaata aaatcaagtc caaccaacat atatctgcgc aaaacaaaat 2520 taatctgaca ggtataacta ctcatacaat agatacttta gccactactt ttacaacaat 2580 acatttcgga gagacgtctg tcaaacacca attccacctg ataccaaaag aattagatat 2640 tggggcagat ggcattttgg 
gcagagattt ttttgcaaac tatagatgtg taattgatta 2700 tgagcattgg ctactcaatt ttactcacaa tggcatggcc gtctgccaac ccatagaaga 2760 taaatccaat gatggtttca tattacccac gcgaagtgaa gtgatacgaa aagtaaaact 2820 gccaaacatc gacgaagatt ctattgtgtt tgcagaagaa ataaggccag gattattttg 2880 tggcaattca atcatatcga aggaaaacca atacgtaaaa tttataaaca caactgagaa 2940 aaacattttc atcaaccata agtccttcag accccaagtc gatcaagtga aaaattataa 3000 tgttacaaaa ccctccggta acaaggaaat aaccagattt gatagtataa taaatagcat 3060 aaatatgaaa aatgtcccac aatacataga aagtgactta aagaacctcc tagcaaaata 3120 tacagacctg ttctgcgtaa atgaagataa aatttccatt aataattttt acaaacaatc 3180 gatacagcta aaagacaaca taccgtgttt cgttccgaat tataaacaaa ttcattccca 3240 atcagatgaa ataaaacaac aagtcgataa aatgttaaaa aatgatatta tcgagcattc 3300 agtatcatcc tataattccc caatattgtt agttccgaaa aaatcagaga cagagaaaaa 3360 atggagactt gtcgtagatt accggcaatt aaataaaaaa ataatgccag ataaatttcc 3420 attgcccagg atcgatgaaa tattagatca actcggtagg gcaaaatatt tcacaacact 3480 agatctcatg tcaggtttcc accaaatacc tttggataaa gaatcgcgga agtatactgc 3540 gttttccacc gctacaggcc actatcaatt caaaaggtta ccctttggac taaacgtgag 3600 cccaaatagt tttcagcgca tgattacaat agcaatggct gggttaaccc ctgagagcgc 3660 gttcgtttac atcgatgata taatcataac aggttgcact actaagcacc accttgataa 3720 tttaaagaaa gtttttgata gactacggca ttataatctt aaattaaatc ctcaaaaatg 3780 taaattcttc caaaccgaag ttacttacct tgggcacaaa attacagaca agggaatata 3840 ccctgacgaa tccaaatttg aagcaattaa aaattacccc aaaccagtta actcagacga 3900 agttagaaga tttgttgctt tttgcaatta ctaccgtaaa ttcattcgta actttgcgga 3960 aatcgcgaaa cctttaaacg aattgttaaa aaagggaaaa atattcaatt ggtcgtccga 4020 atgtcaacaa gcttttgata cccttcgcaa caaccttcta tcacctatgg tcctcaaata 4080 tcctgatttt acaaaagaat tcattatcac tactgatgcg tcggacacgg catgtggggc 4140 catactatcg caaatatgcg aaggaaatga ccatcctgtg gcatttgctt ctaaaagctt 4200 taccaagggt gagaaaaaca agccaaccat agaaaaggaa ttgacagcta tacattgggc 4260 tatcaattat ttcaaacctt atgtctatgg tcgaaaattt aaaattcgta ccgaccatag 4320 acccctagtt tttctgttca acatgaaaaa 
tccgacctcc aaactaacac gaatgagatt 4380 ggaccttgag gagtttgaat ttgaagtaga attcgtagcc ggtaaaacta acgttggggc 4440 tgacgcactc tcaagaatag taactacctc tgatcgtctg aaatcattgc aagaatgcac 4500 taatgaggcc aacaacaatc aagctttagt agtaaagact agggcaatga ctagagctaa 4560 tagagaggaa acaaatcctg tagcacaaaa cgacacaagc gagaagatag gcaaaccagc 4620 attttggaac accgagaacc catcagaggt taataaatta ttaaaaattc agtcaaaaat 4680 aaataaagat tcaattaaaa taacactttt taacaacaac tacaacaagg aactagaaag 4740 tacaataatc agattaaacc agaatggaag tcaggcattc gaggctgctc ttctaagatt 4800 atgtacttat gcaaaaaagc ttaatagaga caaattagct atctccatgg aagacgaaat 4860 ttttactcaa tactcgcctc aaaccattaa ggaaatcgta aatagaacca ttttcggttg 4920 cgagattatt ggatttgtac cacctaagtg gataacagca agaaaagaaa tagaagaaat 4980 tctacacaat tatcacatgc taccttcggg aggacatgtt gggcaatatc gcctctatct 5040 aaaaataaga gacaaataca aatggaaaaa catgaaagaa gatataaaag gcttcgttaa 5100 aaactgcgaa atgtgtaaga taaacaaaat aacacgacac acaatagaga aaccagtagt 5160 aacaacaaca cccttaaggc cgttcgaaat aatttcaata gatacagtgg gccctcttcc 5220 aaaaacagcc aagcacaaca ggtacgctgt tactgtacaa tgtgacctca caaaatatgt 5280 aatcttgata ccaacaggaa ataaagaggc caacacaata gcaagagctc tcatagagaa 5340 tgtcatttta acatacggta aatttgcaga aatgcggtcg gatcaaggtc tagaatacaa 5400 caacgaagtg ttgaaaaaag tagcagaagc actggggatc aaacagacgt ttgcgactgc 5460 ttatcaccca caaacaatag gcgctttaga acggaaccat agaactttaa acgaatattt 5520 gagagcattt accaacgagc atggcgatga ttgggataac tgggtcaaat tctatgagtt 5580 tacttacaac accacagcac atacagatac tggctataca ccatttgaac tattatttgg 5640 gagaaaagca aatctcccac aagacatagt gacaggcaaa atagaacctg tgtataatca 5700 cgaactttat tacaatgaat taaagtacaa actacagaaa tcaaacgaag tagcgcgaaa 5760 gaaattaatt gagcaaaaag aaaaaagaca catcgctcca aactgcagca tcaacccgct 5820 aataattgaa ccaggagata aggtgtactt aaagaacgaa aacaggagaa aactagatcc 5880 tttatattta ggtccatata ttgtttctaa aatacaaaac ccaaattgta caatcgtaag 5940 caaatacagc aaaaaggaaa gtactgtaca taaaaatagg ttagtcaaac gatgaaaatc 6000 tatgctatac gattttcatt tctaaagggg ggagg 6035 
// ID RETRO981_AG_LTR repbase; DNA; ANG; 229 BP. XX AC . XX DT 06-FEB-2003 (Rel. 8.01, Created) DT 06-FEB-2003 (Rel. 8.01, Last updated, Version 1) XX DE Anopheles gambiae long terminal repeat from RETRO981_AG DE retrotransposon - a consensus. XX KW Gypsy; LTR Retrotransposon; Transposable Element; BLASTOPIA; KW INVADER; Long terminal repeat; MDG3; RETRO981_AG_I; KW RETRO981_AG_LTR; retrotransposon. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-229 RA Jurka J. and Drazkiewicz A.; RT "RETRO981_AG_LTR: LTR retrotransposon from Anopheles gambiae."; RL Repbase Reports 2(12), 23-23 (2002). XX DR [1] (Consensus) XX CC Related to BLASTOPIA, MDG3 and INVADER from Drosophila CC melanogaster. CC 4 bp target site duplication. XX SQ Sequence 229 BP; 72 A; 44 C; 59 G; 54 T; 0 other; tgtaggaaat aggagacagt tgacagttgt cagacaaggg atgtaaggcg aatgtcaagg 60 aatgaattta ggatcaccct gtgctgatgt caatcgaact gacgacgcag taggtttagc 120 tagagcgaaa cattcacgcg ggagaagcca agcgaaccag acgaaagtga ataaagtgga 180 ttgtaaaacc atcgctttcc gcgtatcctc ttttcttcat cgttctaca 229 // ID RETRO931_AG_LTR repbase; DNA; ANG; 173 BP. XX AC . XX DT 06-FEB-2003 (Rel. 8.01, Created) DT 06-FEB-2003 (Rel. 8.01, Last updated, Version 1) XX DE Anopheles gambiae long terminal repeat from RETRO931_AG DE retrotransposon - a consensus. XX KW BEL; LTR Retrotransposon; Transposable Element; KW Long terminal repeat; RETRO931_AG_I; RETRO931_AG_LTR; ROO; ROOA; KW retrotransposon. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-173 RA Jurka J. and Drazkiewicz A.; RT "RETRO931_AG_LTR: LTR retrotransposon from Anopheles gambiae."; RL Repbase Reports 2(12), 18-18 (2002). 
XX DR [1] (Consensus) XX CC Related to ROO and ROOA from Drosophila melanogaster. 5 bp CC target site duplication. XX SQ Sequence 173 BP; 65 A; 21 C; 44 G; 43 T; 0 other; tgttacgcgt aatcttgcgg aaaaatgttt aacgggaatt tggagaatag agaatgaact 60 gtggagaggt gagcttcggc tgtggagaga tgaacttcgg caattaaaga ttgaaaccta 120 tgaaataatt tagtagcaat aaagaactca tcgaaataag agctatcgaa aca 173 // ID Clu-240_AG repbase; DNA; ANG; 1558 BP. XX AC . XX DT 04-SEP-2010 (Rel. 15.09, Created) DT 04-SEP-2010 (Rel. 15.09, Last updated, Version 1) XX DE Putative non-autonomous DNA transposon: consensus. XX KW DNA transposon; Transposable Element; Clu-240_AG. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-1558 RA Fernandez Medina R.D., Struchiner C.J. and Ribeiro J.M.C.; RT "Transposable elements from Anopheles gambiae."; RL Repbase Reports 10(9), 1455-1455 (2010). XX DR [1] (Consensus) XX CC 2bp TSD. >93% identical to consensus. 
XX SQ Sequence 1558 BP; 423 A; 368 C; 304 G; 451 T; 12 other; cccaaccgac attgtcgcgt tgcgagcggt tttttttgta tagaaaaagc gttctctata 60 gcgcggtcaa atcgttctcg atagcagaac tatagagaag gtttttgaaa tcgtgcgaca 120 gcgtgtatca tacatttggt ttctgtatgc tacatcactg ctcaatttct ttcattctct 180 ccattctccc tacaaagact cttactctca tttgtccatt ccgcacatgc ttacgctctt 240 tattttacct tcttactcgc acatacacat ttacatactc acacaattca cactgcctaa 300 aaccggtgcg ggatacacac gtacccgtct tgcctgcctt ctctcctcgt tcttcctttc 360 ctccagcttt gccttcgtcg aatggtgtgt acgcgtcctg cttgcctccc tctcgctgct 420 cccacgtgct gcgctagcgn gctggatgnt cgtccctacc ttcctgccgc ttctcctcta 480 ccaggcgcta gcgcgttgga ggatcggatc gttcctgcct tctttcctct cctgttctgt 540 tttcggcact cctagggata gagggaaata ttanctattt tgccaacagc tgttcaattt 600 aacgcgaagc acgatactta ccaaccaatt gaatattgct tctgtcgtgg tccgtgctca 660 cttccgttcg tgcgtgtgtt acgatttctt anagaaaacg aaattgattt tatatcacca 720 attatcaata caattcagtt tgataaagat tgatataacc ctttattaaa aactgtactt 780 accaaatagt ttttccgtac tggtttcacg agagaaaaaa cgggcacgca ccgacgcgtc 840 ctaccgcttg caaacttgat ttcacactac cacattttca tgtgaacgcg aagcagcgcc 900 ttcctcgtgt ggctatacag gtattccccg atatacgcca tactcgatat acgcgatttc 960 gctatacgct ttttttctaa atttgacggt tctttgagca agtngtacta atttgagaaa 1020 tcgaatgtca aatgcaaaat aaattccctt ttggtcgaat attaaaaacc atttcaaang 1080 gtataaaatt gnaanattca gttgaaatca gatcaaataa ctaatttagt ggctaaaacc 1140 taccacttca tgcaaaatta tacgaaaatt agctataatt tgactgaaaa cctcgaaatt 1200 cgacnttcgc tattgatatt cgagatacgc tattgcctcc ggtccgcatt gatagcgtat 1260 atcggggagt acctgtacga tgatacaaag agcgggcgag aggagancgc ttgcgtgaac 1320 gcttggtgta gcaagcgaaa ttggacaagt cccagaaaag cgagacgcaa gagtttctct 1380 ccaacgatgg aaagggaaga gccatatgaa acatgaggat ggttggtgcg acacagccaa 1440 gtcgcgagca aaaacngacc aacacccaga agagcgaggc agaatgaaaa attttntgaa 1500 tgttagaatg agaaaaatga tagatacaca taaaaatcgc gcgataatgc tggttggg 1558 // ID BEL18-LTR_AG repbase; DNA; ANG; 366 BP. XX AC . XX DT 02-MAY-2003 (Rel. 
8.04, Created) DT 02-MAY-2003 (Rel. 8.04, Last updated, Version 1) XX DE BEL18-LTR_AG is a long terminal repeat of the BEL18_AG LTR DE retrotransposon - a consensus sequence. XX KW BEL; LTR Retrotransposon; Transposable Element; 5-bp TSD; KW BEL superfamily; BEL18; BEL18-I_AG; BEL18-LTR_AG; KW RETRO937_AG_LTR. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-366 RA Jurka J. and Drazkiewicz A.; RT "RETRO937_AG_LTR: LTR retrotransposon from Anopheles gambiae."; RL Repbase Reports 2(12), 22-22 (2002). XX RN [2] RA Kapitonov V.V. and Jurka J.; RT "BEL18_AG, a family of Bel/Pao-like LTR retrotransposons from RT African malaria mosquito."; RL Direct Submission to Repbase Update (13-APR-2003). XX DR [1] (Consensus) XX CC BEL18-LTR_AG is a long terminal repeat of BEL18_AG (its internal CC portion is BEL18-I_AG). XX SQ Sequence 366 BP; 119 A; 74 C; 79 G; 94 T; 0 other; tgttattgtt taacacagga aaacatcgaa tgtcagcgca agtgtcaagc agtgacattt 60 gacccacgtt actgctcgac tgtgggactg ctcgactaac gatgaggatc gcaagcccgc 120 gaatttgcaa tggtggaaaa gccataaaag ggaaatttgg gaaagcactt tctctttttc 180 tcgtcatcac cgcggaacca gaagcagtgg aattttttta accactttgt tataagtaaa 240 tttacagtac caccgttacg tactgcgtta atagaattca ataaagtgaa gttaagagaa 300 cctaactaac gtcgcgagtg aatcatttcg ggaaaaaaaa tatcaccgga cgttgctaac 360 gcaaca 366 // ID DNA-5_AG repbase; DNA; ANG; 349 BP. XX AC . XX DT 04-SEP-2010 (Rel. 15.09, Created) DT 04-SEP-2010 (Rel. 15.09, Last updated, Version 1) XX DE Putative non-autonomous DNA transposon: consensus. XX KW DNA transposon; Transposable Element; DNA-5_AG. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. 
XX RN [1] RP 1-349 RA Jurka J.; RT "Non-autonomous DNA transposons from mosquito."; RL Repbase Reports 10(9), 1430-1430 (2010). XX DR [1] (Consensus) XX CC TA tsd. >95% identical to consensus. Likely Mariner/Tc1. XX SQ Sequence 349 BP; 123 A; 62 C; 65 G; 99 T; 0 other; tacagtctgt tcccgagtta cacggttctc gacttacgcg gattcggaga tacgcggttt 60 tctaaatttg acagattaaa tgtcaaatca gtacaatttg cttcaagttc ggtataaatt 120 gcatttttgc taacaaattg aaaccgctta aaagccagaa atattagaat tttctgcacg 180 aatcatatca aataaatgat aaagtgtata aaagtactaa attaaatcaa aaactccgag 240 taatcaatag tattttagtc aaaaaacgtg aaattcgact tacgcggata ttcgagttac 300 gcggattcgt cgggaacgca gaaaccgcgt aactcgggaa cagactgta 349 // ID Copia-6_AG-I repbase; DNA; ANG; 4099 BP. XX AC . XX DT 01-SEP-2010 (Rel. 15.09, Created) DT 01-SEP-2010 (Rel. 15.09, Last updated, Version 1) XX DE Copia-6_AG internal region. XX KW Copia; LTR Retrotransposon; Transposable Element; Copia-6_AG-I. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-4099 RA Fernandez Medina R.D., Struchiner C.J. and Ribeiro J.M.C.; RT "Transposable elements from Anopheles gambiae."; RL Repbase Reports 10(9), 1456-1456 (2010). XX DR [1] (Consensus) XX CC Consensus sequence out of four sequences of internal region of CC Copia-6_AG. The four sequences within this family have a CC p-distance at the nucleotide level considering the full-length CC alignment of 0,0009 (sd=0,0003). The LTRs are 149 nucleotides CC long and present a very high degree of identity (p-dist=0,0033, CC ds=0,0038). In two of these sequences the 3' and 5' LTRs are CC identical indicating that they have inserted recently and have CC not had time to accumulate mutations between the LTRs. 
XX FH Key Location/Qualifiers FT CDS 40..4089 FT /product="Copia-6_AG-I_1p" FT /translation="SRIFSPLFRRSFKMNPNSTGSSSAAGSSISTSSLPGI FT ERLIGRENWETWKFAVQTFLELEDLWCAVKPKKNDDGSYESVDTAKDRKAR FT AKIILLLEPVNYVHVKEATTAKEVWSKLEKAFEDSGLTRRVGLLHKLIKTD FT LESCDSMSDYVNRIVSTAHQLNGIGFPISEEWVGNLLLAGLTEQYRPMIMA FT LENSGIVITGDIIKTKLLQEVPPTSVEPAFAARTKHVNAGKQTKKSNTAKG FT PKCRKCSKFGHIAKDCYSTKGNDSFCVVLSTCGSKEYGKWYFDSGASVHMT FT NNSDFLMHAKTSSGTVVAANGENMQITAKGSCVLKPSCQKGEIPVDDVQLI FT PNLSVNLLSVNQIVKKGYSVTFTNEGCEVVNRNGDIIATGSHDNDLFKLDE FT RKEGEKTALTVSSTGSLELWHQRMGHLNINGVRSLANGIVTGVNIVGDTMA FT DCKECPMGKHSRHPFSKIGSRAAEILELVHSDICGPMEVKSLGGSRYYIVF FT VDDKSRRMWTYFLKSKSEAEVNKIFQDFHKMVERQSGRKLKVLRTDNGKEY FT VNTGFTNYLKKHGIVHQTSNAYTPEQNGMAERANRSIVERARCMLHMAKLS FT KSFWAEAVAAAVYLLNRSPTKGHNVTPQEAWSGKKPNLSHIRIFGTRAMKF FT IPKQYRKKWDAKSEECILTGFDEFTKGYRLYNIRSKKVTVSREVNFINEGV FT AVPIQEKSSRRHMILLEHEETVSLSPNATSQMEAVLSENEDEDSDSEYFTD FT ANNETSEDTDGTTNEFDETVVDNDAAVNSEPLVIPTQSQILRRSSRTRKVP FT ERYNDSIIPHGSGLFSNVTSSKIMSSNSSEDPITHQDAMSRSDSERWKVAM FT QEEYQALIDNSTWRLTTLPEGRKAIKCKWVFKTKHDAAGKVNRYKARLVIK FT GYSQRKGVDYNETYSPVVRHSSLRYLFALAARNNLLVDQMDAITAFLQGDL FT EEEIYMEQPPCFEQPGKQNMVCRLNKALYGLKQSSRVWNTKLDAALKQLGL FT EQSKYDPCLYFYNGNGNMLFVAIYVDDLMIFSNNEEMKNQLKTKLSSMFRM FT KDLGPAKHCLGIRVNYLNDGIALDQEAYIETILSRFKMQDCKAVATPMNSS FT IKLTKEMSPQTEEEKEEMSAVPFQEAVGCLMYLAQCTRPDILFAVNQLSRY FT NNNPGSRHWQAVKHLMRYLRGTASMKLKYYRKGNEQITGYSDADWAADTED FT RKSTSGYIFLMQGGAVSWCCKRQPTVALSTCEAEYMALSAAVQEASWWKGL FT LEQFGKKQSIQIFCDNQSTICIAKNGGYTPRTKHIDIRHHYIRDALDRNVV FT NLHYINTEEQVADGLTKALQRIKQERNRRSMGITQQSA" XX SQ Sequence 4099 BP; 1340 A; 791 C; 984 G; 984 T; 0 other; aggttatggg cccagaaccc agaagcgtac agtgattgaa gcagaatctt ttcgccgtta 60 ttccgaagaa gttttaagat gaacccgaac tccaccggat cctcaagcgc tgcaggtagt 120 agcatttcaa catcaagtct tcctggcata gaacgactca tcgggagaga aaattgggaa 180 acatggaagt ttgccgtgca aacgttcctg gaacttgaag atctctggtg tgcagtaaag 240 ccgaagaaaa acgacgatgg aagctacgaa tccgtcgata cagcaaagga tcgaaaggca 300 cgagcgaaaa tcatcttact tcttgaacca 
gtgaactacg ttcacgtgaa ggaagcgaca 360 acagcgaaag aagtttggtc caaactagaa aaggctttcg aagactctgg cctcacaaga 420 cgagtcggat tgttgcataa attaatcaag acagatctag aatcatgcga ttctatgtcg 480 gattatgtta atcgtattgt atcaacggcg catcaactga atggaattgg tttcccgatt 540 tcagaggagt gggtcggaaa tctattactg gctggattaa cagaacagta tcgcccaatg 600 atcatggccc ttgaaaactc cggtattgtc atcactgggg acatcattaa aacgaaactt 660 ctacaagagg ttcctcccac atcagttgaa cctgcgtttg cggcaaggac gaagcacgtt 720 aatgctggta agcaaactaa aaaatcgaat acagctaagg gaccgaaatg tcgaaaatgt 780 tcgaaatttg gccatatagc gaaggattgc tacagcacga agggaaacga ttcgttctgt 840 gtagtgcttt ctacgtgtgg atcgaaggaa tacggaaaat ggtatttcga ttccggagca 900 agtgtccaca tgacgaacaa tagcgatttt ttgatgcatg cgaaaacatc tagtggaact 960 gtggtagcag ccaacgggga gaacatgcaa atcaccgcga agggatcctg cgttttgaag 1020 ccttcgtgcc aaaaaggtga aattcccgtt gatgacgtgc agctaatccc gaatctatcc 1080 gtgaaccttt tatcggtaaa tcaaattgtg aaaaaaggct actccgttac gttcaccaat 1140 gaaggatgcg aagtggttaa ccgaaacggc gatatcattg ctactggtag ccatgacaac 1200 gatctgttca agcttgacga acgtaaagaa ggtgagaaaa ccgcgttgac agtttcttca 1260 acagggagct tggaactatg gcatcaaagg atgggccatc ttaacatcaa cggtgtccga 1320 agccttgcaa atggaatagt gactggcgtc aatattgttg gagataccat ggccgattgc 1380 aaagaatgtc caatgggcaa acatagccgt catcctttta gcaagatagg atcgcgggcg 1440 gctgaaatac tcgaattggt tcattctgac atttgcgggc cgatggaagt caaatctcta 1500 ggaggaagcc gatattatat tgtatttgtg gatgacaaat cacgccggat gtggacatat 1560 ttcttgaaat ccaagtcgga agctgaggta aacaaaattt tccaggattt tcacaagatg 1620 gtagaacggc aatctggacg aaaattgaag gtactcagaa cagacaatgg aaaagaatat 1680 gtcaacacag ggttcacaaa ctacttaaag aagcacggca ttgttcatca aacatccaac 1740 gcatacactc cggaacagaa tggcatggcc gaacgagcga ataggtcgat tgtggagcgt 1800 gcaaggtgca tgttacacat ggcgaaactt tctaaaagtt tttgggcgga agctgtggct 1860 gctgctgtgt accttctgaa tcgttctcca accaaaggcc ataatgttac tccgcaggaa 1920 gcgtggtctg gtaagaaacc taacttgtcc catattcgga tctttggtac tagagcgatg 1980 aaatttattc cgaagcaata tcgcaagaag tgggacgcta aatcagaaga 
gtgcattctg 2040 accggtttcg atgagtttac caaagggtat agattgtaca acatcagatc gaagaaagta 2100 acagtcagtc gtgaagtaaa tttcattaat gaaggtgttg ctgttcctat ccaagaaaaa 2160 tcaagcagaa ggcatatgat tctccttgaa catgaagaga cagtttctct ttcaccgaac 2220 gctacatcac aaatggaggc agttttgagt gaaaatgagg acgaggacag cgacagcgaa 2280 tacttcacgg acgcgaacaa tgaaacttct gaagataccg atggaactac taacgaattt 2340 gacgaaacag tggtcgataa tgatgcagct gtgaacagcg aacctctcgt tataccaact 2400 caatcgcaaa tcctgaggcg aagcagtcgg acgcgtaaag tcccagagag gtataatgat 2460 tctataatcc cacatggctc tggtcttttc agcaatgtta ccagttcaaa gatcatgagc 2520 agcaattcaa gcgaggatcc aatcacacac caggatgcaa tgtcgcgtag cgattcagaa 2580 cgttggaaag tagcgatgca agaagaatat caagcgttga ttgacaacag cacatggaga 2640 ttgacaactc ttccagaagg taggaaagca atcaaatgta aatgggtgtt taaaacaaaa 2700 cacgatgcgg ctggaaaagt caaccgctac aaggcgcgct tggtgataaa gggatattct 2760 cagcgaaagg gggtagatta taacgaaaca tattcacccg tagttcgtca tagttccctg 2820 agatatttat ttgcactagc ggccagaaat aatctcttgg tggatcagat ggatgcgata 2880 actgcttttc tacaaggaga tttggaagag gagatataca tggagcaacc accgtgtttt 2940 gagcagcctg gcaagcaaaa catggtatgt cgattgaaca aagcgttgta cggactaaaa 3000 caatcaagtc gtgtctggaa tacgaagcta gatgcagcac tgaaacaact gggtttggaa 3060 caatcgaagt atgatccatg tttatatttc tataatggca atggaaatat gctgtttgta 3120 gccatttatg tggacgattt aatgattttt agtaataatg aagaaatgaa gaatcagctg 3180 aagacgaaat tgagcagcat gttccggatg aaggatttgg gaccagctaa acattgttta 3240 gggattcgtg tgaattattt aaatgacgga attgcacttg accaggaagc ctacatagaa 3300 actatcctat cccggttcaa aatgcaagac tgcaaagctg ttgctactcc tatgaactct 3360 tccataaagc taactaagga aatgtcgcca cagacagaag aagaaaagga agagatgtca 3420 gcggtgcctt ttcaagaggc tgtaggttgc ctgatgtacc tagctcaatg caccagacca 3480 gatatcctgt ttgcagttaa tcagctgagc cgatacaaca ataatcctgg atcgcgtcac 3540 tggcaagctg taaaacatct tatgcgatat ctaagaggaa cggcatcgat gaaactcaaa 3600 tattacagaa aaggtaacga acaaataact ggatattcag atgctgattg ggccgctgat 3660 acagaagata ggaaatccac cagtggatat attttcttga tgcaaggagg agcggtgtca 
3720 tggtgttgca aacgacaacc aactgttgca ttatcaacct gcgaggcgga atacatggca 3780 ttgtcagcag cggtacaaga agcatcgtgg tggaaaggat tgttagaaca atttggtaag 3840 aagcaatcga ttcagatatt ttgtgataat cagagcacta tctgtattgc aaaaaatgga 3900 ggatatacac cacgaacgaa gcatatcgat ataagacatc attacatcag ggatgctttg 3960 gatcgaaatg ttgtgaatct ccattacatt aacactgaag aacaagttgc agatggtctt 4020 acaaaagcat tacaacgaat caaacaagaa cgtaatcgac gatctatggg aattacacaa 4080 caatcggctt aaggaggag 4099 // ID Ag-I-2 repbase; DNA; ANG; 5557 BP. XX AC . XX DT 29-OCT-2010 (Rel. 15.1, Created) DT 29-OCT-2010 (Rel. 15.1, Last updated, Version 2) XX DE An I clade non-LTR retrotransposon family from Anopheles DE gambiae. XX KW I; Non-LTR Retrotransposon; Transposable Element; Ag-I-2. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-5557 RA Biedler J. and Tu Z.; RT "Non-LTR Retrotransposons in the African Malaria Mosquito, RT Anopheles gambiae: Unprecedented Diversity and Evidence of Recent RT Activity."; RL Molecular Biology and Evolution 20(11), 1811-1825 (2003). XX RN [2] RP 1-5557 RA Kojima K.K. and Jurka J.; RT "I clade non-LTR retrotransposons from Anopheles gambiae."; RL Direct Submission to Repbase Update (24-SEP-2010). XX DR [2] (Consensus) XX CC [2] Consensus update. This consensus is generated from 4 CC sequences with >99% identity. 
XX FH Key Location/Qualifiers FT CDS 308..1606 FT /product="Ag-I-2_1p" FT /translation="MGGPGTAWGEYPTTPPNIRGRQAPIWMDGEDRLGELL FT YLVLSPNEKSVLPKNPFIIEKSLTNHAGNIEPCKPIDKGSKYLVKTRSRAQ FT FEKLQSLTKLIDGTSVNIIAHPYLNSVQCVIVCNSLEGLSDDDILSEMKEQ FT KVVSVKRFLRKENGEIKPTNSFLLKIDAVQVPHEIRVGCLVVKTRTYYPRP FT MICTQCLIIGHTKNRCNHETCCANCGEKEHGVCSSPIKCKNCSGSHVSLSR FT SCPTYIDEQMIIKIKIDNSISYTEAKSVFENKRAHSVNERIQNAQQESVKD FT QIIADLKKKVAQLEEKMDIFLKANDKLQKDNDKLKHMYLQTIKPKTNTNTT FT SEPIDSSDNENTAPTRSHAHIVNLASNLNTQIDTDNGDNAYDKPIQKRKVD FT SYSSTDEQHSDKTIPNTDENSSEFVKPKTTRRPKNRKTD" FT CDS 1738..5316 FT /product="Ag-I-2_2p" FT /note="apurinic-like endonuclease, reverse FT transcriptase, and ribonuclease H." FT /translation="MNKNIDELKILINKHNPSFIALQETNHKPNTLHHSPL FT NNYTWRFHNKPTLQHSVCLGVRKNISHTQLNISSDLPLIAISTKVPFDITI FT VSAYFPNKMHPTNELSLYLTQTLNQINGPILLVGDLNASNTYWSNTRTNTR FT GKILHNLFDLNNLQVIKNSSPTRISPSDGTGSCIDHCVVSSCSTEQFSLVV FT EEDLHGSDHFPILIFDKHGDTRPKLRPRWKYNEADWSSYRFLIDRAIDGSA FT LLSVDSFCSIILMAAELSIPRTSGKIPKKSVPWWNDSVEKAITARRKALRK FT LRKASTISNNPNTSYLALKYQEANQIAKNEIEKAKKVSWEKFVTSIDPSLS FT CKEVWSMVKNLTGNSQFSFPTLLNNNEPITSPSCIAEHFAQHFYEASATSK FT YSATFIQRHKSKTISSISFVDTNNQSYNKPFSTEELIWALRKCRGNSAGID FT DIGYPLLKQLPERALHCLLEIYNNIWRTGEIPNNWKTSLIVPIPKAEKDSL FT SVDSYRPISLLCCMSKVLERLVNRRLIQELESRNLLSDKQHAYRIGRGTET FT YFACLENVLTEAKNKNYHTDCAVVDLSKAFDRTWRPAILEQLKKWGFGGSL FT PRFIKNFLTDRTCKVLIGSYTSNIKVLENGVPQGAILSPTLFLISMESLLQ FT TIPTNIDTFIYADDIILITSDKSETKNRTKLQKALNKLHHWCCLTGHDLST FT TKSKILHICNKSHKRTLKPIKIKYDIIPNARAVKILGVIIDSRLKFRQHLL FT YVKKMIKSRLNILHMLGCGNKRSARRTLLTIFNSWFVPKLLYGIEFITMGD FT ANFGCKLAPMYHSALKKITGAFITSPNTAVLCESGQLPLDHVVVLKIVNLA FT GRLIEKGIEANSFVDRANKVLKQLTGHSMPNIARLTSCSTRPWYDKIPTID FT WTLQKLSNHEQQIVQNYQMLIANKYKIYNKIYTDGSVYNSSAGCGITSPNA FT RCSIKLPENTSIFSAEAIAIMIATEEATVNDRPNVIFTDSASVLKALEKGS FT LRNPYIQTIESLSKDRSIEFCWIPGHKGIAGNEVADRLANEGRMIKDHVDG FT TISRSDASLWWKGVINQSWQNIWNKSPITNLKKIKQTTAPWNDPLDHSDQR FT VVTRLRIGHTRLTHTYKLKKESPLICPFCGCDTTVEHILINCFGYSAERQK FT HRLGDHLDIVLSNNYKEIEKLLNFLKETNLYKQV" XX SQ 
Sequence 5557 BP; 1996 A; 1151 C; 968 G; 1442 T; 0 other; cagttctgtg cgatactcgg agccgatcgg acgtgtttac gagagactct actgagggca 60 atacacagtg ttatcagcta tcgttagttt gaagcaacga cctccgttgc tgtttttttt 120 ttttgggtga gagaaagaga cactcacctg gaaactggaa gattgaagag tatagaattt 180 ctctcgctta aacagcgctg cctctcgtgg cgcgcacaag caagtgctga aacaggtttt 240 tttttctctt tctaattgat attgtaagta gaaacacttt atctaatttt tcaaacgatt 300 taacaccatg ggaggaccag gcactgcctg gggggaatat cccacaactc caccgaatat 360 tagaggtaga caagcaccta tatggatgga tggagaagat agactgggag aattattata 420 ccttgttctg agtcccaacg agaagtctgt tttaccgaaa aatccattca taattgaaaa 480 aagtctaacc aaccacgccg ggaatattga accctgcaag ccaatcgaca aaggttcgaa 540 atatctggtg aagaccagaa gccgagctca atttgaaaag cttcaaagct taacaaaact 600 catcgacggt accagtgtca atattatagc gcacccatac ttaaacagtg ttcagtgcgt 660 tatcgtatgt aacagcttag aaggtctttc agacgatgat atcctatctg aaatgaagga 720 acaaaaagta gtgtcagtga agcgtttttt gcgtaaggaa aatggtgaaa ttaaaccaac 780 taactctttc cttttaaaaa tagacgctgt tcaagtccca catgaaatac gggtaggatg 840 tttggttgta aaaacccgaa catattatcc tcgaccaatg atttgtacac aatgtctcat 900 cataggccac actaaaaata gatgcaatca tgaaacttgt tgcgcaaatt gcggtgaaaa 960 agaacacgga gtctgctctt ctcccatcaa atgcaaaaac tgttcgggta gccatgtttc 1020 acttagtaga tcctgcccaa cgtacataga cgaacaaatg ataatcaaaa ttaaaatcga 1080 caacagcatc tcttatacag aggccaaatc cgtctttgaa aacaaaagag cccattcggt 1140 aaacgagaga atacaaaacg cacaacaaga atctgtgaaa gatcaaatta tagcggattt 1200 gaaaaaaaaa gtagcccaac tcgaagaaaa aatggacatc tttttgaaag ctaatgacaa 1260 actgcaaaaa gataatgaca agctcaaaca catgtatctg caaacaatca aaccaaaaac 1320 taatactaac actacatccg aacccatcga cagctcagat aacgaaaata ctgcaccgac 1380 gcgatcccac gcacacatcg taaaccttgc aagcaacctt aacacgcaga tagatactga 1440 taacggagac aatgcctacg ataaaccgat acaaaaacgc aaagttgact catactcatc 1500 cacagacgaa caacacagcg ataaaactat accgaatacg gacgaaaaca gctctgaatt 1560 tgttaaaccg aaaacaacta gacgcccgaa aaatcgcaaa acggactaat ttgaaattta 1620 tatctgatca gcctttgata ataatgattt 
gctatttctt catatgactt tcaattcgat 1680 tcaaaaaata ataatataga tatttctcca tatattattt cttggaatat cagaggcatg 1740 aacaaaaata ttgacgaact taaaatatta atcaacaaac ataatccctc ttttatcgct 1800 cttcaagaaa caaaccataa gcctaatact ctacaccata gtccactgaa caattacacg 1860 tggagatttc ataacaaacc aacacttcaa cacagcgtgt gtctaggtgt tagaaagaac 1920 atttcgcaca cacagcttaa catctcatcc gaccttccac taatagctat ctctacaaaa 1980 gtaccatttg atattaccat agtttctgca tacttcccta ataaaatgca ccccacaaac 2040 gaactttctt tatacttaac acaaactctt aaccaaataa atggcccaat actacttgta 2100 ggcgatttaa atgcttctaa cacgtattgg agtaatacac gaactaacac tagaggaaaa 2160 attctccata acctttttga cctgaacaat ttacaagtaa taaaaaactc gtcacccact 2220 cgcatatctc caagcgacgg cacaggatca tgcattgatc attgcgttgt gtcgtcttgt 2280 tccactgaac aattcagctt ggtagtagag gaggatttgc acggtagtga tcacttcccc 2340 atcctcattt ttgataagca tggggatacg cggccaaagt tgcgaccacg ttggaaatat 2400 aatgaggctg attggtctag ttatcgtttc cttatcgaca gggctattga tgggtcagca 2460 ctgctttcgg ttgatagttt ttgcagtatt attcttatgg cagcagaact aagcatccca 2520 cgaacgtcag gcaaaattcc caaaaagtcg gtaccatggt ggaatgactc ggttgaaaaa 2580 gcaataacag cacgaagaaa agctctacgg aaattgagga aagccagcac tatcagtaac 2640 aacccgaata cttcatacct agctttgaag taccaagaag caaaccagat agccaaaaat 2700 gaaatagaaa aagcaaaaaa agtaagttgg gaaaaattcg tcacaagcat tgacccatct 2760 ctttcatgca aagaagtttg gagtatggtg aaaaatttga caggaaatag tcagttctca 2820 tttccaacac tgttgaacaa caacgaacca atcacttctc catcctgcat tgcagaacat 2880 tttgcacaac atttctatga ggcatctgca acttctaaat actcagctac cttcattcaa 2940 cgccacaaat ctaaaactat atcatccata tcatttgttg atacaaataa tcaatcgtat 3000 aacaaaccat tttccacaga agaactaatc tgggctttga gaaaatgtag aggaaactct 3060 gcgggtatag acgatatcgg ctatccactc ctcaagcagc ttcctgagcg tgcattacac 3120 tgccttctag aaatatataa caacatatgg aggacaggtg aaattccaaa taattggaaa 3180 actagtctga tcgtcccaat tccaaaggca gaaaaagaca gcttaagtgt agacagttat 3240 cgaccaatct ctttgctgtg ttgtatgagt aaagtcctgg aacgtttagt caatcgacga 3300 ctaattcagg agctagagag cagaaatctg ttaagtgaca 
aacagcacgc ttacagaatc 3360 ggccgtggta cagagactta ttttgcttgt ttagaaaacg ttctcactga ggctaaaaat 3420 aaaaattatc acactgattg tgcagttgtt gacctatcaa aggccttcga tcgtacctgg 3480 cgtccggcga tccttgagca attgaagaaa tggggatttg gaggaagttt accgcgcttt 3540 ataaaaaatt tcctaacaga ccggacctgt aaagtactga ttggatcata tacctctaac 3600 atcaaggttt tggagaacgg tgtccctcaa ggtgctatat tatcacctac gctttttcta 3660 ataagcatgg agtctttatt acaaacgata cctactaata tagacacatt catttatgct 3720 gatgacatta tactaattac ttcggacaaa tctgaaacaa aaaatagaac caaactacaa 3780 aaagcactca ataaacttca tcattggtgc tgcttgacag gtcatgatct atcaacaaca 3840 aaaagtaaaa ttctacatat atgcaataaa tcacacaaac gaacacttaa acctataaaa 3900 attaaatacg atataattcc aaacgcaaga gcagtcaaaa ttttaggagt tatcattgac 3960 tcaaggttaa aatttcgaca acaccttctc tacgttaaaa aaatgatcaa aagcagatta 4020 aatattctgc atatgttagg ttgtggaaat aagagatcgg caagacgcac acttctcact 4080 atttttaaca gttggtttgt tccaaaatta ttgtacggca ttgaattcat taccatgggc 4140 gatgccaact ttggctgcaa gctagcccca atgtaccact ctgctctgaa aaaaattaca 4200 ggtgctttta tcactagccc aaacaccgca gtactttgtg aaagtggaca attacctctc 4260 gaccatgttg tagtccttaa aatagtaaac ctagctggca gactcattga aaaaggaatc 4320 gaagcaaatt cttttgtaga tagagccaac aaggtgttaa aacaactaac aggtcacagc 4380 atgcccaaca tagctagact cacaagctgt tcgactagac cttggtacga caaaatcccc 4440 acaatcgact ggacattgca aaaactaagc aatcacgaac aacaaatagt acaaaactat 4500 caaatgctca tcgccaacaa atataaaata tataacaaaa tttatactga tggctcggtc 4560 tacaacagtt ctgcgggctg tggcataaca tcgccaaatg ctcgctgtag cattaagctt 4620 ccggagaaca catcaatatt ctcagcagag gctattgcta ttatgatagc aactgaagaa 4680 gcgaccgtaa atgacagacc taacgttatt ttcacagata gcgctagtgt tctgaaggca 4740 ttagagaaag gatccttacg aaacccctac atacaaacta tcgaatcact atcaaaagac 4800 aggtcaatcg aattctgttg gataccaggg cataaaggaa tcgcaggaaa cgaagtagct 4860 gaccgtcttg cgaatgaagg taggatgatc aaagatcatg tagatggaac tatctctaga 4920 agcgatgcat ccttgtggtg gaaaggggta ataaatcaga gctggcaaaa catttggaac 4980 aaatcaccca ttaccaacct taaaaaaata aaacaaacca cggcaccttg 
gaacgatcca 5040 ctagatcaca gtgatcaaag agtggtaaca aggttgagaa ttggacacac ccgacttaca 5100 catacataca aactaaaaaa ggaatcaccg ttaatctgcc cattttgtgg gtgtgataca 5160 acggtggaac acattttaat taattgtttt ggatattccg cagaaaggca gaaacacaga 5220 cttggagatc atttggatat tgtactttcc aacaattaca aggaaattga aaaacttttg 5280 aattttttaa aggaaaccaa tctctacaaa caagtatagc aaatatggat aaaaatattt 5340 ataaaaacat aattgaaatg aaaaaagcac tagaattaac aaacattttc aactttcacc 5400 agtaccataa taactcttgt aattttttct tctttttctt gaacttttaa tttattcttt 5460 tttttgtatc tcaaatagta ttagcgattt ctgaatttta aattacgaga ggcgaatgct 5520 aaaaagcctc gtaaaaaaat aaaacaacaa caacaac 5557 // ID AgaP2-P12MITE326 repbase; DNA; ANG; 326 BP. XX AC DQ301482; XX DT 22-AUG-2006 (Rel. 13.07, Created) DT 31-JUL-2008 (Rel. 13.07, Last updated, Version 1) XX DE Anopheles gambiae str. PEST clone AgaP2-P12MITE326 P MITE, DE complete sequence. XX KW P; DNA transposon; Transposable Element; Nonautonomous; KW AgaP2-P12MITE326. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-326 RA Quesneville H., Nouaud D. and Anxolabehere D.; RT "P elements and MITE relatives in the whole genome sequence of RT Anopheles gambiae."; RL BMC Genomics 7(1), 214-214 (2006). XX RN [2] RP 1-326 RA Quesneville H., Nouaud D. and Anxolabehere D.; RT "Direct Submission."; RL Direct Submission to Genbank (30-NOV-2004)Dynamique du Genome et RL Evolution, Institut Jacques Monod - CNRS - Universites Paris 6 RL Paris 7, 2 place Jussieu, Paris 75252, France. XX DR EMBL/GenBank/DDBJ; DQ301482; Positions 1 326. 
XX SQ Sequence 326 BP; 111 A; 63 C; 62 G; 90 T; 0 other; caaagtgggt agaaaggagg tgtcgtcata cagaacattt ggccatcaag gctttagaaa 60 aaagcacaaa atcaggatcg acggcagttg tcaaatgcga cgaatgttgc tggtatacgg 120 gtatgataaa tgaattgttt caatttaatt aatatttgta tagcactaaa aacatatatt 180 ttgcatcaac actttataaa tcggcatttt caacgtgata ttccgcctgc tgcctgtaaa 240 caaatatcga cggctgttgt caaactgaat ctcaaaatgg caacatgcca aacgctaaga 300 agacacctcc tctatattcc accttg 326 // ID AgaP4MITE675 repbase; DNA; ANG; 675 BP. XX AC DQ301484; XX DT 22-AUG-2006 (Rel. 13.07, Created) DT 31-JUL-2008 (Rel. 13.07, Last updated, Version 1) XX DE Anopheles gambiae str. PEST clone AgaP4MITE675 P MITE, complete DE sequence. XX KW P; DNA transposon; Transposable Element; Nonautonomous; KW AgaP4MITE675. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-675 RA Quesneville H., Nouaud D. and Anxolabehere D.; RT "P elements and MITE relatives in the whole genome sequence of RT Anopheles gambiae."; RL BMC Genomics 7(1), 214-214 (2006). XX RN [2] RP 1-675 RA Quesneville H., Nouaud D. and Anxolabehere D.; RT "Direct Submission."; RL Direct Submission to Genbank (30-NOV-2004)Dynamique du Genome et RL Evolution, Institut Jacques Monod - CNRS - Universites Paris 6 RL Paris 7, 2 place Jussieu, Paris 75252, France. XX DR EMBL/GenBank/DDBJ; DQ301484; Positions 1 675. 
XX SQ Sequence 675 BP; 202 A; 120 C; 135 G; 218 T; 0 other; caaggtcctt gaaagagaac aggtcgaagc gtttagagta aaatgacaga aagccatggg 60 aaaaattttg acagttaaag tgaacattgt ttatgtttag cgatggacgt tgctatggag 120 cgaatggaag ttttgttcac cattttgcac tcatttcctt ttagaatatt ataagaaata 180 gaatctaatt agaatattac atggctgatt gaatccatgg atctcattta tcattcgtcg 240 ccgggttata agtaaaggac ggtatgcata ttgtatcagt gtgtgtcgat tggatgtttc 300 attttactct cagtgtttca ttgaaagtaa ataggacttc tgttacctag ttggaacatg 360 ttaaactttt cgttttccct gcaggaccgt tgaaaaccgt tggtttgtgg tcacggaatg 420 gcaaacaatc aaaacgatta tcaacgtttc tgaacacaga gtagactgca attacgcctc 480 cttaattctc caggagtgat tggccacaag aaatttaacc aatataatct tcctaatagt 540 aaattcgtcc acttttccgc gttaactatt agccgtttgt ttgtaaacac tttttcatga 600 aactgtcaaa agcgacctga aaatggcaac ctagcaacgg gctttgcatc gacctgttct 660 ccttaataga tcttg 675 // ID GYPSY19-LTR_AG repbase; DNA; ANG; 136 BP. XX AC . XX DT 05-FEB-2004 (Rel. 9.01, Created) DT 05-FEB-2004 (Rel. 9.01, Last updated, Version 1) XX DE GYPSY19-LTR_AG is an LTR of retrotransposon GYPSY19_AG - a DE consensus sequence. XX KW Gypsy; LTR Retrotransposon; Transposable Element; 5-bp TSD; KW GYPSY19-I_AG; GYPSY19-LTR_AG; GYPSY19_AG; Gypsy clade; KW mag lineage. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-136 RA Tubio M.J., Costas C.J. and Naveira F.H.; RT "GYPSY19_AG, a member of the mag lineage of the Ty3/gypsy group RT of LTR retrotransposons in Anopheles gambiae."; RL Repbase Reports 4(1), 4-4 (2004). XX DR [1] (Consensus) XX CC GYPSY19-LTR_AG is a long terminal repeat of GYPSY19_AG (its CC internal CC portion is deposited as GYPSY19-I_AG). 
XX SQ Sequence 136 BP; 41 A; 32 C; 29 G; 34 T; 0 other; tgtcatatac gtctgacagc tgtccatcga cagaaactgc tagttgggta gcaaacagct 60 gtcagtgaga ataaacggca ctctgtgtct gaactacgaa cgaaacacat gtgtcttcct 120 tccacgagtt ataaca 136 // ID Mariner-N19_AG repbase; DNA; ANG; 725 BP. XX AC . XX DT 28-FEB-2009 (Rel. 14.02, Created) DT 01-MAR-2009 (Rel. 14.02, Last updated, Version 1) XX DE Nonautonomous Mariner DNA transposon - a consensus sequence. XX KW Mariner/Tc1; DNA transposon; Transposable Element; Nonautonomous; KW Mariner-N19_AG. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-725 RA Jurka J.; RT "Putative mariner/Tc1-like DNA transposons from African malaria RT mosquito."; RL Repbase Reports 9(2), 644-644 (2009). XX DR [1] (Consensus) XX CC TA TSD. XX SQ Sequence 725 BP; 190 A; 166 C; 188 G; 181 T; 0 other; tacagggtgt gcgggctaaa tttgacacct ggcctttggg tttatatctc agggttgagt 60 tgacgtatcg aaatgattga tatgtcattc gattgctaaa agttgtgcga acaagtgtaa 120 aaaatgtcta gctccgtaaa tatgtggaac ccgtccgaat tgtacaagcg cgtaacgtgt 180 gttatgatgg ttcgtgccgg ccatacgaat gccgaaattc ggaaggcggc gatgtgttcg 240 ctgaacaccg taaaaacaat acggtggcca atgggcggcc gtacgtttgg cagcaggatt 300 ctgcaccatg ccatacggcc gtaaaaacac gccaatggct cactgccaat ttcgaccgat 360 acaccagcac cgacgtttgg ccacccagct cccctgacct taatcctatg gattactttg 420 tgtggggcac agttgagcgg gacaccaaca gggcgtcctg caataccaaa gcggagctgg 480 tggccagaat aaaggccgtg tttgcggcca tccccagaga catggttgtc cgggcttgcg 540 cgcggttccg gaagcgggtg caggcggcca tcgacgcgga gggggggtac ttcgaataaa 600 ataaatgcgg caaacatggt aagctaaact ttgtagaatt ttttttttaa tatttaacat 660 aaatttgata aaaattcctt tcctattccc tcagaaactg tcaaatttat ctcgaacacc 720 ctgta 725 // ID P2_AG repbase; DNA; ANG; 4196 BP. XX AC . XX DT 13-DEC-2002 (Rel. 7.11, Created) DT 13-DEC-2002 (Rel. 
7.11, Last updated, Version 1) XX DE P2_AG is a P-like DNA transposon - a consensus sequence. XX KW P; DNA transposon; Transposable Element; P superfamily; P2_AG. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-4196 RA Kapitonov V.V. and Jurka J.; RT "P2_AG: a family of P-like DNA transposons from African malaria RT mosquito."; RL Repbase Reports 2(11), 22-21 (2002). XX DR [1] (Consensus) XX CC The A. gambiae genome harbors many divergent families of P-like CC DNA transposons. One of those families is P2_AG. CC P2_AG elements are flanked by 8-bp target site duplications. CC Terminal inverted repeats are 27 bp long. CC The P2_AG consensus sequence was reconstructed from CC several copies that are only ~2% divergent from each other. CC Presumably, P2_AG copies have multiplied in the genome CC during the last 1 million years. CC P2_AG encodes a P-like transposase that is slightly CC damaged by mutations accumulated in genomic copies of P2_AG. 
XX SQ Sequence 4196 BP; 1543 A; 707 C; 706 G; 1240 T; 0 other; caaagtgtat gaatatcagg tggtctcgtc ccatacaaat tgacaactaa aaacgaaatc 60 gaaaaaaaat tggttttgac aggacgggtg aaaacatatc catggaaact ttacggattg 120 ttagggataa aattcaaagt aggctctggt catatcgcca atgctattat tttgtgaact 180 tgtgaatgta aattttgtga accaagccga taaaaaatct tatcaattgc aataattatt 240 tactttgttt gtgtgattta tcatcattgt cactggtgtt ttttctttgt atttcactgc 300 cgttaataaa tgaaattgaa ttgaaagcgc gttaaatacg taagtttgaa gcttgccaaa 360 tacattcaaa gccatacgca aaataaaatt atatttattt tgttcatgga cctagattga 420 agatgccgaa taattcatgt gcagtttcat tttgtaaaaa cagcacatat cgtgctaaaa 480 cacaaaacct gaacatctcg tttcataggt ttccaaagca tgaagatttg cgtaatcaat 540 ggacattatt ttgccagaga gaagaagttt ggaagccaaa ggataatgat gtaatttgtt 600 catgccattt taccatcgaa aattatcaaa tggccaaatc acctattttg aaaaacagac 660 aaccattgag aatcttgact ccaaatggta aaatcgatta gcccaactac ttttaccttt 720 tttttttttg ttaattttgt aacttactgc taaattaatg taaattacag caattccatc 780 cttagtttgt agaaatcctc atactattcc tgaacaggag caaatagaca gagtagatga 840 acatcaacga atcaacgagc tgtatgaaag aaacgatagc aaaagtgctt cagattttgt 900 cgatcatgaa gcaagtgttt taattagtga tcaagaaaaa attcagaatc ttgaagctgt 960 gatagcaaca ctgaccgaag agattgattt ctgtagaaaa ctaagaaagg acaattcatt 1020 cttattaaaa cagaaaaaaa tactattaca acagaacaaa aaactgaaac ttagaaatac 1080 cgttttgaaa aataggataa gagtgcttga acaaactact atccaacatc atgaaatagt 1140 accgttgatg aaaaaaaaaa attgagtaag acactttctg gaaatcaaat cgatttgatc 1200 ctgaaacgca aaaagcgagt aaggtggaca aaaaaggagt taggagctgc tttaacgtta 1260 cggtattttg ggaaaaggcc ttacatgtat atgacagacg acatgcattt cccccttccg 1320 gcaagagcaa ccattcaacg gtatataaaa tcaattaata tcaagcaagg aattcttgaa 1380 gatgttttga atttgatgga atcttacgcg aaaacattaa ttccaagaga tcgtgaatgc 1440 gttattagtt tcgatgaaat gaaggtgaca catatattag aatacgatgt tggtgccgat 1500 gaagtattag gtcctcataa ctatcttcag gttgtcatgg cccgaggtat ttttaacaaa 1560 tggaagcaac ctatttttat tggttttgac caaaagatga cgaaagacat cattttatcc 1620 ttaattaaaa aattaagtga acgcatgata 
aacgtggttg caattgtcag cgacaattgt 1680 caaacaaata tttcatgttg gagagatttg ggagctcacg atataaccaa accatacttc 1740 aatcatccga ttaccaacaa aaatgtttat gtatttcccg acgcaccaca tttgattaaa 1800 ttactacgaa attggctaat tgatacagga tttgattata aaggaagcat aataacggca 1860 gacccattac gaaaactagt tgaagaacgc aaggatgctg aaattacacc gttatttaaa 1920 ttaaatgaaa atcatttaac aatgtcacca caagaacggc aaaacgtgcg acgagcggtc 1980 caattacttt cgcatactac agccactgct ttgaggcgat atcaaagtga tgatgcttct 2040 cagactttgg cagaatttat tgaaacagta gactgctggt ttagcatatc aaactcttat 2100 tctccttggg ctaaaatagc gtacaaaaga tcgtacgcgg gaaaagaaga acaagaaaag 2160 tcgttagata aaatgtacga acttatatct aacatgactg ccttaaacag aacaagcatg 2220 cagacctttc agaaatcgat tctgatgcac attacttctc ttaaaatgct gtataaagat 2280 atgcaaaaaa aaacatcaag tagattttat ttcaacgcat aaggtaattt tgaactaaat 2340 atccattatg atttgtaaaa ctgatttttt aaaccttttt ttatttgaca gttgaatcaa 2400 gatatattag aaaacttttt ttcccaattg cgacaaaaag gtggggtgta cgatcaccca 2460 tcaccattga gttgtctgca tagaattaga atgatagtta tcggaaaatc tccaacgata 2520 cttattaatc aaaccaatct ggagacaaac cgaaacacga gattagggga aaatatctgc 2580 ttaatggagc agcatgaaaa cgcaaacgaa gattgtttgt ctgaacaaaa tcaagatttt 2640 atgtctgtaa ctctatttac tgaagcagac gtaattccag atttaccaga caatgcatta 2700 aatgcacatg aaatggatga tgaaagtgaa atccttagta ccatcagttc caccattcca 2760 gatctaccag aacaagatag agatggattt taatatatta tggggtactt aggaaaaaaa 2820 tttcacaaaa aacacccgca cattgatcaa ggaacatatt catttaaaat aaccagcgat 2880 cataactaca gcaaaccacc atcatttgtg aagcacttat ccacgggagg attgtttgtt 2940 ccatctcatt cttttttgaa attgggatcc aaaatggaaa aaatcttcca aaagctgcat 3000 cccgatggta ctctggacaa aaagccaggg atagtaaaaa caggtgtgaa tacaattaaa 3060 aaaaaaatac cgagtttacc cgtagacatg attaaatgct ttatgatgct acgtgtgaaa 3120 ttaagaatta aacacacgaa tcttaaagcc gcaaatgaaa agctattgag atgtaagcga 3180 aaggggccaa aagagcataa agcagctaaa aaaatgaaaa aaatagtcaa ttaaagaata 3240 attttgaaat ttaatatgta ttgttgtttt actttgataa aattatactg atgtacaacg 3300 acaacacaac gacttgggag acgatggcgc gatatcgtaa 
atacttttag acacttccgc 3360 aatagcccaa acaatcatag cggttatgat tgtagcggtt aataagtaat gactgacatc 3420 aacgttatat aacgcaatat tatttaattt aaaaacttag aactttttta caaacacaat 3480 taattcgtag ttaaactatg aatcatgaaa acaattgaaa actgaaactg atgaaaacta 3540 agtaatgatt tttttctggt tcaaaatttc tcaaaaaacc tttccgctaa aaacttcaga 3600 aaggtctact gtcccgtgac cgctaagaat ttccacacaa tttgtcttta cttctttatg 3660 tgtctccgtc aaaaataaaa tattcgatgt tccatggcgt aaagatgagt tatatatata 3720 tatttttgtg ttgataaatg gtccaaaata ttcgcttgtc ttcacagctc gataatggat 3780 attcgttagc ttttcgcttc tcagagaaaa cggccaacga ccaacgtatc tacctataca 3840 aagggaaaaa tagaaaataa cagcaagtat gagaaacaca taacaacggt tatgtgaaga 3900 caataatgat tgcactccct taataatcac cacaaacaca ctgttttcat taatatcggt 3960 actcttcttt tcctaattta ctttttcaat tctaaatcat cttactttca acaaaataaa 4020 atgagatttc caacagtttt gttccaatcg ctagacaccg taatattgtt ttgcttttca 4080 tcaatacgtg atgtgtgtgt gtgtgagata tgttttcacc cgtcctgtca aactttgacg 4140 ttcaactttt gctataaaaa acaggctatg agaccacctg atattcatac actttg 4196 // ID GYPSY7-I_AG repbase; DNA; ANG; 6534 BP. XX AC . XX DT 16-JUN-2003 (Rel. 8.05, Created) DT 20-SEP-2005 (Rel. 10.1, Last updated, Version 2) XX DE GYPSY7-I_AG is an internal portion of the GYPSY7_AG LTR DE retrotransposon - a consensus sequence. XX KW Gypsy; LTR Retrotransposon; Transposable Element; 4-bp TSD; KW GYPSY7-I_AG; GYPSY7-LTR_AG; GYPSY7_AG; Gypsy clade; Gypsy group; KW env; gag; integrase; protease; reverse transcriptase; KW GYPSY46-I_AG; GYPSY46-LTR_AG. XX NM GYPSY7-I_AG. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-6534 RA Kapitonov V.V. and Jurka J.; RT "GYPSY7_AG, a family of LTR retrotransposons from African malaria RT mosquito."; RL Repbase Reports 3(5), 87-87 (2003). XX DR [1] (Consensus) XX CC GYPSY7_AG is a young family of Gypsy-like LTR retrotransposons. 
CC GYPSY7_AG belongs to the Gypsy group of the Gypsy superfamily. CC GYPSY7-I_AG, an internal portion of GYPSY7_AG, is flanked by CC GYPSY7-LTR_AG LTRs. The GYPSY7-I_AG consensus sequence was CC reconstructed based on multiple alignment of 5 copies; they are CC ~0.4% divergent from the consensus sequence. CC The consensus sequence encodes the Gypsy7_AG1p 434-aa gag-like CC protein (pos. 507-1808), the 1061-aa Gypsy7_AG2p pol-like protein CC (pos. 1756-4939), composed of the protease, reverse transcriptase CC and integrase domains, and the Gypsy7_AG3p env protein (pos.4962- CC 6506). XX FH Key Location/Qualifiers FT CDS 507..1808 FT /product="GYPSY7_AG1p" FT /translation="MEALAGRIAALEARFSESNVTDDFQDPPLFFTKQDGS FT AVDPESFEKIPGVVKDLPIFCGDPSELNSWINDVDGIIRLYQTISSHSLEK FT QNKFHMICKFIRRKIRGEANDALVASNVGINWNMMRKTLITYYGEKRDLET FT LDFQLMSVYQKGRTLEVYYDEVNRLLSLIANQIQTDDRFNHPEASKAMIGT FT YNKKAIDAFIRGLDGDVYKFIRNYEPTSLAAAYSYCISFQNLECRKMLTKP FT KHFNTPPSAPRNQIPLPTPHLPPRVFQHQQRPMTANNVRPHFAHHPPIQNF FT AGNFTQRPVWNQPNQQRPIFQRTNFNQPNQMKNFTQQRNNFRQNGPEPMEI FT DPSIRSHQVNYANRPNSSNIRPLKRQRAFNIEAVPRRELEPTSYEDNLYDD FT DVESQASYERYMRNVEKQEKLNENSHYDEISREAELNFLG" FT CDS 1757..4939 FT /product="GYPSY7_AG2p" FT /translation="KFSLRRNFSRSRIKFFRLKSALPYFIYHGKAGQQIKI FT LIDTGSNKNFINPLHAKISHDVIKPFFVSSVGGDLLITKYSQAQIFAPYSD FT VNVKFYHLQGLKSFDAIIGYDTIKEMGAFVDAKRDNLVLENFIIPLSLHPL FT QEVNRIEIRDTHLNHQEKEKLHLFLNKFQDLFQPPDEKLPFTTKVEATIAT FT NDTEPIYCKSYPYPLSLKQEVETQIKKLLNNGIIRPSRSPYNSPVWIVPKK FT VDASNEKKYRLVIDYRKINLKTKSDRYPIPDTSTVLANLGNNKYFTTLDLA FT SGFHQIRLAEKDIEKTAFSINNGKYEFLRLPFGLKNAPSIFQRVMDDVLRE FT HIGKICHVYIDDIIVFGKTFDEHLKNLEIVLNTLREANFKIQPDKSEFLRT FT EVEFLGFIVSEYGLKPNEKKIESILKYPEPQTIRELRSFLGLSGYYRRFVK FT NYAALAKPLTKLLRGEDGQGHCKITKNQSKNFPIKLDDDAKRAFKTLKEVL FT SSDDVLAYPDFDHDFILTTDASDKAIGAVLSQNVNGVEKPITFISRTLSKT FT EENYATNEKEMLAIVWALHSLRNYIYGAKIIILTDHQPLTYAMSPKNNNAK FT LKRWKAFIEEHNYELSYKPGKTNVVADALSRIQINSLTPTQHSAEEDDLSF FT IPSTEAPINVFRNQLIFQKGTISSYEFVNPFPKFKRHTFIEPQFSIDFIKD FT 
KLKRFMIPGIINGIFTDEPTMGIIQETFKNLFNISTMKARFSQTQVQDICD FT QEQQIEEIRKIHNFAHRNAKENSLQAIKKFYFPSMRNKIEQYVKNCETCKV FT EKYERRPPEYIPVKTPIPKYPGEIVHVDIFAYNANFLFISSMDKFSKYLKL FT KPIKSKSIADVKEVLLQLLYDWNLPRQIIFDNECTFVSNVIEQSILNLGVS FT IFKTPVNRSESNGQVERCHSTIREIARCTKGLNPDMSLITLIQQAVYKYNN FT TIHSFTKETPRKVYIGEQSEELSFRDRSKLKEKIESKIIKIFEEKNEKIKD FT DKYQDYEPNQFAYEKNKTMNKRDSRYKTVVVKENHPTYIIDSNNRKIHKIN FT LRKN" FT CDS 4962..6506 FT /product="GYPSY7_AG3p" FT /translation="LIFYYRIAFFTLYGVLQASINIFDLTNNPLAIVPLGQ FT AKIRIGYLRTIHPIDLTELEEIISRVFENSTNSTGKSPLQSLINLKLEKLN FT ATISKIRPRRLRTKRWNSIGTAWKWIAGSPDAEDLTIINTTLNSLILQNNE FT QLLINNGLSRRFQETTNIANHVIDLQNRIQREHQTEIQQIIKIANLDALQA FT HIKTLQEAILAAKHGIPNSELLSIEDLNTVAEFLAQNGIYYTSVEEMLTQA FT TAQVTMNSTHVIFMLKFPRLSYETYEYNYIDSIIQNDKRILIKHNYIIRNL FT THMFELPQPCIDQSSHQLCESKDLEEPSRCIRQLVQGEHTECMYEKVYSTG FT LVKHINNANILLNDATAEISSNCSNINHILNGSYLIQFHNCNIFINGELFP FT STEVSITGKPYISTLGLIAKEDGIRDEPSIEHLRNITLQHREKLHTISLVN FT NSLTWKLHIFGSIGLTTIVLITIAILYFITSIRRTKISLNIPTNNTNRQDV FT HHIETFVKKPTTFHALGRL" XX SQ Sequence 6534 BP; 2503 A; 1267 C; 1070 G; 1694 T; 0 other; ggcgcccgaa tagggacctc tagtgaagtg aaaatcaagt gacgagtagt gatcgcgaaa 60 gagcgtaatc acaccgtgct tagaacatct cccgatcgca gattgcagcc gtgtgtatga 120 gcacctaagt agccggtagc agccaataac tcgttgccgg aggaaaggga caacgccgac 180 acaccgagaa ggaaaaattc accggaggcc cgagcaacag aaaccaaccc cgtcagagga 240 ggcgaagcaa ccagccccgt aagaagggtg aagcatgaat ttattaatgt aagtaaaact 300 ttcaatcata gtaagtataa acatttatta aagcgacaaa aaatattaaa acattataat 360 cagtgttaaa gtgagtgatt agtgagtgaa aagaaaatca ttatcacaaa agttaagtca 420 aactggagga attagttcaa tcattacgac aattctcagt cacagtaaaa tctatagata 480 acgaagaaaa actaacaaac aaagatatgg aggcactagc tggcagaatt gcagctttag 540 aagcacgttt tagtgaaagc aatgttacag atgattttca agacccacca ctttttttta 600 caaaacaaga tggtagtgca gtagatcccg aatctttcga aaaaatccct ggagttgtaa 660 aagatctccc aattttctgc ggtgacccaa gtgaacttaa tagctggatc aatgacgtag 720 atgggataat ccgactatac caaactatat ctagccatag tttggaaaag caaaataaat 780 tccacatgat 
ttgtaaattc atacgtagaa aaattagagg tgaagccaac gatgctttag 840 tagcatctaa cgtagggata aattggaata tgatgagaaa aactctcata acttattatg 900 gagagaagcg agatttggaa actctcgatt ttcaacttat gagtgtctac caaaaaggtc 960 gaactttgga agtttattac gacgaggtta atagacttct ttcacttatt gcaaatcaga 1020 tacagacaga cgatagattt aaccatccgg aagcttcgaa agctatgatt ggaacataca 1080 acaagaaagc gatcgatgct tttatcagag gtctcgatgg ggacgtttat aaatttattc 1140 gtaactacga accaacatcc ttagcagcag cctacagcta ttgcatttct tttcaaaacc 1200 tagagtgccg taagatgcta acaaaaccaa aacattttaa cacacccccg tcagccccca 1260 gaaaccaaat accattgccc acacctcatc taccaccaag agtgttccaa caccaacaaa 1320 gaccaatgac agcgaacaac gtaagacctc attttgcgca ccacccaccg attcaaaatt 1380 ttgcaggaaa ttttacacaa cgtcctgttt ggaatcaacc aaatcagcaa agaccaattt 1440 ttcagcgcac aaattttaat caaccaaatc agatgaaaaa ttttacacag cagagaaaca 1500 attttcgcca aaatggacct gaaccgatgg aaatagaccc atcaattagg tcacatcaag 1560 ttaattatgc gaacaggccg aactcctcaa acattcgtcc attgaaaaga caaagagctt 1620 tcaatattga agcagttccg cgacgtgaat tagaaccgac ttcatatgaa gataatctct 1680 acgatgatga tgtcgaaagt caggcgtcat acgaacgata tatgagaaat gtagaaaagc 1740 aagaaaaact aaatgaaaat tctcattacg acgaaatttc tcgcgaagca gaattaaatt 1800 ttttaggtta aaatcagctt taccatattt tatataccat ggtaaggcag gtcaacaaat 1860 taaaattcta atcgacactg gatctaataa aaatttcatc aaccctttac atgcgaaaat 1920 ttctcacgac gttataaaac cattttttgt atcatctgtg ggaggagatt tactcatcac 1980 aaaatattca caagctcaaa tatttgcccc ttattccgat gtaaatgtca aattttatca 2040 tttgcaggga ctaaaatcat ttgacgccat aataggttat gataccatca aagaaatggg 2100 agcatttgta gacgctaaaa gagacaatct agttcttgaa aattttataa tacctctctc 2160 acttcatcca ttacaggaag ttaacagaat tgaaataaga gacacacatc ttaaccacca 2220 agaaaaagaa aaattacatt tatttcttaa caagtttcaa gatttattcc agccacccga 2280 cgaaaagttg ccctttacaa caaaggtaga agcaaccata gccacgaatg atacggaacc 2340 aatttactgt aagtcatacc catacccttt gtccctcaaa caggaagtgg aaacacagat 2400 aaaaaaatta ttaaataatg gtataattcg accatctagg tcaccatata attcacctgt 2460 gtggatagtt cccaaaaagg 
ttgacgcatc taacgaaaaa aaatatcgac ttgtgatcga 2520 ttacagaaaa ataaacctga aaactaaaag cgatagatat cccattcccg atacttcaac 2580 agtacttgcc aatctaggaa ataataaata ttttacaaca ctcgatctag catcgggatt 2640 tcaccagatt cgtttagcag aaaaagatat cgaaaaaacc gccttttcca tcaataatgg 2700 aaaatacgaa tttttaagat tacctttcgg tctgaaaaat gcaccttcga tttttcagag 2760 agtcatggac gatgttctta gagaacatat tggaaaaatt tgtcacgtat acatagacga 2820 tataatagtc tttggaaaaa cattcgacga acatctgaaa aacttggaaa ttgttttgaa 2880 tacattacga gaagccaatt ttaaaataca gccagacaaa tcagagtttt taagaacaga 2940 agttgaattc ttaggattca ttgtttcaga atatggcttg aaaccaaatg agaaaaagat 3000 agaaagtatc ttaaaatacc ccgaacctca aactattcga gaacttagat catttttagg 3060 actgtctgga tattacagaa gatttgttaa aaattatgca gctttagcaa aacctttaac 3120 aaaactttta agaggggagg atggccaagg ccactgcaaa attacaaaaa atcaatctaa 3180 aaattttccg ataaaattag atgatgatgc caaacgcgct ttcaaaactc ttaaggaagt 3240 tttatcatcc gatgatgttt tagcataccc cgattttgat catgatttta ttttaactac 3300 cgacgcttct gacaaagcaa tcggagctgt tctttcccag aacgttaatg gtgttgaaaa 3360 accaataaca ttcatatcta gaacactatc aaaaacagaa gagaattatg ctacaaacga 3420 aaaagaaatg cttgctatag tttgggcttt acattctcta cgtaattaca tttacggtgc 3480 aaaaataata atattaacag atcaccaacc tttaacatat gcaatgtcac caaaaaataa 3540 caatgcaaaa ttaaagcgat ggaaagcatt catagaggaa cataactatg agctaagtta 3600 caaacctgga aaaaccaacg tggtagctga tgctctttca cgcatacaaa ttaactcact 3660 aactcctaca caacactctg ccgaagaaga tgatctttct tttatccctt ctaccgaagc 3720 tccaattaat gttttccgaa accaattaat ttttcaaaaa ggtactatta gttcctacga 3780 gtttgtaaac ccctttccta agtttaaaag gcacactttc atagaaccac aattttcaat 3840 cgattttata aaagacaaac ttaagagatt catgatacct ggtataataa atggcatatt 3900 cactgatgag ccaactatgg ggatcattca agaaaccttt aaaaatctat tcaatatatc 3960 aaccatgaaa gcaagatttt cacaaactca agttcaagac atttgtgatc aagaacaaca 4020 gatagaagaa attcgtaaaa tacataactt tgcccataga aacgctaagg aaaattcatt 4080 acaagctata aaaaaatttt atttcccttc catgagaaac aagatagaac aatatgttaa 4140 aaactgcgaa acttgcaaag tagaaaaata 
cgaaagaaga ccccctgaat acataccagt 4200 taaaacacca atcccaaaat atccaggaga aattgttcat gttgatatat ttgcgtataa 4260 tgcaaatttt ttattcatct cgtcaatgga caaattttcg aaatatttga aattaaaacc 4320 aattaaatca aaatccatag cagacgttaa ggaagtactg ctacaattat tatacgattg 4380 gaatttgcct agacaaatta tatttgataa cgaatgtaca tttgtatcga acgtcataga 4440 gcagtccata ctaaatttag gtgtatcaat ttttaagaca ccagtgaata gatcagagtc 4500 aaatggacaa gtggaacgtt gtcactccac gatcagagaa atcgcaagat gtacaaaagg 4560 tttgaatcca gacatgagct taattacctt aatacaacaa gccgtgtata agtataataa 4620 tactattcat tcttttacaa aagagactcc cagaaaagta tatattggag agcaatcaga 4680 agaactttca tttagagata gatcaaaatt aaaagaaaaa attgagagta aaattataaa 4740 aatatttgaa gaaaaaaatg aaaagattaa agatgataag taccaagatt acgaaccgaa 4800 tcaatttgcg tatgaaaaaa ataaaactat gaataagcgt gacagtcgtt ataagacagt 4860 ggtagttaag gaaaatcatc caacgtatat aatagattcg aacaatcgaa aaatccacaa 4920 aataaattta agaaaaaatt aatgatataa ttatgattta attaattttc tattacagaa 4980 tcgcgttttt cacactatat ggtgttctac aagctagcat aaacattttc gacttaacaa 5040 acaacccatt ggctattgtt ccgttaggac aagcaaaaat taggatcgga tacttgagga 5100 cgattcatcc aattgatctt accgagctag aagagataat ttctcgagtt tttgaaaata 5160 gcacaaacag tacaggaaaa tccccattgc aaagtttaat taatttgaag ctcgaaaaac 5220 ttaacgccac aatttctaag attaggccac gtagacttcg aacgaaaaga tggaacagta 5280 taggtaccgc ctggaagtgg atagctggca gtcccgacgc agaagatcta acgataatca 5340 acaccaccct gaattcgctc atcctacaaa acaacgagca gctattaatc aataatggtc 5400 tcagcagaag attccaagaa acaaccaata ttgctaatca tgttatcgac cttcagaata 5460 ggatccaaag ggaacatcaa actgagatac aacagatcat taagatagca aacctagacg 5520 cattacaagc ccatataaaa acactccaag aagccatact agccgctaag catgggatac 5580 cgaatagcga gctactatca atagaagact taaacaccgt tgcagaattt ctggcacaaa 5640 atggcattta ctatacatca gttgaagaaa tgttaacaca agccacagca caagttacca 5700 tgaattcaac acacgtgata tttatgctaa agtttccacg tctatcctat gaaacttatg 5760 agtacaacta tatcgactct atcatacaaa atgataagag aatcttaatc aagcataact 5820 acataatccg gaatctaacc catatgttcg aattaccgca 
gccctgtatc gatcagagca 5880 gccaccagct ttgcgaaagt aaagatctgg aagagccttc acgctgcata cgacaactcg 5940 tacaagggga gcatacagaa tgtatgtacg aaaaggtgta ttcaacggga ttagttaaac 6000 acattaacaa tgcgaatatt ctattgaatg atgccactgc cgaaatttca tccaactgca 6060 gcaatataaa ccacattctt aatggatcat atctcataca atttcacaac tgcaatatct 6120 ttattaacgg agaactcttt cccagcaccg aagtttcgat aaccggtaaa ccatatatat 6180 caacccttgg cctcatcgct aaagaagacg gcatcagaga cgaaccttca attgaacatc 6240 ttcgaaacat aacattgcag cacagagaga aactacatac catcagcctg gttaataatt 6300 ccctcacatg gaaacttcat atctttgggt caattgggct aacgacaatt gttctgataa 6360 caatagcaat tttatatttc attaccagta taagaagaac gaaaataagc ctcaacattc 6420 caacgaacaa caccaaccga caggatgtcc accacataga aaccttcgtg aaaaaaccca 6480 caacattcca tgctctcggc agactttgag ggcaaagtca tctaagaagg gagg 6534 // ID BEL8-I_AG repbase; DNA; ANG; 5419 BP. XX AC . XX DT 03-APR-2003 (Rel. 8.03, Created) DT 03-APR-2003 (Rel. 8.03, Last updated, Version 1) XX DE BEL8-I_AG is an internal portion of the BEL8_AG LTR DE retrotransposon - a consensus sequence. XX KW BEL; LTR Retrotransposon; Transposable Element; 5-bp TSD; KW BEL8-I_AG; BEL8-LTR_AG; BEL8_AG; Bel clade; RING finger; KW integrase; peptidase; reverse transcriptase. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-5419 RA Kapitonov V.V., Pavlicek A. and Jurka J.; RT "BEL8_AG, a family of Bel/Pao-like LTR retrotransposons from RT African malaria mosquito."; RL Repbase Reports 3(3), 45-45 (2003). XX DR [1] (Consensus) XX CC BEL8_AG is a young family of Bel/Pao-like LTR retrotransposons. CC BEL8-I_AG, an internal portion of BEL8_AG, is flanked by CC BEL8-LTR_AG CC LTRs. The BEL8-I_AG consensus sequence was reconstructed based on CC multiple alignment of 12 copies; they are less than 1% divergent CC from CC the consensus sequence. 
CC The consensus sequence encodes one 1742-aa BEL8_AGp Bel-like CC protein. CC BEL8_AGp is composed of the putative peptidase (pos. 125-261), CC RING CC Zn-finger (pos. 320-389), reverse transcriptase and integrase CC (pos. CC 1427-1596) domains. CC BEL8_AGp CC MPAQPLILSSRRTILIAILARYEEFLQNYQPDRDSIEVETRMAKFDQICKDLENLQQQLEDSATTAEEMT CC HNAALREDFERRLIRVQSALKAKYREIPRSEVSQQGAGRNPLQGIKLPTIALPEFDGDYMQWLTFRDTFE CC CLIHDNVDLPPIQKFHYLRAALKGEAAQVIEAITISASSYELAWKTLAERYSNEYLLKKRHLQAMFGITP CC AKRESATTLHQLVDEFERHKKTLNHLGEKTDGWSSILEHLLCTKLPSNTLRDWEEFASTNDNPSYDSLIA CC FLHRRMRVLETLLVNKPEPSPIETPIPPRRTIFPRTASFATTDRDVNKCPLCNMPHTITKCQRFNAMNPA CC QRYRKVLDARLCLNCLRDNHRARDCSSQYKCRHCNLAHHTMIHTESTPSTSSTTFPMLAAQDEPSHTTHA CC TDDHTASIQRSYAAAIKQSPSQILLQTALLNVTDAHGILHPVRALLDSASQPNLMSNRLAQRLALKGSTV CC NITLKGAGLSTRTVRRSVRAQIASRVEHFDLDVDFLIVDKVIADLPAHDVSTRGWNIPSEFVLADPQFDK CC SAPIDLILGARHYASFFTNVKSHELAPNLPTMLNSVFGWVMIGPTSPQNPASPTDCTAASTIVCMASLEE CC SLERFWKLExLSVNDSYSPDERRCETLYKETTQRDESGRYIVRLPKQTDFTEKLGLSKTTALRRFELLER CC RLERNPQLKEDYHAFMKEYLELGHMSLMNKDSGDERAYYLPHHPVFKASSTTTKVRVVFDGSAKTSTGYS CC LNDILCVGPIVQDELLDIVLRFRTYQIALVGDIAKMYRQILLHSDDRRLVRIFFRFSPQAPIQVYELNTV CC TYGLAPSSFLATRTLIQLADDEGTEYALAPAALKRNFYVDDFIGGANNVREAVQLRKELSALLAKGGFEL CC RKWTSNNLSVLSGLSTEYIGTHSSLHFIPNETVKALGISWKPESDELCFESNTEADEATSTKRSILSSIA CC KMYDPLGLIAPVIVRAKMLMQELWLLKSGWDEPVPNHICKKWKAIQSDWKTLSEYRTNRYALLPDATVEF CC HTFTDASEAAYGACVYARCENAAGEVRISLLASKSRVAPLKRVTLPRLELSAAVLGAHLHHRVKEAMQIV CC CAESFFWSDSTVTLKWIASPPNSWKTFVANRVAEVQHYSHPRQWRHVPGTSNPADLVSRGMSAAHFTQNQ CC LWNNGPDWLVQPSSHWPSSDPEPSDEADLETRQVSAALVCTQTHPWFGISSSFTRMVRIIAYCIRFVRNT CC KQKARSQRPIPHTNASKTITPKYVDAAKTVLCRLAQQDAFSAEIKQLKKGEALMKQSPLRKLTPFLDTEE CC VIRVGGRLNLSQLPYQSKHPAVLPKNHKFTRLLAEDYHEEMKHASGRLLLSRIRELYWPLDGRRLVKSIA CC RNCFRCIRQDPALARQPVGQLPPSRITPSRPFSVTGVDYAGPFYLKPAHRKAAATKSYLCVFVCFATKAV CC HLELVGDLTTAGFLAALRRFTSRRGLPAHIHSDNGKNFEGAERELKELFELFNDEQHRNTVATRCADRGI CC TWHFNPPKAPHFGGLWEAAVKTAKRHLYRHLGNTRLSYEGYCTVLHQIEAAMNSRPLLPLSDDPNELAAL CC 
TPAHFLIGTSMFAVPEPDYTQLKSCTLDDLQKWQLLVQRFWKHWATEYLQEMQKCYASGGSNNSNILPGR CC LVILMDESLPTTRWPLARIVKIHPGEDKIVRVVTLKTAKGIITRPITKICVLPLRTDSENHV. XX SQ Sequence 5419 BP; 1453 A; 1419 C; 1277 G; 1265 T; 5 other; ttggtgccgt gaccaggatt gtgccgatak cctgaatatt cgcaattttt tcgataaagt 60 gctaaacacc tttgcctttc actgcccgac cgttacactg cactttacac tacactcgcg 120 tacacgataa tcgcagtttt agtaagatgc cagctcaacc gcttattttg tcgtcgagac 180 gaacgatttt gattgccatt ttggcccgtt acgaagagtt tcttcaaaac taccagcccg 240 atagggattc tatcgaggtg gaaactcgta tggccaaatt tgaccaaata tgcaaggatt 300 tggagaacct tcaacagcag ctggaggaca gtgcaaccac tgctgaagag atgacacaca 360 atgccgccct tagggaggat tttgaacggc gcttgattcg cgttcaatcg gcattgaaag 420 cgaaatatag agaaattccg cgcagcgaag tgtcgcagca aggagccggt aggaacccgc 480 tacaaggcat aaaactgccc accatcgctc tgccggagtt tgacggcgat tatatgcaat 540 ggctgacctt tagggatacc tttgagtgtt tgattcacga taatgttgat ttgccgccaa 600 tacaaaaatt ccattatcta cgcgcagccc taaaaggtga agcagcgcaa gtgatcgaag 660 ccattacaat aagcgcctcc agctatgaat tagcttggaa aacactcgct gagcgatatt 720 cgaatgaata tctgctgaaa aaacgccact tgcaagcgat gttcggcatc acaccagcga 780 agagggaaag tgcgacaacc ctgcaccagt tagtggacga gttcgagcgt cacaaaaaaa 840 ccctaaatca cttgggagag aaaactgacg gctggagcag catattagag catttacttt 900 gcacaaaatt gccctctaac acattgcgcg attgggaaga atttgcttcg accaatgaca 960 atccgagcta cgattcgttg attgcctttt tgcaccgccg tatgcgcgta ctcgagacgc 1020 tattagtaaa caaacccgaa ccatcaccta tagaaacacc gataccacca agacgcacca 1080 tttttccgcg cactgctagt tttgctacca ctgaccgcga tgttaataaa tgcccgttgt 1140 gcaatatgcc gcacaccata acaaaatgcc agcgatttaa tgccatgaat cctgcccagc 1200 ggtaccgtaa ggtacttgat gcccgcttgt gtttgaattg tctgcgagac aatcaccgtg 1260 cccgcgattg ctcgtcacag tacaagtgtc gtcattgcaa cttggcgcat cacacaatga 1320 ttcacactga aagcactccc agcacatctt ccactacatt tccaatgcta gctgcgcaag 1380 atgaaccttc acacacaaca cacgccactg atgatcacac ggctagcata caacgcagct 1440 acgcagctgc aataaaacaa tcaccttcac aaatattatt acaaactgca cttctaaatg 1500 taaccgatgc acacggcatc ctgcatcctg 
tgcgtgcact cttagacagc gcatcacagc 1560 ccaatttgat gagcaatcgc cttgctcaga ggttggcttt gaaaggtagc acggttaaca 1620 taaccctcaa aggagcagga ctatccacca gaacggtgag gaggtcggtt cgagctcaaa 1680 ttgcttcacg tgttgaacac tttgacttgg atgtcgattt tctgatagta gacaaggtga 1740 tcgctgatct gccggcgcat gatgtttcca ctcgcggctg gaacattcct tcggaatttg 1800 ttttggctga cccgcagttc gataaatcag ccccgattga tctcatcctt ggtgcccgtc 1860 attacgcttc cttctttacg aacgtaaaat cgcacgagct tgctccgaac cttccaacta 1920 tgctgaacag cgtgtttggg tgggtcatga ttggtcccac ctctcctcag aatcctgcat 1980 ctccgaccga ttgcaccgcc gcgtccacaa tcgtctgcat ggcatccctg gaggagtctc 2040 tcgaacgctt ttggaagctg gaagrgttaa gcgtcaatga ttcgtactca cctgatgagc 2100 ggcgatgcga aacrttgtat aaagaaacca ctcagcgcga cgagtcgggt cgatatattg 2160 tacgattgcc caaacagacc gacttcacgg aaaagcttgg cctgtctaaa actaccgctt 2220 tgagacgctt cgagctgctg gagaggaggc tagaacgcaa cccacagctc aaggaagact 2280 atcatgcctt catgaaggag tatttggagc tggggcacat gtcgctcatg aacaaagata 2340 gtggggatga acgggcgtac tacctaccgc accatcccgt atttaaagcc tccagtacca 2400 ccacgaaagt aagggtcgtg ttcgacggat ctgcaaaaac aagcaccggt tattccttga 2460 atgacattct atgtgttggt ccaatcgtgc aggacgagct gcttgatatt gtgttgcgat 2520 tccgcaccta ccaaatagca cttgtgggag atatagctaa aatgtaccga caaataytgc 2580 tgcattctga tgatcgtcga ttggtgcgca tattctttcg attttcgccg caagctccga 2640 tccaagtata tgagctcaac accgttacat acggactagc accttcctcg tttctggcta 2700 cacgcacact tatccaacta gcagatgatg aagggactga gtatgcgctt gcacctgcag 2760 ccctgaaacg aaacttttac gtggacgact tcattggtgg tgccaataac gttcgcgaag 2820 ctgttcagct gcgtaaggag ttatcagcgc tacttgccaa aggtgggttt gagttgcgca 2880 agtggacatc aaacaatctg agcgtgctct ccggcttaag caccgagtat atcggcacac 2940 actcatcgct gcattttata cccaacgaga cggtcaaagc actcggcatc tcgtggaagc 3000 ctgaatcgga tgagctgtgt tttgaatcca acactgaggc tgatgaagcc acgtcgacca 3060 agcgatctat tttgtcgagc attgccaaaa tgtacgatcc gctcggattg atagcaccgg 3120 tgatcgtgcg tgctaagatg ctgatgcagg agctatggct actcaaatcc ggctgggatg 3180 aacctgttcc taatcacatc tgtaaaaaat ggaaggcgat 
tcagagcgac tggaaaacgt 3240 tatccgagta caggactaac cgttacgctc tcttaccaga tgcaacagta gaatttcaca 3300 catttaccga tgcttckgag gccgcctacg gagcatgtgt ctacgctcgt tgtgaaaacg 3360 cggcgggaga agtccgcatc agcctattag cttcgaagtc tcgagtggca ccactgaagc 3420 gcgtcacgtt gccgaggctt gaactaagcg cagctgtcct gggcgcccat ctgcatcatc 3480 gcgtcaagga ggcaatgcag atcgtgtgcg ccgaatcgtt tttctggtcc gactcaacag 3540 tgacgctaaa atggattgcg tcacctccca actcctggaa gacgttcgtg gcaaatcgag 3600 tagctgaggt gcaacactac tctcatccaa ggcaatggag gcacgttcct ggcacatcca 3660 atcctgctga cttggtttcc cgaggcatgt cggcagcaca cttcacgcag aatcagcttt 3720 ggaataacgg tccagattgg cttgtgcaac cttcgtccca ttggcccagc tcagatccag 3780 aaccaagcga tgaggcggac ctagaaacac gccaggtgag tgccgcttta gtttgtacac 3840 aaactcatcc atggtttggc atttcttcat ccttcaccag aatggtacgc atcattgcat 3900 actgcatacg gtttgtgcgc aacaccaagc agaaggcgcg atcacagcga ccgataccgc 3960 acaccaatgc atccaagacg atcacgccca agtacgtgga tgctgcaaaa actgtgcttt 4020 gcagactagc ccagcaagat gcattttccg cggaaatcaa gcagctaaaa aagggagaag 4080 cattgatgaa acaatcacct ttacgaaaac ttaccccatt cctggataca gaagaagtaa 4140 tacgggtggg aggacgattg aacttgtcgc aactaccgta tcagtccaag catccagctg 4200 ttctaccgaa gaaccacaaa ttcacccgtc tacttgcgga agattatcat gaagagatga 4260 aacatgctag tggaaggcta ttgctatccc gcattagaga actgtattgg ccactggacg 4320 gacgtcgctt ggtaaaaagc attgcaagaa actgcttccg ctgtattcgg caagatcccg 4380 cactcgcccg gcagccggtt ggccagcttc caccatcccg catcacaccg agccgacctt 4440 tttctgtaac cggagtggat tacgccggtc cattctactt gaagccagcg caccggaagg 4500 cagcagctac taagagctat ctgtgcgttt tcgtgtgttt cgctacgaaa gctgtgcact 4560 tggaactcgt aggagacctc acaacggcgg gattcttagc agcgctacgc cgattcacat 4620 cacgacgcgg attgccagcc cacatccatt ctgataatgg gaaaaacttc gaaggcgcag 4680 aacgtgaact gaaggagctt tttgagctgt tcaacgacga acaacaccgc aacaccgtgg 4740 ctactagatg cgctgaccgg ggaatcactt ggcatttcaa cccaccaaag gctccacact 4800 tcggcggatt atgggaagca gcagtaaaga cggcgaagcg acacctctat cgtcacctgg 4860 gcaatacgcg gctgtcgtac gaaggctact gcactgtgct ccaccaaatc 
gaagcagcga 4920 tgaattcccg tccgctgttg cctttgtccg acgatcccaa cgagctagct gcactcacac 4980 cggcacactt ccttattggc acatcgatgt tcgccgtgcc tgaaccggac tacacccagc 5040 tgaaatcctg cacgctagat gatcttcaga agtggcagct tttggttcag cgtttttgga 5100 agcattgggc cactgagtat ctacaagaaa tgcagaaatg ttatgcaagt ggtggcagca 5160 acaacagcaa catacttccc ggcaggttag tgatcctcat ggacgaatcg ttacccacca 5220 ctcgttggcc tctcgcgcgt atcgttaaaa tccatcccgg tgaagacaag atagtacgcg 5280 tcgttacgct taagacagct aagggaataa ttacgcgacc gatcacgaaa atatgcgttt 5340 taccgctcag aactgatagc gaaaaccacg tgtagtctag gaagcaactt tttgttgaca 5400 ttcgtcaagg tggggagga 5419 // ID BEL13-LTR_AG repbase; DNA; ANG; 240 BP. XX AC . XX DT 03-APR-2003 (Rel. 8.03, Created) DT 03-APR-2003 (Rel. 8.03, Last updated, Version 1) XX DE BEL13-LTR_AG is a long terminal repeat of the BEL13_AG LTR DE retrotransposon - a consensus sequence. XX KW BEL; LTR Retrotransposon; Transposable Element; 5-bp TSD; KW BEL13-I_AG; BEL13-LTR_AG; BEL13_AG; Bel clade; KW reverse transcriptase. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-240 RA Kapitonov V.V., Pavlicek A. and Jurka J.; RT "BEL13_AG, a nonautonomous family of Bel/Pao-like LTR RT retrotransposons from African malaria mosquito."; RL Repbase Reports 3(3), 34-34 (2003). XX DR [1] (Consensus) XX CC BEL13-LTR_AG flank an internal portion of BEL13_AG (deposited as CC BEL13-I_AG). XX SQ Sequence 240 BP; 74 A; 53 C; 54 G; 59 T; 0 other; tgttaagaat gcttggtatc atggccgctg acagccaccg agtgacgcaa cctgcgctct 60 gacagattac cgcgtttttg aactgtcaaa ccacaagccg aagagttgtg tgagtgtgtg 120 ttcgctagga gatcgagaac agagcacatg cagaaagttg atcgcaaata aaacatcgta 180 aaaaacccac cgaacagtgc cgcattatag tttgaatcat ttcgcttatt tcaattaaca 240 // ID R7Ag1 repbase; DNA; ANG; 6470 BP. XX AC AB090820; XX DT 09-JUL-2004 (Rel. 9.06, Created) DT 24-SEP-2010 (Rel. 
15.1, Last updated, Version 2) XX DE Anopheles gambiae retrotransposon R7Ag1 DNA, complete sequence. XX KW R1; Non-LTR Retrotransposon; Transposable Element; KW reverse transcriptase; gag-like; R7Ag1. XX NM R7Ag1. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RA Kojima K.K. and Fujiwara H.; RT "Evolution of target specificity in R1 clade non-LTR RT retrotransposons."; RL Mol. Biol. Evol 20(3), 351-361 (2003). XX DR Genbank; AB090820; Positions 1 6470. XX FH Key Location/Qualifiers FT CDS 959..2539 FT /product="R7Ag1_1p" FT /translation="MDKQLRGRTISVDERPAVVIRKLGSEKKLGTIVEEPS FT SAGVPARTMATGGVKSAGTATKLATSTPVSTGEVRRMLADAKADNETTVGI FT VKRLEEQIQLLRLQMEASNEQLKEAQREAREAREDARVREAEHREELRKEK FT ELFNALLAQTLGGTSGARLESQQELQREQELLRRMESQQRQEQRQQLEDQQ FT RQRWRQQQQKQQRQQRLPAQQWPTVQQSVRAQRQGVTESASSAVPDEAGTW FT VEVVRGNQRGNKQNGVNLPQQSAQRQPAHRQHQQWPHQQNGQQQQQRMGIH FT QQEKRRPRRKRPDEIVVVPAPGVSFKEMYVKIRTNPRIADFQRQIGVGRRT FT PRDHLLLPLSRDVDSAALKDIIQEVIGERGSVTVRTEMAEVVLTGIDNMID FT EEAIKKALMTTLGKQSLVATVNLWERRDMTKRARVRLPRAEAELVKDRRLE FT LGYTYCSVHEAPKVSGQLTRCFRCLERGHIAATCTGEDRSKRCLRCGDQTH FT KASGCTNEVKCMLCGGAHRIGAAACGGQPSN" FT CDS 2544..6209 FT /product="R7Ag1_2p" FT /note="endonuclease and reverse transcriptase." 
FT /translation="MEVLQINVNRSRSAQDLALNTMRVERADVCLMVELHS FT VPRNNGNWVADRDGKVAIIASSETYPVQQVVSVTQSGIAAARINDVLFICC FT YVSPSAGVSEFEEVMQRIDVLARGHPRVVFAGDLNAWHTAWGSCRTNAKGE FT AVVQLVDSLGLEVLNTGTAPTFLGNGVARPSVVDVAFASSSIAGVNTIPEH FT RWRIISIYSYSNHVYIRFAVGELLQRPAADSRRQEGPSTRESGTRWRTRHF FT DAELFGVALDVASFTERVTSAESLERVMTEACDAAMARVFPSQGHSGRPAY FT WWTPAIEVLCENCRLAKERLEAAIDEEEQIAAASDLLQVRTALDSSITTSK FT KEHFDEILRGLAEDETGQWYRNVLSRLSGSWTARERDSSVLEVIVSTLFPQ FT HPPVDWPASPGQVLERGEEEPVRDVNEQELLDIASSLNPRKAPGLDGVPNA FT ALTAAIRKHTDIFKKLFQECLDNERFPDEWKKQKLALIPKPGKPPGLASSF FT RPILLLNNPGKVYERLLLSRINDVIEDPESPRLAENQYGFRRGRSTVQAIQ FT LVVDAGSHAMSFGRTNNRDKRCLLVVALDVRNAFNTASWQCIATALEDKGV FT PRQLRNILRDYFANRELVYDTADGPVTRRVTAGVPQGSILGPTLWNIMYDG FT VLRVELPEGASVIGYADDIVVMARGCTPQEAALVAEQAVDAIAAWMEDHHL FT QLAPEKTEGVMISSLRRGQLKVPFRVGDTIIHSKQSIRYLGVQIHDHLSWK FT PHVELSTAKALRVVGVVTAVMRNHSGPQVAKRRLLAAVAESIIRYAAPVWS FT EATDLQWCQRKLAQVQRPLARGVTSSFVSVAYETGVALAGLVPFRLLVRED FT ARCHRRLLAAPGASRKDIRLEERQGTFQEWQRAWDAAAAAPTASRYAVWAH FT RMIPDLHLWMSRRHGEVDFHLSQVLTGHGYFREYLHVCGFAPSAECPRCPG FT SVESVAHVLFQCEVFHEIRVELLGYGTSDPVNENNLGMKLLESPERWNSIQ FT EAARKITKVLQQLWREDELQLNLQAHLAALPTRAAAVDAGPLDGEQVSVDG FT VAELFRSNRGRARRTRRGRRRAEERVEVRLASAMAAAEREREDSILMAAVR FT AEEAGEAPPPIPMRRRGLPPSPRTVRARHERRLYLQRLYRQRAREGTLPTV FT PHGRNRRSRSAPSEADTIRRRMRRREMERLRRTARRVPSNQGVREVLSADA FT LAAITEATTSGR" XX SQ Sequence 6470 BP; 1443 A; 1666 C; 2157 G; 1204 T; 0 other; cagtcgcaat cgagagctcg acccgaacgg acgtgtttgc gcagtgctcc gtgtggtgag 60 agggtgttcg tgagagcgag agagtgggag cgattgatat tcgtgtatcg aataatagtg 120 agcgtgtgta tgcgaagcgg aagcgagggt gaacttctgg aactatctgg aactttctcg 180 gctgttgccg tgcgtggctt ccttttgccc gtgcgtttgt gtatgcgttt gtgagtgtgt 240 ccgcaagcga gtggaatatt ctcggctgtt gacgcgcgtg gcgcttcgtt gtccgggctt 300 gtgtttcgtg taagcgagtg tgtgcgcaag aacgaggaga gttttctgga actatctgga 360 actttctcgg ctgttgccgt acgtggcttc cttttgcccg cgcgggtgtg catgcgtgtg 420 tgagtgtgtg cgcaagcgag tggaacattc tcggccgttg acgcgcgagg tgctccgttg 480 cccgtgcgtg tgcttcgtga aagcgtgagt 
gtgtgcgcga gcgagagtag aatattctgg 540 aactatctag aactttctcg acagttgccg cgcgtggctt ccttttgccc gcgcggatgt 600 gcatgcgtgt gtgagtgtgt gcgcaagcaa gtggaacatt ctcggccgtt gacgcgcgcg 660 gtgctccgtt gcccgcgcgt gtgcttcgtg aaagcgtgag tgtgtgcgcg agcgagagta 720 gaatattctg gaactatcta gaactttctt gacagttgcc gcgcgtggct tccttttgcc 780 cgcgcgtgtg tgtatgtgtg tgtgagtgtg tgcgcaagcg agtggaacat tctcggctgt 840 tgccgcgtgt ggcgctccgt tgctcgcgcg tagcgttccg tgtcactgtg tgtgcgcttc 900 tcatagcatc acagcgccgg ggacattcgt ggtggcgggg tggtagcaac ccgccaccat 960 ggataagcaa ctgagaggaa ggaccatatc ggtcgatgag cgtcctgcgg tcgttataag 1020 gaagctgggg agcgagaaaa aactcgggac catcgtcgag gaaccatcat cggcaggagt 1080 gccagcgagg accatggcga cgggaggagt gaaatcggcg ggaacggcga cgaaactggc 1140 gacttcaaca ccggtttcca ccggagaagt gcgtcggatg ctggcggatg caaaagctga 1200 caatgagacg acggtcggca ttgtcaagcg gttggaggag caaatccagt tgctgcgctt 1260 gcaaatggag gcctccaacg agcagctgaa ggaagcgcaa agggaggcaa gagaggcgcg 1320 cgaagacgct cgggtacgcg aagcggaaca ccgcgaagag cttcgcaagg agaaggagct 1380 gttcaacgct cttctagcgc aaacccttgg tggaaccagc ggagctcggc tagagagcca 1440 gcaggaactg cagcgagagc aggagctgct tcggaggatg gaaagccagc aacgacagga 1500 acagcggcaa cagctggaag atcaacagcg ccaaaggtgg cgtcagcagc agcagaaaca 1560 acaacggcag cagcggctac ctgcgcagca atggccgacg gtgcagcaga gcgtgcgtgc 1620 tcagcgtcag ggcgtgacgg agtcggcatc ctcggcggta cctgacgagg caggaacgtg 1680 ggtggaggtt gttcgcggca atcagcgcgg gaataagcag aacggagtga atctgcccca 1740 gcagtcagcc cagcggcagc cagcacaccg gcagcatcag cagtggccgc accagcaaaa 1800 tgggcagcag cagcagcagc ggatgggcat tcatcagcag gagaagcggc gtccgcgacg 1860 aaaacgcccg gatgaaattg tcgttgtgcc cgccccagga gtgtccttca aggaaatgta 1920 tgtgaagata cggaccaacc cgcggattgc cgatttccag cggcaaattg gggttggcag 1980 aagaacgccg agggaccacc tcctgctgcc tttgtcccgc gacgtcgata gcgcggcgct 2040 gaaggacatc atccaggagg tcatcgggga acgtggatcg gtaaccgtca gaacagagat 2100 ggctgaggtc gtcctgactg gaatcgacaa catgatcgac gaggaggcga tcaaaaaggc 2160 gctcatgacc actcttggaa agcagtcatt ggtggccacc 
gtgaaccttt gggagcgccg 2220 agacatgacg aagcgggctc gcgtgcgcct cccacgagca gaagcggaac ttgtcaaaga 2280 tcgccgactg gagctgggct acacgtattg ttcggtacat gaagccccaa aagtatcggg 2340 tcagctgact cgctgcttcc ggtgtctgga gcggggacac atcgccgcga cgtgcacggg 2400 tgaggatcgg tccaagcgtt gtctacggtg tggtgaccaa actcacaagg cgtcgggttg 2460 caccaacgag gtcaagtgca tgctgtgtgg cggcgcccac cgtattggtg ccgcagcctg 2520 cggtggacaa ccctcgaact gaaatggaag tgctacagat caacgtcaat cgcagtagga 2580 gcgcgcaaga cctggcgctc aacacgatgc gggtggagcg ggcggatgtc tgtctgatgg 2640 tggaactgca cagtgtcccc aggaacaatg ggaactgggt ggccgacagg gacgggaagg 2700 tggccatcat agccagcagc gaaacgtacc cggttcagca agtggtctcc gtgacgcagt 2760 ctgggatcgc ggctgcccgg atcaatgacg ttctcttcat atgttgctac gtgtcgccct 2820 cagcgggcgt ctctgagttc gaggaggtaa tgcagcgcat cgatgtgttg gcgagaggtc 2880 acccgcgcgt cgtctttgcg ggggatctca atgcgtggca caccgcttgg ggaagttgcc 2940 gcaccaatgc aaagggagag gctgtggtcc agctcgtcga cagcttgggg ctggaggtgt 3000 taaacaccgg caccgcccca accttcctgg gcaacggagt ggctcgcccg agcgtggtgg 3060 acgttgcctt cgccagcagc agcatcgccg gagtcaacac cattccggag catcggtgga 3120 ggattattag catatactcg tacagcaacc acgtgtacat tcggtttgct gtaggggagc 3180 tgctccagcg gccagcggca gatagtcgtc gacaggaggg tccttccacg cgagaaagcg 3240 gcacgagatg gcgtacccgt catttcgacg ccgagctttt cggtgtagcg ctcgacgtag 3300 cttcgttcac agagcgggtt acaagtgccg aaagcttgga gagagtcatg acggaagctt 3360 gtgacgcagc catggcgcga gtgttccctt cacaaggtca ctcgggacga cctgcttatt 3420 ggtggacgcc agcaatcgag gtcctgtgtg agaactgccg cctcgctaag gaacgccttg 3480 aagctgccat cgacgaggaa gagcagatcg ccgcagccag cgaccttctc caggtgcgga 3540 ccgccctgga ctcatccatc accaccagca agaaggagca tttcgatgaa atactgcggg 3600 gcctcgcgga agacgagacg ggacaatggt accgcaacgt acttagccgc cttagtggaa 3660 gctggacggc gagggagcgc gattcatcgg tgctagaggt catcgtttcc actctgttcc 3720 cccagcatcc tccggttgac tggccagcgt caccaggcca agtcctggag aggggagagg 3780 aggaaccagt tcgggacgtt aatgaacagg agctgctgga catcgcgagc tcgctgaacc 3840 cgaggaaggc tccaggactg gatggtgtgc caaacgccgc tttgacggcc 
gccatccgga 3900 agcatacgga catcttcaag aagttgttcc aggaatgctt ggacaacgag cggttcccgg 3960 atgagtggaa gaagcagaaa ctggccctga tccccaagcc gggcaagcca ccggggctcg 4020 cttcatcctt ccgcccgatt ctgctgctga acaacccggg caaagtgtac gagcggttgc 4080 tgttgtcgcg aataaacgat gtcatcgagg atcctgaatc accgaggttg gcagaaaacc 4140 agtacgggtt caggaggggc cgttcgacag tgcaggcgat tcagctggtg gtggatgcag 4200 gcagtcacgc gatgtcattt ggccgtacaa acaacaggga taaacgctgc cttctagttg 4260 tggcgctgga tgtgcgtaat gcgtttaaca ctgccagctg gcagtgcatc gctacggcgc 4320 tggaggacaa aggcgtgccg aggcagctcc gcaacatcct tagagactac tttgccaaca 4380 gggagctcgt ctatgacacc gcagacgggc ccgttacacg ccgagtgact gcaggtgttc 4440 cacaggggtc cattctgggc ccgaccctgt ggaacatcat gtacgacggc gtgttgcggg 4500 tcgagctccc tgaaggggct agcgtcatcg gctatgcgga tgacatagtg gtcatggcac 4560 ggggttgcac accacaggag gcggcattgg tggctgaaca ggcggtggac gcgattgcgg 4620 cttggatgga ggaccatcac ctgcagctcg ctccggagaa gacggaagga gtaatgatct 4680 ccagtctgcg aagaggtcaa ctgaaggtgc cgttccgcgt aggggacacc atcatacaca 4740 gcaaacagtc gatccggtat ctgggggtcc agatccatga ccacctgtcg tggaagccgc 4800 acgtggagct gtcgacggct aaagccctcc gcgtggtagg tgtggtcacc gcagtaatga 4860 ggaaccacag tgggccccag gtggccaagc gtcggctgtt ggcggcagtg gcggagtcga 4920 tcatccggta tgctgccccc gtgtggtccg aggcgacgga tctgcagtgg tgccagagga 4980 agctggccca ggtgcagagg cccctggctc gcggtgtcac cagctcgttc gtgtcggtgg 5040 catatgagac cggagttgca ctggcaggcc ttgtgccgtt caggctgctg gtacgggagg 5100 acgcaaggtg ccatcggagg ctcctagctg ccccgggcgc cagccgcaag gacatccggc 5160 tggaagagag gcaggggact ttccaggagt ggcagcgagc gtgggatgcg gcggccgcag 5220 ctccaacggc cagtcggtac gcggtttggg cccaccgaat gatcccggac ctgcacttgt 5280 ggatgagtag gcgacatgga gaggttgatt tccacctctc acaggtgtta accggacatg 5340 gatatttccg ggaatacctc catgtctgcg gttttgcccc atcggcggaa tgtccacggt 5400 gcccggggtc ggttgaatca gtggcgcatg tgctgttcca gtgcgaggtc ttccacgaga 5460 tccgggtgga gctgttgggc tacggcacca gcgacccagt gaacgagaac aatctcggca 5520 tgaagctgct ggagagcccc gagcggtgga acagcatcca ggaagctgca cgcaaaatca 
5580 cgaaggtgtt gcaacagctt tggcgtgagg acgagctgca gctcaatctc caggcacacc 5640 ttgcagcact accgacgaga gcagcagcgg tagacgccgg tccgctggat ggtgagcagg 5700 tgtcggttga tggggtggct gaactctttc gttccaaccg aggtcgagcc agacggacca 5760 gaagaggtcg tcggagggcg gaagaacggg tggaagtgcg ccttgcatcc gcaatggcag 5820 cagccgaacg agaacgtgag gactctatcc tgatggcggc ggtgcgggcg gaggaagcag 5880 gagaagcacc accacccatc cccatgagaa gacgcggact gcctccatct ccgagaacgg 5940 taagagcgcg gcacgaacgg agactgtatc tgcaacgtct ctaccgtcaa cgcgccaggg 6000 aagggactct accaacagtc ccacacggta gaaaccggag gagcaggtcg gccccttcgg 6060 aagccgatac catccgacgg cggatgagaa ggcgtgagat ggagcggcta cgccgaacag 6120 cgcgaagggt gccgtccaat cagggagtgc gtgaggtgct atccgccgac gccctggcag 6180 ccatcacgga agcaacaacc tccggccgtt agaaggagga ttatggggta ggaagtgcat 6240 gggacagcaa aagggtacgg cgcaaaagaa aaataaaaaa tgcgttgggg ttttaattcc 6300 gtaaggaaaa aagaataagg aacgaaataa ataagaacca ataaaggcga tgtcacaagt 6360 gacacatact cctggttcca agccccgcta gggaacgggt ccaggagcag gagtggggat 6420 ttaacgttaa gttaatcttg caaataaatc cttaagtatt aaaaaaaaaa 6470 // ID GYPSY64-LTR_AG repbase; DNA; ANG; 187 BP. XX AC . XX DT 09-JUL-2004 (Rel. 9.06, Created) DT 09-JUL-2004 (Rel. 9.06, Last updated, Version 1) XX DE GYPSY64-LTR_AG is an LTR of retrotransposon GYPSY64_AG - a DE consensus sequence. XX KW Gypsy; LTR Retrotransposon; Transposable Element; KW 5-bp TSD GYPSY64_AG; GYPSY64-I_AG; GYPSY64-LTR_AG; Gypsy clade; KW mag lineage. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-187 RA Tubio M.J., Costas C.J. and Naveira F.H.; RT "GYPSY64_AG, a member of the mag lineage of the Ty3/gypsy group RT of LTR retrotransposons in Anopheles gambiae."; RL Repbase Reports 4(6), 168-168 (2004). XX DR [1] (Consensus) XX CC GYPSY64-LTR is a long terminal repeat of GYPSY64_AG (its CC internal portion is deposited as GYPSY64-I_AG). 
XX SQ Sequence 187 BP; 55 A; 40 C; 56 G; 36 T; 0 other; tgttggcagg acagcccact gtgcggtgtg aaccccttcg gctagcagtg cgtccacctg 60 catttctata cgcatctcgg ggtgacaagt gacagctgcc ggatgccggg ggcacatggg 120 aaaaaggagt attgtacaag agacgcatga gataaggtag tgaaataaag tacaagaaaa 180 cgtaaca 187 // ID GYPSY69-I_AG repbase; DNA; ANG; 4642 BP. XX AC . XX DT 09-JUL-2004 (Rel. 9.06, Created) DT 09-JUL-2004 (Rel. 9.06, Last updated, Version 1) XX DE GYPSY69-I_AG is an internal portion of retrotransposon GYPSY69_AG DE - a consensus sequence. XX KW Gypsy; LTR Retrotransposon; Transposable Element; 5-bp TSD gag; KW AP protease; GYPSY69-I_AG; GYPSY69-LTR_AG; Gypsy clade; RNase-H; KW integrase GYPSY69_AG; mag lineage; reverse transcriptase. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-4642 RA Tubio M.J., Costas C.J. and Naveira F.H.; RT "GYPSY69_AG, a member of the mag lineage of the Ty3/gypsy group RT of LTR retrotransposons in Anopheles gambiae."; RL Repbase Reports 4(6), 177-177 (2004). XX DR [1] (Consensus) XX CC GYPSY69_AG is a family of Gypsy-like LTR retrotransposons that, CC according to the amino acid sequence of its reverse CC transcriptase, is phylogenetically grouped with CC representatives of the mag lineage of other organisms. CC GYPSY19_AG, GYPSY20_AG, GYPSY21_AG, GYPSY22_AG, GYPSY23_AG, CC GYPSY24_AG, GYPSY25_AG, GYPSY26_AG, GYPSY27_AG, GYPSY28_AG, CC GYPSY55_AG, GYPSY56_AG, GYPSY57_AG, GYPSY58_AG, GYPSY59_AG, CC GYPSY60_AG, GYPSY61_AG, GYPSY62_AG, GYPSY63_AG, GYPSY64_AG, CC GYPSY65_AG, GYPSY66_AG, GYPSY67_AG and GYPSY68_AG, are other CC members of this same lineage in Anopheles gambiae. The CC GYPSY69-I_AG consensus was reconstructed after multiple CC alignment of 4 copies. The consensus encodes the 351-aa CC GYPSY69_AG1p gag-like polyprotein (pos. 201-1253) and the CC 1166-aa GYPSY69_AG2p pol-like polyprotein (pos.
1853-4621). CC The sequence of the LTRs flanking GYPSY69-I_AG is deposited as CC GYPSY69-LTR_AG. XX FH Key Location/Qualifiers FT CDS 201..1253 FT /product="GYPSY69_AG1p" FT /translation="MQNRENTTLSSSMNTLSAVSINIPECKPITNTEEIDR FT KSFKDWKDLLEASMDLAGITDEYTKNNVFKVKAGPMLLEVLDSTPKQPLSE FT TSNAPYSNAMKRLNDFFGSREYRLVQRQKLRALTQDAEESDLKYLKRVIAT FT AKLCDFDEEKLMEGVTEVIQLHALNAKVREAGRKVLRKGGTLAALVDKIRG FT YEVDKTNEEIFKRTHQPASDFRIAAVARSGVESNSAAGSGTEPRGRNFDYQ FT GQSSWRNFGNPSNKYGRNNSRYENKVVPCWRCSSIFHVPSKCHAANKICRN FT CKKTGHIERACRQIPTPSNTLKRRISRDDERPVNTKKIATVTKDEEQKEDT FT NVVSDYSS" FT CDS 1853..4621 FT /product="GYPSY69_AG2p" FT /translation="MLAVTVEDEFPKFNIPPVVLSYDITKPPSRRIFTNIP FT LPFREETQRRLQNLLATGIIEVVTDSMDRSFCSSLLVVPKGKDDIRLVVDL FT RGPNKCIIRTPFRMPTLEEILSDLHGAKWFSTIDLTSAFFHIEIAEQSRHL FT TNFFGGNATYRFKRLPFGLCNAPDIFQETLQTKILSGCHGQKCYLDDILVH FT GHTKEEHDSNLKAVLHQLKAHNVRINTNKCVLGKNKVKFVGYLISDEGLCV FT ENEKLKAVQNFRRPETINEVKSFLGFMNFSERFIYMRADKTKYLREVAKSE FT VFYWSAAEESEFNFLKNEALKSIATLGYFSSKDDTELYVDASPVGLGAVLV FT QFDVECKPRIIACASKALTSTEQRYPQTQREALAIVWGVERFTFYLTGKSF FT AIRTDSEANEFIFGESYRQSKRAVTRAEAWALRLQTYDFTVKRIPGHLNVA FT DALSRLIAQTQEAEPFDQDDDKHLLYTLGGGIVSITWKDIEQESENDPELV FT AVRTASKSGRWPDNLRRYEAERRSIRSLGSIVFKDDKIITPTKLRSQIMLT FT AHQGHIGVSAMKRIMREYFWWPNMSKDVEVFVKNCNTCLVIARKNPPVPLT FT NRTLPDGPWQVLQIDFLSIPGCGAGEFLVVVDTYSRYLAVIEMHSTDAKTT FT ILALSRIFFTWGLPLTIQSDNGPPFQSNEFITYWEEKGVKINKSIPLNPQS FT NGAVERQNQGIIKALAGAKQDGNNWKDALSAYVHVHNNMKPHSRLGVTPFE FT LLVGWRYRGFFPSLWESKVTKEIDRSNVREQDNLSKLVSKTYADTRRGAKD FT SDITVGDKVLLSFPKKNKTDSTFSSDSYTVLVREGSKVVVQNDRGVQYTRN FT VGDCKKASQRTQLIEEIGPNSTSLSTQKDSEKESEHPSEDRSRRSPIKRSQ FT RAVRKPEKLKDMFLYRIYS" XX SQ Sequence 4642 BP; 1559 A; 895 C; 1022 G; 1166 T; 0 other; atggcgcaaa ccacgctccc aaatttaaat caaagaaagt gagcatatta cgcaatgctc 60 ttaaggcgga gaaaaggcgg aacctcttac ttgcggctaa gctgaagtac caaaataaag 120 ctagagcaaa acatttacga gaaataaagc agttcaaagc cgacattgtc aaagaggcga 180 ttcattcaac ggcgatcgat atgcaaaatc gtgaaaacac caccctttct tcctccatga 240 acacactttc ggctgtttct 
ataaatattc cggagtgtaa accaatcact aatacggaag 300 aaatcgatcg caaatccttc aaagattgga aagatttgtt agaagcatct atggacctcg 360 ccggtatcac tgatgagtat accaaaaata atgtttttaa agtaaaggca ggacctatgc 420 tgctagaggt cctagactcg actccaaaac agccattatc tgagacgtca aacgccccat 480 attcaaacgc tatgaagcga ctgaacgact ttttcggttc tcgagaatat cgcttggtgc 540 agcggcaaaa gttaagagca ttgacgcagg acgcggagga gtcggactta aaatatctta 600 agcgagtgat agccacagct aaattgtgcg attttgacga agaaaaacta atggaaggcg 660 tgacagaagt tatacaacta catgcgctca acgccaaagt gcgagaggca ggtagaaagg 720 ttctacgcaa aggaggtacc ctagcagctc tggtggacaa aattcgagga tacgaagtgg 780 ataagacaaa cgaggaaatt ttcaagagga cacatcagcc agcatcggac ttcagaatag 840 cggcggttgc acgaagcgga gtcgaatcga attccgctgc tggatccgga acggaaccgc 900 ggggaaggaa cttcgattac caaggacaat ccagctggag gaattttggg aatcccagca 960 ataaatacgg acgaaataat tcaagatacg aaaataaggt cgtaccctgc tggagatgct 1020 ccagtatatt tcatgtacca tctaagtgtc atgcagcaaa taagatctgc cgaaactgta 1080 agaaaacggg tcacattgaa cgtgcttgcc gacaaatacc gacaccatca aacacactga 1140 aacgtcggat cagtcgggac gatgaaagac ccgtcaatac aaagaaaatc gccacagtga 1200 cgaaagatga ggagcaaaaa gaagacacga acgtcgtaag tgactattcc tcataattca 1260 gcttacccga ttgaattttt atctgcccat gttgtgaatt tgataaacca tattttatct 1320 gaaggaataa ataggagtga actaaaaaaa aaaaaaaaaa agtaaaaaca ttaacattaa 1380 tgactattat atttttcaga ttaattaacg gttctcgatt gaaataaacg agttgcatga 1440 cttggatatc caaagaggat gtatcttagg acgcattgca aataacattt ccgtgttatt 1500 tttgatagac tccggagcgg acgtcaatac tgtggacgaa gacacttttg ataagttgcg 1560 taacaatgaa caatccagag agcaactata ttgcgtttca aatgatacag ataaaccact 1620 ccgtgcgtac gccagtccgg gtgagatcaa agtcgtaggc acttttgtag cagaattgta 1680 catttcagaa gaaagaccat gtctttttga aaaattttat gttattaaag gcgctaaacc 1740 actgttgggt agagaaacat ctctgagata cagtgttttg caaatagggc tcgatgtgac 1800 aattaactgt gcaagttctg gcaaatatta ccccgataaa tttccaggag aaatgctcgc 1860 agtaacagtg gaagacgaat tccccaaatt taatatccct ccggttgtct tatcttacga 1920 catcaccaaa ccaccatcac gtcgtatatt tactaacatc 
cctttaccat ttcgcgagga 1980 gacccaacgt agactgcaaa accttttagc aactggaatt atagaggttg tcacagactc 2040 aatggataga tcattttgct cgtcattact ggtcgttcca aaagggaaag atgatattcg 2100 cttagtagta gacctaagag ggcccaacaa atgcattatt cgcacaccat tcaggatgcc 2160 aactttggag gagattctct ccgaccttca cggcgccaaa tggttctcca ctatcgacct 2220 tactagtgca ttttttcata tagaaatagc ggaacaatct cggcacctaa ctaatttttt 2280 cggaggaaac gctacgtatc gttttaaacg actaccattt ggtctttgca atgctccaga 2340 catatttcaa gagacattac aaactaaaat attatccggg tgtcatggcc aaaaatgtta 2400 cttagacgac attttagtac acgggcatac aaaagaggaa cacgatagta atcttaaagc 2460 ggttcttcat caacttaaag ctcataacgt gcgcattaat actaacaaat gcgtgcttgg 2520 taagaacaaa gttaaattcg ttggttatct tatatcagat gagggattat gcgttgaaaa 2580 cgaaaaatta aaagcagtcc agaacttccg tcgtccagaa accataaacg aagtaaaaag 2640 ttttttgggg tttatgaatt tctcggaacg gtttatttac atgcgcgcgg ataaaactaa 2700 gtatttgaga gaagtagcta aatctgaagt tttttattgg tccgccgcgg aagaatctga 2760 gtttaacttt ctaaaaaacg aagctttaaa atctatagct acgttaggtt attttagttc 2820 caaggatgat actgaattgt acgttgacgc gtctcccgtg ggccttggtg cagtattggt 2880 acagttcgat gtggaatgta agccacggat aatagcatgt gcttcaaaag cattaactag 2940 tacagaacag agatacccac agacacagcg agaagcgcta gcgatcgtgt ggggagtaga 3000 aaggtttact ttctatttga caggtaaatc attcgccatc cgcacagatt ccgaagctaa 3060 tgagttcatt tttggtgaga gctatagaca aagtaaacgg gcagttacac gggctgaggc 3120 ttgggctcta agactacaaa cctatgactt tactgtaaag cggatcccag gtcatttaaa 3180 tgttgcagat gctctatcaa ggttgattgc tcaaacacaa gaagcagaac cttttgatca 3240 agatgatgat aaacatctac tatacacact tggcggggga attgtgagta tcacgtggaa 3300 ggacatcgaa caagagtcgg aaaatgaccc agagctcgta gcggtgagaa cagctagtaa 3360 atcgggtcgc tggccagata acctacgtag atatgaagca gaacgcagat caattcgttc 3420 gttagggtca atagttttca aggacgataa aattattacc ccaactaaac tgagatcaca 3480 gattatgttg accgctcatc aaggacacat aggtgtttct gcaatgaagc gtataatgag 3540 ggaatatttt tggtggccaa acatgagcaa agacgttgag gtgtttgtga aaaattgcaa 3600 cacgtgttta gttattgcaa gaaaaaatcc accagtgccc ttgactaatc 
gcactttgcc 3660 ggatggacca tggcaagtgc tacaaataga ttttctatcc ataccaggtt gcggggctgg 3720 agagtttttg gttgtggtcg atacctattc tagatattta gccgttatcg aaatgcatag 3780 tacagacgca aaaactacta ttttagcact aagtagaatt ttctttacct ggggattgcc 3840 actaaccatt cagagtgaca atggccctcc tttccagagc aatgagttta ttacgtactg 3900 ggaggaaaaa ggcgtaaaaa ttaataagtc aataccctta aaccctcagt caaatggggc 3960 tgtcgaaaga cagaatcaag gtattatcaa agctttagct ggagccaagc aagatggaaa 4020 taattggaaa gacgccctta gtgcttacgt tcatgtacat aataacatga aaccacattc 4080 cagactggga gtaactccgt tcgaattatt ggtaggatgg cggtatagag gtttcttccc 4140 aagtctgtgg gaatcaaagg taacaaaaga gatagataga tctaacgtac gcgagcagga 4200 taatctttcc aaattagtaa gcaaaactta cgcagatact cgtcgaggag cgaaggattc 4260 ggatataact gtaggagaca aggtcctact ctcatttcct aagaaaaaca agacagatag 4320 tacattttcc tctgactcat acacagtgtt agttagagaa ggatcaaagg ttgttgttca 4380 gaacgatcga ggagttcaat atacccgcaa cgtgggggat tgtaaaaaag cgtcgcaaag 4440 aacgcagtta atagaagaaa taggacctaa tagcacatca ttgtcgactc agaaggattc 4500 cgaaaaggag tcagaacatc cttcggaaga taggtcaagg agaagtccaa ttaagcgatc 4560 acaaagagca gtgagaaagc ctgaaaaact aaaggatatg tttttatatc gaatttatag 4620 ctagagtagg agaggaggcg aa 4642 // ID GYPSY32-I_AG repbase; DNA; ANG; 4536 BP. XX AC . XX DT 16-APR-2004 (Rel. 9.03, Created) DT 16-APR-2004 (Rel. 9.03, Last updated, Version 1) XX DE GYPSY32-I_AG is an internal portion of retrotransposon GYPSY32_AG DE - a consensus sequence. XX KW Gypsy; LTR Retrotransposon; Transposable Element; 4-bp TSD gag; KW AP protease; GYPSY32-I_AG; GYPSY32-LTR_AG; Gypsy clade; KW MDG3 lineage; RNase-H; reverse transcriptase; KW integrase GYPSY32_AG. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-4536 RA Tubio M.J., Costas C.J. 
and Naveira F.H.; RT "GYPSY32_AG, a member of the MDG3 lineage of the Ty3/gypsy group RT of LTR retrotransposons in Anopheles gambiae."; RL Repbase Reports 4(3), 56-56 (2004). XX DR [1] (Consensus) XX CC GYPSY32_AG is a family of Gypsy-like LTR retrotransposons CC that, according to the amino acid sequence of its Reverse CC Transcriptase, is CC phylogenetically grouped with representatives of the MDG3 CC lineage of other organisms. CC GYPSY29_AG, GYPSY30_AG, GYPSY31_AG, GYPSY33_AG, GYPSY34_AG, CC GYPSY35_AG, CC GYPSY36_AG, GYPSY37_AG and GYPSY38_AG are other CC members of this same lineage in Anopheles gambiae. CC The GYPSY32-I_AG consensus was reconstructed after multiple CC alignment of 3 copies. CC The consensus encodes the 1442-aa GYPSY32_AGp gag-pol like CC polyprotein CC (pos. 129-4454). CC The sequence of the LTRs flanking GYPSY32-I_AG is deposited as CC GYPSY32-LTR_AG. XX FH Key Location/Qualifiers FT CDS 129..4454 FT /product="GYPSY32_AGp" FT /translation="MPTKDEMLCALESKGIEVPVTASIPQIRNMFSENVLQ FT LTSAEAVATTSRGEICPPSSTAIAATTTTAVATVAANCASTILKGDNDTDS FT ATILSASVACAPIALNNENSSSLPANADNENELNEMRRRLELLELRQRVQT FT LEYQSGKFSASDLKLDGIIEPFTGDDASKCIIEWLDELDHHFSLCRVQDSD FT KFFYVYRLLKGSAAMVAKASRASTLQQLKDELVANFYVTPTTEGVYRQLRN FT RRLSPHETALRYVLDMQRIASRASVPEPELINIIFEGLGSPSHTAGMRFLV FT EKVEDLKPLLNKFEAIRPRYVAKSSETPFPSDRKRVTTNRGTTPTTVRCFN FT CSQFGHHQSACTRQRRPPGACFRCFQLGHNYKSCPNAETSAAAYPDAANRV FT DIDNAETLPLNELHEVSASFLKSGKIIMTVKEYSLFDTGSPVSFINEKIVP FT LCLLSDPVPSNYKGLGNSHLYTRGQIHCRIQFRETILHHMFVILPHTSMAW FT PIIIGKDLLPALNVHLMYFKDHIPLKRTCIIPTKTIEEQKRNVSKSGILDN FT AFSEICAIDTFESEFKLDTGPSLSLEETSTILSIFENSYINNTSKNNELSN FT HCMKINLTNQTPIFTKPRRLSYGERNQVKEIVSNLLKEKVIRPSNSPYASA FT LVLVRKKNGEVRMCVDYRPLNKITVRDNYPLPLIETCLEHLSNKKFFSLLD FT LKSGFHQIRMEESSIPYTSFVTPDGQYEYTKMPFGLKNAPSEFQRFINSIL FT REFIDDEKVVVYLDDIIIASLDFQSHLKTLQAVLERIKQCGLELRVDKCKF FT AHQELDYLGYKANAFGIRPSDRHIQAIRNYPMPINVKQLRRCLGLFSYFRR FT FVPSFSCIAKPLTNLLQKDELFNFDTSCCEAFETLREKLTQSPILAIFDPK FT 
KETELHCDASSFGFGAVLLQKQEDNKLHPVAYFSKTTSKEEAKLHSYELET FT LSVIYALKRFHVYVHGLPLKIFTDCNSLVETLKNRNASAKIARWSLFLENY FT DYTIHYRSGTSMAHVDALSRTEAVGAISDLDLDFQLQIAQSQDPLINTLRQ FT KLEAGSVQGFILQDGLVYRQSSTNHLQLYVPREMVDNIIRHNHEKIGHLAI FT SKTCQTISQHYWFPHMKPRVENFIKNCLKCIVYSAPPRTNNRNMYSIPKTP FT VPFDTLHIDHLGPLPNITSRKKYILVVIDAFTKFTKLYATATTNTNEVCEA FT LTQYMSYYSRPKRIISDRATCFTSTAFKEFVDSNDITHVLNATGSPQANGQ FT VERVNRVLRPILSKLCNSSDHSDWSSHLRSAEHALNNTVHSSTNFLPSILL FT FGIEQRGQILDELHEFLNDKHVTTNRDLNLLRSEALSNIEYSQYRNEQYVA FT NRTKPAPSFSEGDLVAIKYTDSTNANKKLVCKFRGPYIVHKVLPHDRYVIR FT DVDGFQITQMPYDGVLEVDKLRKWSNNI" XX SQ Sequence 4536 BP; 1369 A; 1036 C; 846 G; 1285 T; 0 other; tctcagaagt gggatacgtc ctatcgacta cacctgtttt gctgtttgca ccctcacgat 60 aaaaagtata gcgaaaagtg tgtgcgtgtg tatgtgtgtg cgaggtgacg ccacgatccg 120 ccatcgcgat gcctacaaaa gatgagatgt tgtgtgctct tgaaagtaag ggcatagaag 180 tccccgttac ggcttctatt ccacaaattc ggaatatgtt ttctgaaaac gttctacaac 240 taacatcagc ggaagcagtc gcaacaacct cacgaggtga gatatgcccc ccgtcgagca 300 ccgccatagc cgccaccacc accaccgccg tagctaccgt cgcggcaaat tgcgcatcca 360 ccattttgaa aggcgacaac gataccgatt ctgccaccat tttgagtgct tccgtagctt 420 gtgctcccat cgcgcttaac aatgagaatt cttcttcgct tcctgctaac gccgacaacg 480 aaaatgaact caacgaaatg cgacgacgct tggaactttt agagcttcgc caaagggtcc 540 aaacccttga atatcaatcg ggaaaatttt cggcaagcga tttgaagcta gatggaatta 600 ttgagccttt tactggagac gatgcatcaa agtgcatcat tgaatggctt gacgagctag 660 atcatcactt ttctctgtgc cgcgtccaag attctgataa attcttctat gtgtaccgtt 720 tactgaaagg atctgctgct atggtcgcta aagcctctcg tgcttcaaca ctgcaacaat 780 taaaagacga gctggttgca aacttttatg tgacacccac gaccgaaggc gtctaccgac 840 agctgcgtaa ccgtcgtctc tcgccccatg agacagccct gcgttacgta ttagatatgc 900 agcgaatcgc cagtcgcgca tccgttcctg agcctgaatt aattaacatc atcttcgaag 960 ggcttggcag tccgtcccac accgctggaa tgcgttttct ggtcgaaaaa gtggaggatt 1020 tgaagccact tttgaacaag ttcgaggcga ttcgaccacg atatgttgct aaatcatccg 1080 agacgccctt tcccagtgat cgcaaacggg ttacaactaa tcggggaaca acaccaacaa 1140 ctgtccgttg tttcaattgc 
tctcagtttg gtcatcatca gagtgcttgt acacgccaac 1200 gccgcccacc aggagcttgt ttccgttgct tccagctggg acacaactat aaaagctgtc 1260 cgaatgctga aaccagcgct gctgcttatc cggatgcggc caatcgcgtt gatatcgata 1320 atgcagaaac actgccatta aatgaattgc acgaggtgag tgcttctttt cttaagtcag 1380 ggaaaataat catgacagtc aaagaatatt ccttatttga tacaggtagc ccagtaagct 1440 ttattaatga gaagatcgta cctttatgcc tgttatctga tccagttcct tccaactaca 1500 aaggtttagg aaattcgcat ctttatactc gtggccaaat acactgccga attcaattca 1560 gggagacgat tctccatcac atgtttgtta tactgccaca tacatctatg gcatggccaa 1620 taatcatagg gaaggatcta cttccagcac ttaatgttca tcttatgtat ttcaaagatc 1680 acatcccgct taaacgaact tgcattatcc ctacaaaaac tatagaggaa caaaaaagga 1740 acgttagcaa atctggtata ttagataatg cctttagtga aatctgtgct attgatacat 1800 ttgaatctga atttaaactt gatactggcc cttctctatc cctcgaagag acttcaacca 1860 ttttgtctat ttttgaaaat tcttatataa ataatacttc taaaaataat gaactatcaa 1920 atcactgtat gaaaattaat ctaactaatc aaactcctat tttcacaaaa cctcggagat 1980 tatcatacgg ggaacgtaat caagtgaagg aaatcgtttc aaacttactt aaagaaaagg 2040 ttataagacc tagcaattct ccttatgctt cggctctggt actcgtgcga aagaagaatg 2100 gagaggttcg aatgtgcgtg gactatcgac cgttaaacaa aattactgtg agggataatt 2160 acccactacc gttgattgaa acatgtttag agcatcttag caacaaaaag tttttcagtt 2220 tactagatct caaaagtggc ttccaccaga tacggatgga agaatcttcc attccctaca 2280 cttctttcgt cacccctgat ggtcagtacg agtacaccaa aatgcccttc ggtcttaaga 2340 atgccccctc tgagttccag agatttatta actccattct tcgtgaattt atcgatgatg 2400 aaaaagtggt cgtctacttg gacgacatta taatagcctc cttggacttt cagtctcact 2460 taaaaaccct tcaggcagtt ctagagagaa tcaaacaatg tggccttgag cttcgcgttg 2520 acaagtgtaa atttgcgcat caagaacttg actatctagg atacaaagca aatgctttcg 2580 gtattcgacc tagtgatcgg cacattcaag ccattagaaa ttatccgatg cctattaatg 2640 ttaaacaact cagaagatgt ttaggactct tctcttattt tcgacgcttt gtcccttcgt 2700 tttcttgcat tgctaaacca ctcactaact tacttcaaaa agatgaatta tttaactttg 2760 atacaagctg ttgtgaagca tttgaaacat tgcgagagaa acttactcaa tcccctatcc 2820 ttgctatttt tgatcccaaa aaggagactg 
agcttcattg cgatgccagc tcgttcggct 2880 ttggtgctgt attattacag aaacaagaag ataataaact tcatcctgta gcttatttct 2940 cgaagacaac atcaaaagaa gaggcaaagt tgcatagcta cgagctagag acgctttccg 3000 ttatctatgc acttaaaaga ttccatgtat acgtacatgg gttgccttta aaaatcttta 3060 ctgactgcaa ttcattggtc gaaaccttaa aaaacaggaa tgcatcggct aaaatagcaa 3120 gatggtcttt gtttttagaa aactatgatt acacaattca ttaccgatct ggaacttcta 3180 tggctcatgt agacgctcta agccgaactg aagccgttgg agcaatcagc gacttagatc 3240 ttgattttca acttcagatt gctcaatccc aagacccctt aataaatacc ctacgacaaa 3300 aacttgaagc tggttctgtt caaggattta tcctccaaga tggtttggta tatcgtcagt 3360 cttccacaaa tcatcttcaa ttatatgtgc cacgagaaat ggtagacaat attatccgac 3420 acaatcacga aaagattggt catcttgcca tcagcaaaac ttgccaaacc attagccaac 3480 actattggtt ccctcatatg aaaccaagag ttgagaactt cattaaaaat tgtttaaaat 3540 gcatcgtcta ttcagcacct ccaagaacta acaaccgaaa tatgtacagt attccaaaga 3600 cacccgtacc attcgacacg ttgcatattg atcatttagg acccctaccc aacattacat 3660 ctcgcaagaa atatatattg gttgtcatag atgccttcac caaatttaca aaactatacg 3720 ctaccgctac aactaatact aatgaagtat gtgaagctct cactcaatat atgtcatatt 3780 atagcagacc gaagcgaatt ataagtgacc gtgcaacatg ctttacttca accgcattca 3840 aagagtttgt tgattctaat gatatcactc atgttcttaa tgcaacgggt tccccccaag 3900 cgaacgggca ggtggaacgt gttaatcgtg tacttcgccc cattttaagc aaattatgca 3960 attcttctga ccattctgat tggagttcgc atttgagatc cgctgaacat gccctcaaca 4020 atactgtcca tagttccaca aactttcttc cttcgattct tcttttcggc attgagcaac 4080 gaggccagat cttagacgaa ttacatgaat tccttaatga taagcatgtt acaacaaacc 4140 gtgacttaaa ccttcttcgt tctgaagcgt tatctaacat tgaatattca caatatcgaa 4200 acgaacaata cgtagccaat agaaccaagc cagcccctag tttctctgaa ggagatctgg 4260 tagctatcaa atacacagat tccactaatg caaataagaa actcgtctgc aagttccgcg 4320 gcccttatat agttcataaa gtacttcccc acgaccgtta tgtaattaga gatgttgatg 4380 gatttcagat cacgcagatg ccttatgacg gagttttaga ggtagacaaa cttaggaaat 4440 ggtctaataa catataacat gatgtattac tagacagcat cgtattgctt taattagaat 4500 agaattgagg tcaatcctag gtcaggatag ccgagc 4536 
// ID RETRO14_AG_LTR repbase; DNA; ANG; 230 BP. XX AC . XX DT 06-FEB-2003 (Rel. 8.01, Created) DT 06-FEB-2003 (Rel. 8.01, Last updated, Version 1) XX DE Anopheles gambiae long terminal repeat from RETRO14_AG DE retrotransposon - a consensus. XX KW Gypsy; LTR Retrotransposon; Transposable Element; BLASTOPIA; KW INVADER; Long terminal repeat; MDG3; RETRO14_AG_I; RETRO14_AG_LTR; KW retrotransposon. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-230 RA Jurka J. and Drazkiewicz A.; RT "RETRO14_AG_LTR: LTR retrotransposon from Anopheles gambiae."; RL Repbase Reports 2(12), 1-1 (2002). XX DR [1] (Consensus) XX CC Related to BLASTOPIA, INVADER, and MDG3 from Drosophila CC melanogaster. CC 4 bp target site duplication. XX SQ Sequence 230 BP; 83 A; 36 C; 70 G; 41 T; 0 other; tgtaagaaag gaaaagcgct agtggtaggg agatagaaat aacggtagcg agatcaggag 60 aacgagagcg tactaggaac gagttgcgag agagagaacg cgcagcgtta aaatcgggag 120 cgcgggttaa gcgcgaggag ttgaattcgg accgcgaagc gaataacatg ttgtgaactg 180 aagtcatcaa taaagcgtta tttgtcttaa accgaataaa aaactccaca 230 // ID BEL15-I_AG repbase; DNA; ANG; 5669 BP. XX AC . XX DT 03-APR-2003 (Rel. 8.03, Created) DT 18-JUL-2005 (Rel. 10.08, Last updated, Version 2) XX DE BEL15-I_AG is an internal portion of the BEL15_AG LTR DE retrotransposon - a consensus sequence. XX KW BEL; LTR Retrotransposon; Transposable Element; 5-bp TSD; KW BEL15-I_AG; BEL15-LTR_AG; BEL15_AG; Bel clade; RING Zn-finger; KW integrase; peptidase; reverse transcriptase. XX NM BEL15-I_AG. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-5669 RA Kapitonov V.V., Pavlicek A. 
and Jurka J.; RT "BEL15_AG, a family of Bel/Pao-like LTR retrotransposons from RT African malaria mosquito."; RL Repbase Reports 3(3), 37-37 (2003). XX DR [1] (Consensus) XX CC BEL15_AG is a young family of Bel/Pao-like LTR retrotransposons. CC BEL15-I_AG, an internal portion of BEL15_AG, is flanked by CC BEL15-LTR_AG CC LTRs. The BEL15-I_AG consensus sequence was reconstructed based CC on CC multiple alignment of 6 copies; they are less than ~1% divergent CC from CC the consensus sequence. CC The consensus sequence encodes one 1741-aa BEL15_AGp Bel-like CC protein CC (pos. 311-5533). CC BEL15_AGp is composed of the peptidase A16 (pos. 120-290), CC reverse transcriptase (pos. 770-900) and CC integrase (pos. 1450-1600) domains. XX FH Key Location/Qualifiers FT CDS 311..5533 FT /product="BEL15_AGp" FT /translation="MANAVNLLSRRRTLEEKIQRVIAFADNFVPERDEFRL FT GLFISDTERVAAEFDTVQQLIEDGAAPEAREMESHFRATTEDALMAARASL FT QALSRPSHNVIPASSTIATSGVRLPTISLPEFDGNEMQWATFRDTFEALIH FT CNEEVLTIQKFHYLRAALKGEAAKLLESIPLCASNYNIAWKSLVDRYANEY FT LQKKRHLQAMFNIGKVTKESNASLHRLVDDFDRHVKMLHQLGEPTAQWSTV FT LEYVLCTKLPDETLRTWEDYASTLSSPNYSMLIEFLQRKMRTLESISMNHP FT ATREATHPSFVRRAPQHLSSCSTMASSSKGCPHCQHDHALSSCYKFCRLPL FT SERFQIAIEKKVCHNCLRKGHLARNCASSSRCKHCGERHHSLLHRSSAVGT FT EPKLVYAEGQESTGRNDRYQAQSLNVTKHPIRSEEVFLLTVRLSIVDADGK FT EHSVRALLDCASQPNLMTEKLVKLLQLQRCPSNVKISGAGKISRDVRGSVF FT AEIRSKRQPFSCGVQFLVMDKLTSNLPSETVSVGHWCIPKGLELADPEFNT FT SQPVDLVIGVKHYYSFFPSAARVHLGDELPLLIDSVFGWIVAGSATLQCPE FT PQVTSSNAICMMSLEESIERFWKTDSLVMKDGYSPEERRCEQIFRDTTARN FT ETGRYIVRLPRHPDFGIRLGASKVSAVRRYDLLERRFAKNSKLKEEYHAFM FT KEYLELGHMSLVRDGDAVPAESYYLPHHPVFKESSTTTKIRVVFDGSSKTT FT SGYSLNDALCVGPVVQDDLLDQLLRFRTYKVALVGDIAKMYRQILLHPDDR FT PLVRILFRFEPQQPVQTYQLNTVTYGLAPSSFLATRALIQLADDEGNAYPR FT AGPALRKNFYVDDFIGGAQSVEEATCLRTELAELLQKGGFELRKWTSNCVD FT VLHGLDEAQVGTTTKMSFDSHEAVKTLGISWVPQGDWLVFEGVCQPDDDVI FT TKRSVLSAIAKMYDPLGMIAPIIIRAKMIMQEIWVSSRDWDESLPEDIVCK FT WKQFQKEIRSLSQYRMDRFILLPEARNIELHTFADASSAAYGACTYVRCED FT
AGRVRVMLLASKSKVAPLKNLTIARLELCACVLAAHLHHRIKGAIDVAVNA FT SYFWTDSAICMHWIKAPPSTWKTFVANRVAEIQHFTSGAKWRHVAGVENPA FT DLVSRGMEVSEFNNSRAWRHGPAWLEHSQDAWPMPNPQSIPEVAEQERKEL FT ISAVTVCHNELFLYWSSYTRLVNVVSYCMRFIAMLPHLRQLRGRRVLRSSE FT GNAQLKDVSSRKVLSARERAAAVNALTRLAQRESFAEELRDLQGGKRVKKQ FT SELKRLTPFVDEKGIIRVGGRLNLSQLPYQSKHPALLPKGHPLARLIAEHD FT HKMLLHGGGRLLLSVIREKFWPLNGRMLVKSVVRSCIKCIRYQPTLAEQHT FT GQLPAARIIPSRPFAVTGVDYAGPLYLKPAHRRAASLKAYLCVFVCFATKA FT VHLELVGDLSTDGFLAALRRFTARRGVPDHLHSDNGKNFEGARNELRELFA FT TLTSEAAQSTIASSCADQGISWHMIPPRAPHFGGLWEAAVKTAKRHLFRHL FT GSTRLSFEGYYTVLHQIEAAMNSRPLLPLTDDPNDLAALTPSHFLIGSSLT FT ALPDPDMTMAHTNAHEHLAKLQLLVQKFWKHWQKEYLQELQKDPRVARQAD FT QIQPGRMVILMDELLPSTRWPLARVIEVHPGPDGLVRVVTLRTIKGLIKRP FT IAKICPLPVEERGNNTLPAAT" XX SQ Sequence 5669 BP; 1408 A; 1286 C; 1506 G; 1469 T; 0 other; tttggtgccg tgaccaggat tggtgtattg tgaagtgaaa ttgttgctcg ttttttgtga 60 ttttgctgcg tgtaaaaatc tgttgacttt acgttttttc gtcaccgacc atagacaacg 120 aacagtttct ttcgaaattc gttgtccagt gcttgttgcg tgttgtgact tgattgttga 180 acattgtgta attgtttgat tggtgtgata attgatcata cgcacggaca gaacatttgt 240 gaatttgtgt ttcgtgttca gtgtcctgtg tttttagcgc catcgttcca tctgccatta 300 cgagactgat atggctaacg ctgtaaattt gctgtccaga cggcgaaccc tggaagaaaa 360 aattcagcga gttatagcat ttgctgataa ttttgtgcct gagcgggatg aatttaggct 420 tggcttgttc atctccgaca ccgaacgtgt tgcagcagag ttcgatacag tgcagcagtt 480 gatcgaggat ggagcagcac ccgaagcgcg cgaaatggag agccattttc gcgctacgac 540 tgaggacgcc ttaatggccg cgagggccag cctacaagcg ttgtcgcggc catcgcataa 600 tgttattccc gcctcatcga ccattgctac atctggagtg aggctgccaa ccatttctct 660 tccagaattt gacggcaatg agatgcaatg ggcgacattt cgggacactt ttgaagcact 720 aatccactgc aacgaagagg tgctaactat ccaaaagttc cattatcttc gagctgcgct 780 caaaggtgaa gctgcaaagt tgctggaatc gattccgttg tgtgcatcta actacaacat 840 tgcctggaaa tcgttggtgg acagatacgc caacgagtat ctacaaaaga agcgtcatct 900 acaggcaatg ttcaacatcg gcaaggtgac caaggaatcg aacgcatcgt tgcacaggct 960 ggttgacgat tttgatcgtc acgttaagat gctgcatcag cttggcgaac caacagcgca 1020 
atggagcacc gtgctagaat atgtgttgtg caccaagctt cccgatgaga cgctacggac 1080 gtgggaagat tatgcttcca ccctcagcag cccgaactac agcatgctaa ttgagttcct 1140 gcaaagaaaa atgagaacat tagaatcgat ttctatgaac catccggcaa cgagagaagc 1200 tactcatcct agttttgtac ggcgagcccc acagcacctt tcttcctgct caaccatggc 1260 gagcagttca aaagggtgcc cgcattgcca gcatgatcat gccttaagca gttgctataa 1320 gttttgccgt cttcctctgt ctgagcgttt tcagatagct attgagaaga aagtttgcca 1380 taattgctta agaaaaggtc atttggcaag gaactgcgct tcatcgtccc ggtgcaaaca 1440 ctgtggtgag agacatcact ctcttttgca tcgttcgtct gcagtcggta cggaaccgaa 1500 actcgtgtat gcggaaggac aagaatctac aggaaggaat gatcgttacc aagcacagtc 1560 gctcaacgtt actaagcatc ctattcgatc ggaggaagtg tttttgctca ctgttcgttt 1620 gagcatagtt gatgctgatg gtaaagagca ttcggtacgc gctttgctag actgtgcttc 1680 tcaacccaac ctcatgacag agaaacttgt caaattgctg cagctacaac ggtgtccttc 1740 taacgttaaa atatcgggag ctggaaagat atctcgtgac gttcggggat cagtgtttgc 1800 tgagatacgc tccaagaggc aaccattcag ctgtggtgtt cagttcctgg taatggacaa 1860 gctgacatcc aatttgcctt ctgagactgt aagtgtcggt cactggtgta tcccaaaagg 1920 cctcgagcta gctgatcccg aattcaacac atcgcagccg gttgatttgg tgataggtgt 1980 caagcactac tattcattct tccccagtgc agccagagtt catttgggtg atgagttgcc 2040 gctattgatt gatagtgtgt ttggttggat tgttgctggt tcggctacgt tacaatgccc 2100 ggaaccacag gtaacaagtt caaacgctat ctgtatgatg tcgctggaag agagcatcga 2160 acgattctgg aagacagatt cattagtgat gaaggatggc tactcgcctg aggaacgaag 2220 atgcgagcag atattccgtg atacaacggc gagaaatgag actgggcgtt atatcgtacg 2280 cttaccccgt catcccgatt tcggcatcag actgggtgct tccaaggtaa gcgcagtacg 2340 aagatatgat ctgttggaga ggaggttcgc taaaaattcc aagttgaagg aagagtacca 2400 tgcgtttatg aaggagtatc ttgagctcgg gcacatgagt ttagttcggg atggagatgc 2460 agtacctgct gagtcgtatt atttgccaca tcatcctgtg ttcaaggagt ctagcaccac 2520 aacgaaaatc agggtcgtgt tcgacggttc ttctaaaacc accagcgggt actcgttgaa 2580 tgatgctttg tgcgtgggac cagtggtgca ggacgacttg ctagatcagc ttttgcggtt 2640 ccgcacgtat aaggtggcat tagttggcga tatagcaaaa atgtaccgcc aaatacttct 2700 tcatcctgac 
gatcgaccgt tggtgcgaat cttgtttcgc ttcgagccgc agcagccggt 2760 gcagacctac cagctgaata ctgtaacgta tggtctcgca ccttcctcct ttctcgctac 2820 gcgcgctctt attcagctgg ctgatgatga gggtaatgca tacccacgag cgggccccgc 2880 tctacgaaag aatttttacg tcgacgactt catcggtgga gcccaatcag ttgaggaagc 2940 cacctgccta cgaactgaat tggctgagct actacaaaag ggcggatttg agctgcggaa 3000 atggacgtcg aactgtgttg atgtgctgca tgggctggat gaggcacagg ttggaaccac 3060 aaccaaaatg agcttcgatt ctcacgaagc cgtgaagaca cttggcatca gttgggttcc 3120 acaaggtgat tggctggtgt ttgaaggtgt gtgccagcca gacgacgatg tgatcaccaa 3180 gcgatctgtg ttatctgcca tcgcaaaaat gtacgatcct ttggggatga tagcgccgat 3240 aatcatccgt gctaagatga ttatgcagga gatatgggtg tcgtcacgtg attgggatga 3300 atcgttgcca gaagacatcg tatgcaagtg gaagcaattt cagaaggaga tacgatctct 3360 atcacaatat cgaatggaca gattcatact gcttcctgag gcgcgaaaca tcgagctgca 3420 cacttttgct gatgcatctt ctgcagccta tggtgcttgc acatatgtgc ggtgtgaaga 3480 cgctgggcga gttcgagtca tgcttttagc gtcgaagagt aaagttgctc cgttgaaaaa 3540 tctgactatt gctcggttgg agctgtgcgc ctgcgttcta gccgcgcact tgcatcaccg 3600 cataaaaggt gccattgatg tggccgtaaa tgcatcatat ttttggaccg actcagctat 3660 ctgcatgcat tggatcaagg cgccaccgag cacatggaaa acgtttgtgg caaaccgtgt 3720 agcggaaatc cagcatttca ctagtggtgc caagtggagg cacgtagctg gagttgaaaa 3780 ccctgctgat ttggtatcgc gtgggatgga ggtgtccgag ttcaacaaca gtcgagcgtg 3840 gaggcatgga ccagcttggc ttgagcattc gcaggatgct tggcccatgc ccaatccaca 3900 aagcattcct gaagtggcag agcaagagag gaaggagtta atttcggcag ttaccgtatg 3960 ccacaacgaa ttgtttctgt attggtcatc ttatactcgc ctggtgaacg ttgtcagcta 4020 ttgcatgcga ttcattgcca tgctccctca tctaaggcaa ttaagaggaa gaagagtttt 4080 acgatcctcc gaggggaatg cgcaactgaa ggacgtttca tctcgtaaag ttctcagcgc 4140 tcgggagcga gcagcagcag ttaatgcctt aacacgcctt gcccaacgtg agtcttttgc 4200 tgaggagcta cgcgatctgc aaggaggaaa gagggtcaaa aaacaatcag agctgaaaag 4260 actcaccccg tttgtggacg aaaagggtat catccgggtt ggtggacgac tgaatttgtc 4320 acagctgccg tatcagtcga aacatcccgc actgctgccg aagggtcatc cattggctcg 4380 attgattgct gagcatgatc 
acaagatgct tcttcatgga ggaggacggt tgctgctgtc 4440 agtcatacga gagaagtttt ggcctttgaa cggaagaatg ttggtcaaaa gcgtagttcg 4500 gagctgcata aaatgtattc ggtaccagcc tactcttgca gagcagcata ctggccagct 4560 gcccgctgcc cgaattatac caagccggcc ctttgctgtt accggggtcg attatgcagg 4620 tcccctgtac ctaaaacctg cgcataggcg cgcagcatcg ttgaaagcgt acctgtgtgt 4680 cttcgtgtgc tttgcaacaa aggcggtgca tttggagcta gtgggagacc tgtcaactga 4740 cggttttctc gcggctctac ggaggtttac ggcgaggaga ggtgttccgg atcatctcca 4800 ttcggacaat gggaaaaact tcgagggagc gaggaatgag ctacgagagc tttttgcgac 4860 tttgacgagt gaagctgcgc agagtaccat cgcttcatcg tgcgcggacc agggaatctc 4920 ttggcacatg attccaccga gagctcctca ctttggcggc ctttgggaag ccgcggtgaa 4980 aacggccaaa cgtcatctgt tccgtcacct cggaagtact cggctctcct tcgagggtta 5040 ctacaccgta ctgcaccaga tcgaggcagc tatgaactct cgtcctctct taccgcttac 5100 ggatgatccc aatgatttag ccgcactaac accttctcat tttttgatag gctcctcatt 5160 aacggcactt ccggaccctg atatgacgat ggcacacact aatgcacatg agcacttggc 5220 gaagctgcag ctattggtgc agaagttttg gaaacactgg caaaaggagt acttgcagga 5280 gttgcaaaag gacccgcgcg ttgccaggca ggcagatcaa atccaacccg gtcgaatggt 5340 tatcctgatg gacgagttgc tgcctagtac ccgttggccg ttagcgcgtg tgatagaggt 5400 gcaccccggt ccggacgggc tggtgcgagt agttaccttg cgtacgataa aaggtttaat 5460 taagcgtccg atagccaaaa tatgtccttt gccagtagaa gagagaggaa ataacacttt 5520 gccagcagct acgtaaccga aatgtccgta tttgtcgatt taatgtttgt gcaccgcgct 5580 tcgtcgttgt ataaagaaaa ccacgcggtt tatgtcaggt ctgagttagt gttagtgtta 5640 gtgttgatgg tacatcaagg cgaggagta 5669 // ID P3_AG repbase; DNA; ANG; 4394 BP. XX AC . XX DT 13-DEC-2002 (Rel. 7.11, Created) DT 21-JUL-2005 (Rel. 10.08, Last updated, Version 2) XX DE P3_AG, a P-like DNA transposon - a consensus sequence. XX KW P; DNA transposon; Transposable Element; P superfamily; P3_AG. XX NM P3_AG. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. 
XX RN [1] RP 1-4394 RA Kapitonov V.V. and Jurka J.; RT "P3_AG, a young family of P-like DNA transposons from African RT malaria mosquito."; RL Repbase Reports 2(11), 23-22 (2002). XX DR [1] (Consensus) XX CC The A. gambiae genome harbors many divergent families of P-like CC DNA transposons. One of the youngest families is P3_AG. CC Some P3_AG copies are identical to each other. It is possible CC that P3_AG is an active DNA transposon. CC P3_AG elements are flanked by 8-bp target site duplications. CC Terminal inverted repeats are 29 bp long. CC P3_AG encodes an 879-aa P-like DNA transposase, called P3_AGp. CC Predicted exon/intron structure is based on FGENESH and GENSCAN. CC The P3_AGp transposase is 43% identical to the P1_AGp transposase CC encoded by P1_AG. XX FH Key Location/Qualifiers FT CDS join(356..613,678..2198,2260..3117) FT /product="P3_AGp" FT /note="DNA transposase" FT /translation="MPRSCAAAFCKNNAENVKKRGLNITFHSFPSDDSLPK FT WIDFCKRDEHWKPTKISTVCSLHFKPDDYQMAKSSLPQTLPVLKRLKPYAI FT PSLIQPADFIQNEPSNMTAPLKECNQPNVEFQTDSEYFSDVENVPSQTIID FT MKRELDQVKEDNRKLIEVNTNLRDKLHSYFNENKRLKAEIDNLQKHISKKD FT AGIDEAALVTAMKERLKPTLSENQIDIILKKKKRVVWTKEEIGSALTLKYF FT GLRCYKYLAKDRKFPLPADATLKRYTKNLVVKEGILDDVLKLISNLTSTFT FT EKDRLCALSFDEMKVNRIIELDKASDEIIGPHNYLQVVMARGLCNKWKQPV FT YIGFDKKMTKEILLKIIEKLSEININVVAIISDNCSTNVSCWKELGAKDYE FT RPYFQHPTTLNNVYVIPDAPHLLKLLRNWFLDSGFTYNGKHIKADLLFDMI FT ASRNETEITPLYKLSKTHLVMTPQERQNVRRAAQLLSHTTAISLRRYFKNN FT AEATDLANFIEKVDLWFSISNSYSPFAKLDYKKSYTASDDQIKALDEMFEI FT VSNMTVIGKHSLQIFQKSLLMQITSLKLLYDDLHKRHNISFISTHKLNQDV FT LENFFSQLRQIGGVYDHPSPMSCIHRIKMIILGKAPTFLKNQTDLEPSTFS FT CTDEYISSQIRTSIEIENEGSEQANDGDIISASIITSALNQPPVKQSIQSD FT TLSSRSSEMLSSVSSSAIELPEQDSDGLEYIMGYIGRQCFEKFPHLNLGNL FT SLNLNSDHSYSHPPSFVKHLSVAGLFVPSEAFLKQGYKMEKIFQKLHPNGN FT FNKKRYISKRLVKRLQKEFPELPLIVVQQFAKHRINIRIKFLNMKIANEKR FT VNNKRKAPSHSKTAKKCEKLQISCFV" XX SQ Sequence 4394 BP; 1513 A; 768 C; 847 G; 1266 T; 0 other; caaagtgaat gaaagggagg tgagcttatg tagaacattt ggcttgcggg attttagaaa 60
aatgaaataa agtttgacag tttgtaaccg ggtcggctgg acggtttgtt tacatttgat 120 agctattgga ctggtccatg ttgttgatgc aacggcgaga agtgtgtgac taaaattcgt 180 gtaaaagaaa atagtacgct ttcgggtgtt aatattttat tataccaggt aaaatataaa 240 ttgtacaagt tgcagggaat gtattaaatg gatttatttt ctatttttag agaaaaagca 300 aatctggaag cagcgttgtt tgtagctgtg tccacagaag tttgtaggaa gcaacatgcc 360 tcgctcatgt gcagctgcat tctgcaaaaa taatgcagaa aatgtaaaga agcggggttt 420 gaacattact ttccactcgt ttccatcaga cgattctttg cctaagtgga ttgatttctg 480 taagcgggat gaacattgga aaccaaccaa aatatctact gtgtgctctc tccatttcaa 540 acccgacgac tatcaaatgg caaaatcatc tttaccacaa accttgccag tactgaagag 600 attgaaacca tatggtaagg aaagatatga gctttatgca gtaattatca tttcatcaca 660 ttctcatctc tttattagct attccatcat tgatacaacc agccgatttt attcaaaacg 720 agccatcgaa tatgacagcc ccattgaaag agtgtaacca gccaaatgta gaatttcaaa 780 cagattcaga atatttcagc gatgtagaga acgtgcccag ccaaacaatt atagacatga 840 aaagggaact tgatcaagtg aaagaagata atcgaaaact gatcgaagtg aatacaaatt 900 taagagataa actgcattca tacttcaatg aaaataagcg actaaaggca gaaattgata 960 acttacagaa acatatttca aaaaaggatg caggtataga tgaagctgca cttgtcacag 1020 caatgaaaga aagattgaag ccaacattat ctgaaaacca gatagatatt attttgaaaa 1080 agaaaaaacg tgtagtttgg acgaaagagg aaattggctc cgctttgaca ctcaaatatt 1140 ttggattgcg atgctacaaa tatttggcta aagatagaaa gtttccttta cctgcagacg 1200 caactttaaa acgatacaca aagaacctcg ttgtaaagga aggaattttg gatgacgttc 1260 ttaaattaat aagcaattta accagcactt ttactgaaaa agatcgcctt tgtgctctgt 1320 ctttcgatga aatgaaagtt aacagaataa ttgaactgga caaagcatcg gatgagataa 1380 ttggaccaca taactatctg caggtcgtga tggctcgagg actgtgtaac aaatggaaac 1440 aacctgtgta cataggattt gataagaaaa tgacaaaaga aatactcttg aagataattg 1500 aaaaactaag tgaaataaat attaacgttg tagctatcat cagtgacaac tgctctacaa 1560 atgtaagttg ctggaaagaa ctgggagcta aagactacga aaggccatat ttccaacatc 1620 ccacaacttt aaataacgtg tacgtaatcc ctgatgcacc tcatttatta aagctactaa 1680 ggaattggtt tttggatagt ggatttacgt acaacggaaa acatataaag gcagacctac 1740 tttttgacat gatagccagt 
agaaatgaaa cagagattac acctttatat aagttgagta 1800 aaactcattt agttatgacg ccacaagagc gtcagaatgt tcgacgtgcc gcacagctgc 1860 tctcacatac tactgctatt tccttgcgtc gttattttaa aaataatgct gaagctacgg 1920 acttggcgaa tttcattgaa aaagttgact tatggttcag catatcgaac tcctatagtc 1980 ctttcgccaa attagactat aaaaaatctt atacagcaag cgacgatcag ataaaagcat 2040 tagatgaaat gttcgaaata gtttcaaata tgaccgtgat tggtaagcat agcttgcaaa 2100 tttttcaaaa gtcgttgctg atgcagataa cgtctcttaa attgctttat gatgatcttc 2160 ataaaagaca caacatttcc ttcatatcca ctcacaaggt aattaatgac ataaaaaaca 2220 tcatatattc atattgatct atcatccttt ttaatttagc tcaatcaaga tgtactggag 2280 aattttttct cacagctaag gcagatagga ggggtatatg atcatccctc accaatgagc 2340 tgcattcatc gtattaagat gattatatta ggaaaagcac ctacgttcct taaaaatcaa 2400 acagacttgg agccatctac attttcttgt acagatgaat atatctcatc gcaaattcgg 2460 acatcgattg agattgaaaa tgaaggttct gaacaggcta atgacggtga tataatttca 2520 gcatcaatca ttacctcggc cttaaatcaa cccccagtta aacaaagcat acaatccgat 2580 accctgagtt caagaagcag tgaaatgtta agctccgtca gtagttccgc tatcgagctt 2640 cctgaacaag acagcgatgg actcgagtat attatgggtt atattgggcg tcaatgcttt 2700 gaaaagtttc cgcatttaaa tttgggtaat cttagtctga atttgaatag cgaccattcg 2760 tatagccatc caccttcatt tgtaaagcat ttgtcggttg ctggtttgtt tgttccttca 2820 gaagcttttt tgaaacaagg atacaaaatg gagaaaatct tccaaaaatt gcacccaaat 2880 ggaaatttta acaaaaaacg ttacatatca aaaagattag ttaagcgact tcaaaaagaa 2940 ttccccgagt taccgctaat agttgtacaa caatttgcta aacatcgcat aaatatacgt 3000 atcaaatttc ttaatatgaa aatagcgaat gaaaaaaggg ttaataacaa acgaaaagca 3060 ccatcacatt ccaaaactgc gaaaaaatgc gaaaaattac aaattagttg cttcgtttaa 3120 cggatttata aaacagacag aaatatgaat tacataaatt tacttataaa ttttgattat 3180 gtataaatta taaatgtata tattagtgaa taagcatttt ttttaataat atcatgatgt 3240 attataaacg gtgaaattaa ccaggttgat tctaatgata atattcattt agagactaac 3300 tattccgact agaacatata aatgaaataa atgaagttgt tcgcaatgtt gtttgtttcc 3360 ctagcaaaat accaatagca ggctccaggg cctgccaccg gcatgcggga aagttaactg 3420 gcaggccctg ggacctgcca tcgggctgaa 
cgtgttaaac gattggcacg gcgaccaatt 3480 cagaaaaaaa gatgttgagt atatttgtaa cagtcatgca tactgttgag catactattt 3540 tgcttctgat gattcttttg agtagcgcta gcatgtacct atgcacttgt tttttcaaca 3600 aataacgtat gtaatttcag ccgtcgtctg gtatcgtagc attctaagag tttggttagg 3660 ctatcttatt cggttctaat gctccagtgt tagatttcgc taatatttac gttgcctaag 3720 ctgtttagtt gtaaattgtt ggcctggaaa aataccaaaa atgagttata tagaaatgta 3780 gcaaatgaag aacttcaagt atcatcaagc tcgtaagatt acagtcaagc ttctgtgctg 3840 catatgagcc ggaatatttc atcgcctttg atgcgggact tcctttctgc agccgaggtg 3900 gaatgtaccg tcaccgtgat gcgatgcaga cgggaccata tacgccgagg ccgatgcaga 3960 aggtgatgcg cttaagggag ggagaatgac ttcctccctt gatgtgcggc tggtgtgaag 4020 cctggccaat ggagtcgcca acgatgaaac ggcagcaggg gccaatctct ttagggttca 4080 caaagcagca cgttcaactg taacgtaagc acgtaatccg aattccgtaa tgaattgttt 4140 ccaatatctg aaaataatac aaatcagtaa ctattccatg aagataagga attacattga 4200 tgagggactt acattagaat actgaaataa tgaaacaatc catacaatgc ctgttgaaga 4260 aataaaactg attgtgcgac gatgttatgc acaggaagaa aaggtaaaca ataccggtta 4320 tatagtgtca aatttgacaa tcgtcatggc gacctaaatc tttcgataag ctcacctccc 4380 tttcattcac tttg 4394 // ID GYPSY55-I_AG repbase; DNA; ANG; 4545 BP. XX AC . XX DT 09-JUL-2004 (Rel. 9.06, Created) DT 27-JUL-2005 (Rel. 10.08, Last updated, Version 2) XX DE GYPSY55-I_AG is an internal portion of retrotransposon GYPSY55_AG DE - a consensus sequence. XX KW Gypsy; LTR Retrotransposon; Transposable Element; 5-bp TSD gag; KW AP protease; GYPSY55-I_AG; GYPSY55-LTR_AG; Gypsy clade; RNase-H; KW integrase GYPSY55_AG; mag lineage; reverse transcriptase. XX NM GYPSY55-I_AG. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-4545 RA Tubio M.J., Costas C.J. 
and Naveira F.H.; RT "GYPSY55_AG, a member of the mag lineage of the Ty3/gypsy group RT of LTR retrotransposons in Anopheles gambiae."; RL Repbase Reports 4(6), 149-149 (2004). XX DR [1] (Consensus) XX CC GYPSY55_AG is a family of Gypsy-like LTR retrotransposons that, CC according to the amino acid sequence of its reverse CC transcriptase, RNase and integrase domains is phylogenetically CC grouped with representatives of the mag lineage of other CC organisms. GYPSY19_AG, GYPSY20_AG, GYPSY21_AG, GYPSY22_AG, CC GYPSY23_AG, GYPSY24_AG, GYPSY25_AG, GYPSY26_AG, GYPSY27_AG, CC GYPSY28_AG, GYPSY56_AG, GYPSY57_AG, GYPSY58_AG, GYPSY59_AG, CC GYPSY60_AG, GYPSY61_AG, GYPSY62_AG, GYPSY63_AG, GYPSY64_AG, CC GYPSY65_AG, GYPSY66_AG, GYPSY67_AG, GYPSY68_AG, and GYPSY69_AG CC are other members of this same lineage in Anopheles gambiae. CC The GYPSY55-I_AG consensus was reconstructed after multiple CC alignment of 4 copies. The consensus encodes the 1478-aa CC GYPSY55_AGP gag-pol like polyprotein (pos. 98-4531). The CC sequence of the LTRs flanking GYPSY55-I is deposited as CC GYPSY55-LTR_AG. 
XX FH Key Location/Qualifiers FT CDS 98..4531 FT /product="GYPSY55_AGP" FT /note="gag-pol" FT /translation="MSQEEDERRSSQGWQNVMAPVTVHAPQDTGQRPQVPL FT TQFVQQNGSVPPFSASSVQNVPESDNSATAQILKLMQQQMAQQQQILLQLM FT QQTKAPIEQTPLEQILDSLCGHIKEFRYDEENGSTFAAWYTRYEDLFLKDA FT ARLKDEAKVRLLMRKLGTSEHDRYTSYILPKHPRDYSLQETVAKLTSLFGT FT KESLLHRRYKCLKLTKLRTEDFITFACRVNRGIVDFELGRLTEEQLKCLVF FT VCGMKGEEDIEFRTRLLHRIEENQDVTLDQLSAECQRMTNLRQDSAMIERD FT RSEQVFSVQRNANRNQRTNTGPSHYTNEQRTGKPSRPCWLCGSLHWTRECT FT YKTHTCTQCGSVGHREGFCKQQRTKKQPHRKKTFRKRANMQSVTVNVNSLH FT HRRKFLSCSVNGSQIRLQLDTASDITVISRRLWKRIGSPQLFPSSVIAKSA FT SGDKLEMDGEFRATVGINGQEKQAVIFVAKGELALLGIDLAEAFSLWSVPI FT DQLCNNVTSTATTPEGIVRKFPGLFKSTMGFCAKAEVTLHLKPNCSPVFCA FT KRPVAYAMRDAVDAELDRLERLNIITPVQHSEWAAPIVVVRKSNGQLRICG FT DYSTGLNASLHPHDYPLPLPQDIYTKLGNSTIFSQIDLSDAFLQVPIAEQS FT RRLLTINTHKGLYLYNRLPPGIKVAPGAFQQLMDQMLAGMERVASYMDDVI FT VGGRTQREHDDVLNETLRRIQEYGFTIRPEKCSFNKRQVRYLGHILDNHGI FT RPDPAKIAAIKDLSAPTDVSGVRSFLGAVNYYGRFIPNMRKLRYPLDNLLK FT EGSSFKWSPECQKAFEQFKSILSSELLLTHYDPRREIVVSADASSVGLGAT FT IGHKFPDGTFKVVQHASRALTKAEKNYSQIDREGLAIIFAVTKFHNFIFGR FT HFTLQTDHKPLLRIFGSKKGIPVYTANRLQRFALTLQLYDFDINYVSTDNF FT GNADILSRLIRNHEKLEEDYVIASIGLEEDIRSVVVNSLSSVALNATDVAT FT ATKSDPIMSKVIQFVRQDWPRNSTFSGELACFYARKEALSEMGGCLLFGER FT VIIPKALRQRCLRQLHHGHPGVQRMKSIARSYVYWPKIDTDIAELVASCNA FT CASAAKSPPHASPVSWPEITAPWQRIHIDYAGPIDGFSYLIVVDAFSKWPE FT VIRTASTTSKATIRILNTMFARYGMPVTLVSDNGRQFISSEFEDFCICNGI FT EHLTSAPFHPQSNGQAERFVDTFKRAITKITSDGTAIEDALDTFLQTYRAT FT PNPQVPNNEAPATVMFGRQIRTCLELIRPVPKPQETNNDEQRRNFVPNDLV FT FAKIYSQNGWKWKPGRILRKCGNVMYRVMTEDHKIIRSHINQLRRRVPSNQ FT QSSKRDQHLLPLHILLDEWNLTPPSSSPDSSSSSSSSSPSSSSLSPPSDSA FT PCNLSPSSAESSSYRTVSQSSASPPPERDCIPEESHRGAQPVPPPRRSFRD FT RKAPRWFDPYLLY" XX SQ Sequence 4545 BP; 1272 A; 1154 C; 1051 G; 1068 T; 0 other; gttggcgacg aggattttta aagtttgtga cgaattttta tcggaagtga atttcgcgta 60 gtgcaaagct attccaccga caaggaaacg ccgcaggatg agccaagaag aagatgaacg 120 tcgatcatcg caaggttggc aaaacgtgat ggccccggtt acagtgcacg caccgcaaga 180 tacaggtcaa cgtccccaag 
tgccactgac ccagttcgta cagcaaaatg gttcggtccc 240 accattttct gcatcgtcag tacagaatgt tccggaaagt gataattctg caacggcaca 300 aattttgaaa ctaatgcaac agcagatggc gcaacagcag caaattttgt tgcaattaat 360 gcagcaaacg aaagcgccta tcgaacaaac gccccttgag caaatactcg attccttatg 420 cggtcacatt aaggaattca ggtatgacga agaaaacggt tcaactttcg ctgcatggta 480 cacgagatac gaggatcttt tccttaaaga cgctgcacgc ttgaaagacg aggcaaaagt 540 gcgtttattg atgagaaagt tgggaacctc tgaacacgac cgttacacca gctacatact 600 gccaaagcat ccgcgtgatt acagtttgca ggaaacagtt gctaaattga ccagcttatt 660 tggcacgaaa gaatccttgc ttcatcgacg atacaagtgt ttgaagctaa ccaaacttcg 720 caccgaggat tttatcactt tcgcgtgccg ggtcaatcgt ggcatcgtcg attttgagct 780 gggaaggcta acggaagagc agttaaagtg tttagtgttc gtttgcggta tgaagggcga 840 agaagacatc gagtttcgta cacgtctttt acatcgcatc gaggagaacc aggatgtaac 900 ccttgaccaa ctttctgcag agtgtcagcg tatgacgaat ttgcgtcagg atagtgcgat 960 gattgaacgt gaccgcagtg aacaagtatt ttccgttcaa cgtaacgcaa atcggaatca 1020 gcgcaccaac acaggcccgt ctcattacac caacgaacag cgtacaggta aacctagtcg 1080 accatgctgg ttatgtggat ccctgcattg gactcgcgag tgcacttaca agacgcacac 1140 atgcacacaa tgcggatcgg tcggccatcg cgaaggtttc tgtaagcagc aacgtaccaa 1200 gaagcagccc catcgaaaga agacttttcg taaacgagcc aatatgcaaa gtgtcaccgt 1260 gaatgtgaac agcctacatc atcggaggaa atttttatcg tgttcggtca acggttcaca 1320 aattcggctc caactggata cagcatcgga catcaccgtg attagtcgtc ggctttggaa 1380 acgcatcggt agcccgcaat tattcccatc atctgtaatc gcgaaatcgg catctggtga 1440 caaactcgaa atggatggcg aatttcgagc aacggttgga atcaacgggc aggagaagca 1500 agctgtaatt ttcgtggcaa aaggagaatt ggccttgctg ggaatcgacc ttgctgaagc 1560 tttctcctta tggtctgtgc ctatagacca gctgtgtaac aatgttacca gcacagctac 1620 tactccggag ggaatcgttc gaaaatttcc cggcctgttt aaatcaacga tgggtttttg 1680 tgcgaaagct gaggttacac tacacctgaa gccgaattgc agtccagttt tctgcgcaaa 1740 acgtcccgta gcttatgcca tgcgcgatgc agtggacgcc gaacttgata gactggagcg 1800 gctcaatatc atcacaccag tccagcactc cgagtgggca gcaccgattg tagtggtacg 1860 caagtcgaat ggacaactga gaatatgcgg agactattcg 
acgggactca acgcatcact 1920 ccacccccat gactatccgc tgccactgcc acaggacatc tacaccaaac tgggtaactc 1980 gacaattttt agccaaattg acttatctga tgctttctta caggtcccca ttgccgagca 2040 aagtcgccgc ttgctcacca taaacacgca taaagggttg tacctgtata atcgcctgcc 2100 accagggatt aaagtagcac caggcgcatt tcagcagctt atggaccaaa tgcttgctgg 2160 aatggaacgt gtcgcgagtt acatggacga cgttatcgtt ggtggacgaa cacaacggga 2220 acacgacgac gtactaaacg aaaccctgag acgcattcag gagtatggtt tcaccatccg 2280 cccggagaaa tgttcgttca acaaacgcca agttcgctac cttggtcaca tcctcgataa 2340 ccacggcatt cgtccagacc cagccaaaat agctgcgata aaggatcttt cagcgccaac 2400 cgacgttagt ggtgtacgtt cctttcttgg cgccgttaac tattatggaa gattcatccc 2460 gaacatgcgc aagcttcgat acccgttgga caacctactg aaagaaggca gttcgtttaa 2520 gtggtcccct gaatgccaaa aagccttcga gcagttcaaa agtatccttt catcggaact 2580 gctcctgacg cactacgatc caaggcgtga gatcgttgta tccgcggacg cgtcatccgt 2640 cggattaggc gcaacaatcg gtcacaagtt ccccgatgga acttttaaag ttgtgcagca 2700 tgcgtcccgt gcacttacaa aggcagagaa aaactacagc cagatagatc gggaaggctt 2760 ggcaattatc ttcgctgtga cgaaattcca taattttata tttggacggc attttactct 2820 acaaacagac cacaagccac ttttgagaat tttcggcagc aagaaaggca tccccgttta 2880 caccgccaat cgcttgcaga gattcgcgct caccttacag ctgtatgatt tcgacatcaa 2940 ttacgtatcg accgacaatt ttgggaacgc cgacatcctt tcgcgactga ttcgaaatca 3000 cgaaaagtta gaggaggatt acgtgattgc tagcatcggc ctagaggagg acatacgatc 3060 agtcgtcgtt aactcattga gttcagttgc attaaatgca acggatgtcg caacggctac 3120 taaatccgat cccattatga gcaaagtcat acagtttgta cgacaagact ggccacgtaa 3180 cagtacgttc agcggtgagc tggcatgctt ctacgctagg aaggaagctt tatcagagat 3240 gggaggctgt ctcttattcg gggagagagt catcatacca aaggctctcc ggcaacgatg 3300 cctgcgtcaa ctacatcacg gccatccagg tgtacaaagg atgaagtcga tcgcacgaag 3360 ctacgtctac tggccgaaga tagacactga tatcgctgaa cttgtagcat cttgtaacgc 3420 ttgtgcatcc gcagcgaaat cacccccaca cgctagcccg gtatcatggc ctgagataac 3480 tgcaccgtgg caacgtatcc acatcgatta tgctggcccc atcgacggtt tctcctactt 3540 aatcgtagtc gatgctttct ctaaatggcc agaggtaata aggactgcca 
gcacaacttc 3600 caaagcgacc atacgaatcc ttaataccat gtttgcgcga tacggtatgc cagtaaccct 3660 tgttagcgat aatggtcgcc aattcatcag ttccgaattt gaggattttt gtatttgtaa 3720 cggcatagag cacctcacat ccgccccgtt ccaccctcag tctaacgggc aagcggaacg 3780 cttcgtggat acattcaagc gcgccataac caaaatcacg agcgatggaa ctgcaataga 3840 agatgcactt gacacgtttt tacaaacata tcgcgccacg ccaaatcctc aagtgccaaa 3900 taacgaagcg ccagcgacag taatgttcgg acgacaaatt cgcacttgtt tagagcttat 3960 acgccctgtg cctaaaccgc aagaaaccaa taacgatgag caacgacgca atttcgtccc 4020 aaatgatcta gtcttcgcta agatttactc gcagaacgga tggaaatgga agccaggaag 4080 aatattgcgg aaatgcggca atgtaatgta ccgtgttatg actgaggacc ataagattat 4140 acgaagccat atcaaccaac tacgtcgtcg ggtcccttct aaccagcaat ccagtaagcg 4200 tgatcaacac cttttgccgc tacatattct cttggatgag tggaacctta cgcctccgtc 4260 gtcgtcacct gattcatcgt catcatcgtc ctcatcgtct ccatcgtcat cgtcgttgtc 4320 gccaccgtcc gattcagctc cgtgtaatct atcaccgtcg tccgctgaat catcatctta 4380 caggaccgtg tcccaatcat cagcatcacc acctcctgaa cgtgattgca tccctgaaga 4440 gtcgcatcga ggagcacagc cggtaccacc accacgccgc tcttttagag atagaaaagc 4500 accgcgttgg ttcgacccgt acctgctgta ttaaaagaag ggaga 4545 // ID HARBINGER-N1_AG repbase; DNA; ANG; 356 BP. XX AC . XX DT 28-FEB-2009 (Rel. 14.02, Created) DT 01-MAR-2009 (Rel. 14.02, Last updated, Version 3) XX DE A non-autonomous DNA transposon: consensus. XX KW Harbinger; DNA transposon; Transposable Element; HARBINGER-N1_AG. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-356 RA Jurka J.; RT "Non-autonomous Harbinger-like DNA transposons from African RT malaria mosquito."; RL Repbase Reports 9(2), 639-639 (2009). 
XX DR [1] (Consensus) XX SQ Sequence 356 BP; 104 A; 79 C; 74 G; 98 T; 1 other; aaggccgggc tacattgatc gtactcgcag gtgtaatttt gattttcact agcgcacctg 60 gcggcggctg accgaagcat ttttccaaac aatttggagc cgcttcagaa aatatgttat 120 ttttccatgt tttttacttt aactggactg gaaaagcttt gtaattgata gatgaatatg 180 ttgtgcacgt atttcagcca ttttcgcatc aaaaaacgat ggaaaaaaca acttaawttt 240 gagatccggc acgaccattc cctcgatgac caaaaaaagc ttccaccagc cgccgccaga 300 tgcgctagtg aaaatcgaaa ttacactagg gagtacgatc aatgtagccc ggcctt 356 // ID TELREP_AG repbase; DNA; ANG; 337 BP. XX AC . XX DT 14-SEP-2004 (Rel. 9.08, Created) DT 14-SEP-2004 (Rel. 9.08, Last updated, Version 1) XX DE Anopheles gambiae middle repeat (telomeric regions) - consensus. XX KW TELREP_AG; Telomeric region repeat; middle repetitive sequence. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RA Biessmann H., Kobeski F., Walter F.M., Kasravi A. and Roth W.C.; RT "DNA organization and length polymorphism at the 2L telomeric RT region of Anopheles gambiae."; RL Insect Mol. Biol 7(1), 83-93 (1998). XX RN [2] RP 1-337 RA Gentles A. and Jurka J.; RT "Anopheles gambiae middle repeat (telomeric regions) - RT consensus."; RL Direct Submission to Repbase Update (JUL-2004). XX DR [2] (Consensus) XX SQ Sequence 337 BP; 109 A; 63 C; 61 G; 104 T; 0 other; ttttttatca cccaaaatat tgaaaaagaa ccagaacaat taccaacttg cttttacata 60 acgtttttca aaaccattaa ccttgattta tggcattttt ctgatattcg aaaatttccg 120 cagctcaatg agttgcggat ttttcgccat acaaagcgga acgatcgaat ttttgtagca 180 tagttcattt tgctcgtgag tgtcatggta taccagtttt caaaaagacc agtaaaatga 240 acaccttggc ggcgaaatga gtgtcaaatg acaccaaaat gacagccgga tcgatcgaat 300 tttttcgatc gaaaaattgt gctgaggccg agtatta 337 // ID Ag-Jock-12 repbase; DNA; ANG; 4269 BP. XX AC . XX DT 29-OCT-2010 (Rel. 15.1, Created) DT 29-OCT-2010 (Rel. 
15.1, Last updated, Version 2) XX DE A Jockey clade non-LTR retrotransposon family from Anopheles DE gambiae. XX KW Jockey; Non-LTR Retrotransposon; Transposable Element; KW Ag-Jock-12. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-4269 RA Biedler J. and Tu Z.; RT "Non-LTR Retrotransposons in the African Malaria Mosquito, RT Anopheles gambiae: Unprecedented Diversity and Evidence of Recent RT Activity."; RL Molecular Biology and Evolution 20(11), 1811-1825 (2003). XX RN [2] RP 1-4269 RA Kojima K.K. and Jurka J.; RT "Jockey clade non-LTR retrotransposons from Anopheles gambiae."; RL Direct Submission to Repbase Update (24-SEP-2010). XX DR [2] (Consensus) XX CC [2] Consensus update. This consensus is generated from 12 CC sequences with >96% identity. XX FH Key Location/Qualifiers FT CDS 81..1385 FT /product="Ag-Jock-12_1p" FT /translation="MTKAKKKKREDTSPTVSDEGVNSAKHAPDKRAKIDGE FT NTEEENEVEDNDGFVCVERRKGRTVSPPPSTAPSPAAGPHKLSQRLPPIVV FT SNPRFQELQKMLAHIAKVEYQIVGVGTKVWTRNEEDYKAILTFLQTADCEF FT YTHDLPRDRPFKVVVRGLPLSDEKDIRDALEAEKLQVDAVYRIKRRHEGNG FT DSGSCLYLVHLKKGTITLAALKEIRALLSIRVHWEAFRGGRRGVTQCLRCM FT GFGHGSRHCGMRQRCLNCAQNHNVADCTVVPPTVPKCANCDGNHKANDPEC FT PQRARFQTIRQQATQRRPKTSAHGRNMVPPPPLSSYPPIGRNGSNAASAAL FT AAPAAGVRAPVRDKTPAIVPRIPPGFQYSTALRMAGTDQTASQPNVSSEET FT ALYDAATLMQIFSTMVERLNSCRSRREQVAVIGELIIRYGC" FT CDS 1378..4026 FT /product="Ag-Jock-12_2p" FT /note="apurinic-like endonuclease and reverse FT transcriptase."
FT /translation="MDAKLRIATWNACSVKAKKIPLLDFLQRRAIDVIMIS FT ETFLRPEDNLSFANYHLIRLDRTTGHPGRGGGVAIIIRRGIIFRQLPHYRT FT GIIEAIGIELQTSSGGIILVAAYCPVQCNRRNGLAAAFKRDLITITRSRSR FT FIVGGDLNARHPAWNNNRRNMNGVALFEHAQEGHYAVHFPDTPTFPRSGST FT IDFFLSNVDLDKPTTLDELISDHFPIVTEVVCSISKAPLVFRKNYHNVEWV FT RFGRLVDNDVTDTNVRTVEDIDRAINEFDRAIKKAEAACVREYPVRGEYIE FT LDPHTRALIVERNRLRRQYHNTGNVTCKSLAAAIARQISARLEVIRNENFG FT HSMARMDTRAPAFWKVTKVLRKRPKPIPPLVNTVDNLMAVTPEAKSNALAK FT QFASAHTLGANMPSPHEAEVAASLSLVDEAQSTVPPEGRVTTGRIRTALRR FT LKNMKAPGFDGVLNILLKHLQERAICLISNIFHRCLELGYFPDAWKCARVV FT PILKPGKDPTNPGSYRPISLLSSLGKLFERVILESLQETVETLDLIPEEQF FT GFRPGHSTTHQLFRLKQLFEKNKREARSTVVAMLDVEKAFDNVWHGGLIHK FT LVTFGIPMYLTKIISNYLQQRTFRVALGTTLSDVHLLPAGVPQGSLLGPLL FT YILYTADIPALPCDGKIFLFADDTAIAVKGRSLTEVSRRAQRCLDTYIQYA FT SSWKTRVNVAKSQVMIVPHRRKRSLIPTMHSRVEVNGVMVEWTNKVKYLGL FT TLDSKMLSHGQVESILRQGQTLLMSLYPLISRRSRLSFVNKLAVFKQIIQP FT AVLYGCQVWGTCARVHRQKIQVMLNKMLRMIANCDRYTRNSDLYDICEVAP FT LDSYIQQQTTKLIEKCRDSTHRAVREIASI" XX SQ Sequence 4269 BP; 1224 A; 1018 C; 1019 G; 1005 T; 3 other; gagttcgctt gtgaccatcg tcgcgtacag acgtgttctt gtatcgtgcg accaagtgaa 60 tagcaaacga aaagcgtgcc atgacgaagg cgaagaagaa gaagagagaa gataccagcc 120 caaccgtctc cgatgaaggc gttaattcgg cgaaacacgc gccggacaaa cgcgctaaaa 180 ttgatggtga aaacactgaa gaagaaaatg aagtggaaga taacgatggt ttcgtctgtg 240 tggagcgtag aaagggacgg actgtctccc caccaccttc aactgctcca tcaccagcag 300 caggcccaca caagctgtct cagcggctac cacccatcgt agtgagcaac ccacgtttcc 360 aagagttgca aaaaatgctt gcccacatcg caaaggttga gtaccaaatt gttggtgttg 420 gcacaaaagt ttggacacga aatgaggaag actacaaggc cattctcacg ttcctgcaaa 480 ctgccgattg tgagttttat actcatgatc tgccccgaga tcgtccgttc aaagtggtag 540 tgcgagggct ccccctttcc gacgagaagg acattcgaga tgccctggaa gctgaaaagc 600 tccaagtaga tgcagtctac cgcatcaaac ggcgtcatga gggaaacgga gatagtggaa 660 gctgtctata cctcgtacat ctcaagaagg gtaccataac gctggctgcc ctaaaagaaa 720 ttcgtgccct cctatctatc cgtgtccatt gggaggcatt ccgaggtgga cgacgaggtg 780 tgacccaatg tttacggtgc atgggctttg gccacggctc gcggcactgc 
ggcatgcgtc 840 aacgttgctt aaactgcgca caaaaccaca acgtcgcgga ttgcactgtg gtccccccga 900 cagtaccgaa atgtgctaat tgtgatggaa atcacaaagc aaatgatccg gagtgccctc 960 aaagggcgcg tttccaaacg atccggcaac aggcgacgca acgtcgtcct aaaacatcag 1020 cgcacggaag gaacatggtg cccccaccac cactatcttc ctacccgccc atcggacgaa 1080 atggttcgaa tgccgcctca gccgctctag cagcacccgc tgccggtgta cgtgctcctg 1140 ttcgcgacaa gactccagca atcgttccca gaattcctcc gggttttcaa tattcgacag 1200 cgcttagaat ggcaggaacg gaccaaactg cctctcagcc aaatgtatcg tccgaggaaa 1260 cagcactata tgatgctgca actcttatgc agattttctc cacgatggtg gaacggctta 1320 atagctgtcg ctcccgtcga gaacaagtcg ccgttatcgg agagctcata attcgctatg 1380 gatgctaaac tccgaatagc aacatggaat gcctgttcgg tgaaagctaa gaaaatcccg 1440 ttattggatt ttcttcaacg acgggctatc gacgttataa tgatatctga aacattcctc 1500 cgccccgagg acaatctttc tttcgccaac tatcatctaa ttcgattgga tcgtacaact 1560 ggacaccctg ggagaggagg aggagtggcc atcatcattc gccgtgggat catattccgg 1620 caattgccgc attaccggac gggtatcatt gaggcgatcg gcatagaact acaaacatcg 1680 agtggcggca ttatccttgt tgcggcgtac tgtccggtac agtgcaaccg cagaaatgga 1740 cttgcagctg cattcaaaag agatctcatc accatcacga gatctagatc gagattcatt 1800 gttggaggcg acttaaatgc acgtcacccg gcgtggaaca acaacagaag aaacatgaat 1860 ggtgtggctc tcttcgagca tgcgcaagaa gggcactacg ctgttcattt ccctgatacg 1920 ccaacgtttc cgaggagcgg gtctactatt gattttttcc tgtcaaacgt tgatcttgac 1980 aaaccgacaa ctctagatga gttaatctca gaccattttc cgattgtgac agaggttgtt 2040 tgctccatat ctaaagctcc tctcgtcttt cggaaaaatt accacaacgt tgaatgggtg 2100 cgatttggta ggctagttga taacgacgtt acggacacaa acgtccgtac tgtggaagac 2160 atagacaggg caattaatga atttgaccgt gccattaaaa aagcagaagc agcttgcgtt 2220 agagaatatc ctgtacgagg tgagtacatt gagctagatc cccatacacg tgcgcttatc 2280 gtggaacgta atcgacttag gcgccagtat cataacactg gaaatgtgac ctgtaaaagt 2340 ttagctgcgg ctattgcaag gcaaatatcc gctcgtctgg aagtgattcg aaatgaaaat 2400 ttcggacact ctatggcaag aatggacaca agggccccgg cattctggaa agtgaccaag 2460 gtcttgagga aacggcctaa gcctatccct cccttagtca atactgttga caacctaatg 
2520 gcggttacac ctgaagccaa atcgaatgca cttgcaaagc aatttgccag tgcgcacacg 2580 ctaggcgcaa atatgcctag cccacatgag gcagaagtag cagctagctt atccctagtc 2640 gatgaggcgc agtccacggt acccccagaa ggtagagtaa caactgggcg gatcaggaca 2700 gcgctgaggc ggctgaagaa catgaaggct ccggggttcg atggggtcct taacatccta 2760 ctaaagcacc tccaagagag ggccatttgt ttgataagca atatctttca caggtgtctt 2820 gagttgggct actttcctga tgcatggaag tgcgcaagag tagtgcccat tctcaaaccc 2880 ggtaaagacc ccacgaatcc cggaagctat cgcccaatca gtctcctttc ctcgctagga 2940 aagttgtttg aaagagtgat tcttgagtcc cttcaggaaa ctgtggagac acttgatctt 3000 atccctgagg agcaatttgg cttccgtccc ggacactcta ctactcacca gctatttagg 3060 ttaaaacaac ttttcgaaaa gaacaagaga gaggccaggt ccacggtagt cgcaatgtta 3120 gatgtcgaaa aggcatttga caatgtctgg catggtggac ttatacacaa acttgtcacg 3180 tttggcattc caatgtacct tacaaagata atttcaaact atctacaaca aaggacgttc 3240 agggtagccc tgggtacgac gctctcagac gtacacttac ttccggccgg ggtgccacag 3300 ggtagtctgc tgggtccact attgtacatc ctgtacactg cggatattcc ggcactacct 3360 tgtgatggaa aaattttctt attcgcagat gacactgcca ttgcggtgaa gggtagatct 3420 ctgacagagg tgagtcgtcg tgctcagcgc tgccttgata cttatataca atatgcttca 3480 agctggaaaa cccgtgtcaa cgtggccaag agtcaggtca tgattgtccc acatcgacgc 3540 aaacggtctt tgatcccaac tatgcacagc agggttgagg ttaacggcgt tatggtcgaa 3600 tggactaaca aggtaaaata tttgggcctc acgttggata gtaagatgct ttcccacgga 3660 caggttgaga gtatcttacg acagggtcaa actctcttaa tgagtttata cccactgata 3720 tcgcgcaggt ccagactttc ctttgtgaac aaactagctg tctttaaaca aatcattcaa 3780 ccagcggtac tctatggttg ccaggtttgg ggaacatgtg ccagggtaca tcgccaaaaa 3840 atacaggtaa tgctgaacaa aatgctaaga atgatagcaa attgcgatcg atacactagg 3900 aattcagatc tttatgacat atgcgaagta gctccactcg atagctacat acaacagcaa 3960 acgacaaaac taattgaaaa atgcagagat tctacacaca gggcagtacg tgaaatcgcc 4020 tctatttgaa tttttctttt aggtttaggg tagattaggt taagatagga taattaggtt 4080 aaggtaaaac tctcagttct attaatagaa caagctwcta cagtgtatca aaatacagca 4140 agaaagtgaa gccccaaaat aggacgwata agaaaaatam cctgtaaagc atacgtgctg 4200 
catcttgtaa aacttggaaa atcaactgta aaggaaatca aagtacaaat atacgtttaa 4260 ctaactaac 4269 // ID GYPSY4-I_AG repbase; DNA; ANG; 4427 BP. XX AC . XX DT 08-MAY-2003 (Rel. 8.04, Created) DT 21-SEP-2005 (Rel. 10.1, Last updated, Version 2) XX DE GYPSY4-I_AG is an internal portion of the GYPSY4_AG LTR DE retrotransposon - a consensus sequence. XX KW Gypsy; LTR Retrotransposon; Transposable Element; 4-bp TSD; KW AP protease; GYPSY4-I_AG; GYPSY4-LTR_AG; GYPSY4_AG; Gypsy clade; KW gag; integrase; reverse transcriptase; GYPSY34-I_AG; KW GYPSY34-LTR_AG. XX NM GYPSY4-I_AG. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-4427 RA Pavlicek A., Kapitonov V.V. and Jurka J.; RT "GYPSY4_AG, a family of LTR retrotransposons from African malaria RT mosquito."; RL Repbase Reports 3(4), 77-77 (2003). XX DR [1] (Consensus) XX CC GYPSY4_AG is a family of autonomous Gypsy-like LTR CC retrotransposons. CC GYPSY4-I_AG, an internal portion of GYPSY4_AG, is flanked by CC GYPSY4-LTR_AG LTRs. The GYPSY4-I_AG consensus sequence was CC reconstructed based on multiple alignment of 5 copies; they are CC less than 2% divergent from the consensus sequence. The A. CC gambiae CC genome contains about 10 copies of GYPSY4_AG. CC The consensus sequence encodes the 1423-aa GYPSY4_AGp protein CC (pos. 119-4387), composed of gag (zinc-finger, 264-379), protease CC (417-502), reverse transcriptase (pos. 628-797) and integrase CC (pos. 1145-1290) domains.
XX FH Key Location/Qualifiers FT CDS 59..4387 FT /product="GYPSY4_AGp" FT /translation="PACIRVYLCTCVYDTTPFSIMLRKKELRRALIVDVPD FT TATVTQLRQLYASHEPVARSPRAAPPTTSATTPAPACANHQDAAILCLPHY FT NGDDDFAHHENVANAAASTNNTTDAVSALPSAHGVAAALPRGPDDIEAQFE FT KLRQQQQLAELRQKVHQLETQQPAALCVKDFEAFIEPLDADKNPNVIRWFR FT DLERLFALYRVRDADKFFFTLRLLTGTAANVAKELVVTTYDELKKELIDNL FT HVVATPESVYRQLRNRRLRPQESALHYLFDMQRIAGQASIADSELIPIVID FT GLGSPSITSSLHFMPLTMDDFRKKLKLFESCRHLCTTQPPSADARATTNSR FT MERPRPSQEPIRCFNCSRFGHLQNACPRPKRPPGGCFRCFQTGHVYRNCPE FT RRANATVEGNTSSDEALATNQEVSLTFFHPSAKRTTLPCVRSLLDTGSPVS FT FISDTIVPVKMLGPLSATEYCTMIKGPLYSRGKIDCTIRFKNHSVRHSFII FT LPGIAWPVIIGRDLLNSLNIFLTYSSLTTSCITKPLSTELKEVDTILPEKL FT DDAIRSICALDVAEADNELDLGKTLSLEQRSIVNSIVENSYLNYTSDVIPL FT KHPMKINLTHDTPIFTKPRRLSYGERQQVKQIVDKLLAENIIRPSNSPYAS FT ALVLVRKKSGEVRMCVDYRPLNKITVRDNYPLPLIETCLEHLCGKKFFSLL FT DLKSGFHQVPMSEESIPYTSFVTPDGQFEYLKMPFGLRNAPSEFQRFINSI FT LREFIDDGRIVVYLDDIIIASTDLSSHFSTLRSVLEKIKQNNLELRLDKCK FT FVHEEIEYLGYKANFSGIQPSDRHIKALTNYPMPTNLKQLRRCLGLFSYFR FT RFVPSFSCIAKPMTNFFRRTKYLTSIQIACMLLKPYVTNLCILLSFPYSTQ FT NGKPNYTVTQVPLVLALFSFRNRMTISCTLLLTFPKPLQKTSPSYTVMSLK FT LFPSFTLLSASTLMFMGSPLVTDCNSLVETLKNRNASAKIARWSLFLENYD FT YTICHRSGTSMPHVDALSRTEAVGAIGEIDLDFQLQVAQTRDPSIEALKHR FT LESEEVDGFLLQDGLVYRDIPDGQPQLYVPSEMVDNVIRHTHERIGHLGIN FT KTFSKISQHYWFPHMKPTIDKFIKNCLKCIVYSAPHHTNARNMYSIPKEPL FT PFDTIHIDHLGPLPSSSLRKKYILVVIDAFTKLTKLYPTSSTNAKEVCSAL FT SQYMSYYSRPRRIVSDRATCFTSTLFEDFLESHNISHVLNATGSPQANGQV FT ERVNRVLRPILSKLSDAPDQTDWVSKLRSAEYALNNTVHTSTNFCPSVLLF FT GVEQRGKVPDELAEYLDEKFDRASRDLEAIRAKALENIEESQRKNEEYFSK FT KHKPPQCYKEGDLVAIRYSDTTDSGNKKLNPKFRGPYVIHKVLPHDRYVVR FT DVEGCQLTQLPYDGVLEANKLRRWTESSD" XX SQ Sequence 4427 BP; 1170 A; 1185 C; 898 G; 1174 T; 0 other; tctcagaagt gggattacca acaaaaatcg cctacaaaac cgcctgcaag ccgcctaacc 60 agcctgtatc cgtgtgtacc tgtgtacgtg tgtgtatgac acaacgccat tttccatcat 120 gctgaggaaa aaagaacttc gtcgggcgct tatcgtcgac gtgccggata ccgctaccgt 180 cacacaactc cgacagcttt atgcctccca cgagccggtc gctcgttccc cccgtgcggc 240 gccgcccacc 
acctcagcga cgacaccggc tcccgcttgt gcaaaccacc aagatgccgc 300 cattttgtgc cttccacatt acaatggcga cgacgatttt gcgcatcacg aaaatgttgc 360 gaacgctgcc gcttcgacga acaataccac tgatgccgtt tccgcccttc cttctgccca 420 tggtgtggcc gccgcccttc cccgcggccc tgacgacatc gaggcccaat ttgagaagct 480 gcgacagcag cagcagctag ctgaattacg ccaaaaggtg caccaacttg aaacgcagca 540 gccagccgcc ctttgcgtaa aggactttga agctttcatc gagccactcg acgccgataa 600 gaaccccaat gtcatccgat ggttccgcga cttggagcgt ctctttgcac tttaccgagt 660 gcgcgatgca gataaatttt tcttcaccct tcggctcctc accggcacag ccgctaacgt 720 cgcaaaagaa cttgttgtca ccacttatga tgagttgaag aaagagttga tcgacaatct 780 tcacgtcgtt gctacgcccg aatctgttta tcgccaactc cgtaaccgtc gattgcggcc 840 ccaggaatcc gccctgcatt acttgtttga catgcagcgc atcgcaggcc aagccagtat 900 cgccgattca gaactgatcc cgatcgtcat cgacggcttg ggaagcccgt caattacgtc 960 gagtctgcat ttcatgcctc ttacgatgga cgacttccgg aagaaattga aacttttcga 1020 atcttgccgt catctttgca ccacccagcc cccttccgct gatgcccggg ccacaacgaa 1080 cagccgtatg gaacggcccc gcccatcgca ggaacccatc cgctgcttca actgctcccg 1140 attcggacac cttcagaacg cgtgcccgcg gccgaagcgc ccacccggcg gatgttttcg 1200 ttgtttccag actggacacg tctaccgtaa ctgccctgaa cgtcgggcca acgccactgt 1260 cgagggcaat actagttcgg acgaagctct cgccacaaat caagaggtga gtttgacatt 1320 tttccaccct tctgctaagc gtaccaccct tccctgcgtt cgttcccttc tcgacacagg 1380 aagtcctgtg agcttcatta gcgacacgat agtaccagtt aagatgctag gacctctttc 1440 cgctaccgaa tactgcacta tgattaaggg accactttac tctcgaggaa aaatcgattg 1500 tactatccga tttaagaatc attccgttcg acactctttt attatattac ctggaattgc 1560 gtggccagtc attatcggtc gcgatttact gaactcactt aatatttttc ttacgtattc 1620 atctcttaca acttcatgta ttactaaacc tctatcgacg gaacttaaag aagtagatac 1680 gattcttcca gaaaaattag acgatgctat taggagtatt tgtgcgctcg atgtggctga 1740 agccgataat gaactggatt taggaaaaac actatctttg gaacaacgtt caatagtcaa 1800 ttctattgtt gaaaattcat acctcaacta tacttcagat gttataccgc tcaaacaccc 1860 tatgaaaatc aatctgactc atgatacacc aatatttact aagccgcgaa gactctctta 1920 tggtgaaaga cagcaggtta agcaaattgt 
tgataaactg ttagcagaaa acatcatccg 1980 gcccagtaat tctccttatg cttctgcgct tgtcctcgtt aggaaaaaga gtggcgaggt 2040 tcgtatgtgt gtggattacc ggcccctcaa caaaattaca gttcgggaca attaccccct 2100 accccttatc gaaacttgtt tggagcatct gtgtggaaaa aaattcttca gtttgctgga 2160 tttgaaaagc ggattccatc aagtcccaat gagtgaggag tctatcccct acacttcttt 2220 tgtgacccca gatggtcaat ttgaatatct gaaaatgcca ttcggtcttc gtaacgcccc 2280 ttccgaattc caacgtttta ttaattctat cttaagggaa ttcattgatg atggcagaat 2340 agtagtgtac ctcgatgaca tcatcatcgc ttctaccgat cttagctctc acttcagtac 2400 ccttcggtcc gtattagaaa agattaagca gaataattta gaacttcgtc ttgacaagtg 2460 caaatttgtc catgaagaaa tagaatactt gggctacaaa gctaactttt ctggaattca 2520 gcctagtgat aggcacatta aagcacttac taattaccct atgcccacta atttaaagca 2580 actcagacgt tgtcttggtc tgttttcata cttccgacgg tttgttccat ctttctcttg 2640 catcgctaaa cctatgacaa acttcttcag aaggacgaag tatttaactt cgattcaaat 2700 tgcgtgcatg cttttgaaac cttacgtgac aaacttgtgc attctcctat cctttccata 2760 ttcgacccaa aacgggaaac cgaattacac tgtgacgcaa gttcctttgg ttttggcgct 2820 attctccttc agaaacagga tgacaataag ttgcaccctg ttgcttactt ttccaaaacc 2880 acttcaaaag acgagtccaa gttacacagt tatgagcttg aaactctttc catcatttac 2940 gctcttaagc gcttccacac ttatgttcat gggctcccca ttagttactg actgcaactc 3000 tctggtcgag acccttaaga accgtaatgc ttccgctaag attgccaggt ggtccttgtt 3060 tctggaaaat tacgactata ccatctgtca tcgctcaggc acttctatgc cccatgtcga 3120 cgcactgagt cgcaccgaag ctgtgggtgc catcggtgag attgaccttg acttccagct 3180 tcaagtagct cagacgcgtg acccatctat cgaagctctc aaacatcggt tagaatcaga 3240 agaagttgac ggattcttac ttcaagatgg gcttgtctat cgcgacatac ctgatggtca 3300 acctcaattg tatgtccctt cggaaatggt cgacaacgta attagacaca ctcacgagcg 3360 aattggccac ctgggcataa acaaaacctt cagcaaaatc agtcagcatt actggttccc 3420 ccacatgaag cccactatcg acaaattcat taagaactgc ctcaagtgca ttgtttattc 3480 tgcacctcat catactaatg cccggaatat gtacagcatc cctaaagagc ccttaccctt 3540 tgataccatc catattgacc atttaggtcc gctccctagt tcttccttac gcaagaagta 3600 tatacttgtt gttatcgatg ctttcactaa attaaccaaa 
ctttacccaa cctcctcaac 3660 taatgcgaag gaagtgtgtt ctgccctttc ccaatatatg tcttactata gccgccctag 3720 gcggattgtt agcgatcgag ctacttgttt cacctcaacc ttgtttgagg acttcttgga 3780 atcgcataac attagccatg tcctcaacgc caccggatcc ccacaagcca atggacaggt 3840 agaacgggtg aaccgtgtgt tgcgtcctat ccttagcaaa ctatctgatg ctccagacca 3900 gaccgattgg gtatccaagt tgcggtcagc cgaatacgct ttaaacaata ccgtccacac 3960 atctacgaac ttctgcccct ctgtcctact ctttggtgtc gagcaacgcg gtaaagttcc 4020 agacgagtta gccgaatacc tggatgagaa atttgatcga gcctctaggg acttagaagc 4080 cattcgggct aaagcgttag aaaacataga agagtctcaa cggaagaatg aggaatactt 4140 tagcaaaaag cacaaaccac cacagtgcta taaggaaggt gacttagtgg ctatacgtta 4200 ctctgatacg accgatagcg gtaataagaa gctcaatcct aaattcaggg gaccttacgt 4260 catccataaa gtgttgcccc atgataggta cgtggtacgc gatgtagaag gatgtcaact 4320 cacacaacta ccctacgatg gggttctaga agcgaataag ttgcgacgtt ggaccgagtc 4380 cagtgattag gaaattgagg gcaatttatt gttcaggata gccgagc 4427 // ID GYPSY67-I_AG repbase; DNA; ANG; 4715 BP. XX AC . XX DT 09-JUL-2004 (Rel. 9.06, Created) DT 28-JUL-2005 (Rel. 10.08, Last updated, Version 2) XX DE GYPSY67-I_AG is an internal portion of retrotransposon GYPSY67_AG DE - a consensus sequence. XX KW Gypsy; LTR Retrotransposon; Transposable Element; 5-bp TSD gag; KW AP protease; GYPSY67-I_AG; GYPSY67-LTR_AG; Gypsy clade; RNase-H; KW integrase GYPSY67_AG; mag lineage; reverse transcriptase. XX NM GYPSY67-I_AG. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-4715 RA Tubio M.J., Costas C.J. and Naveira F.H.; RT "GYPSY67_AG, a member of the mag lineage of the Ty3/gypsy group RT of LTR retrotransposons in Anopheles gambiae."; RL Repbase Reports 4(6), 173-173 (2004). 
XX DR [1] (Consensus) XX CC GYPSY67_AG is a family of Gypsy-like LTR retrotransposons that, CC according to the amino acid sequence of its reverse CC transcriptase, RNase and integrase domains is phylogenetically CC grouped with representatives of the mag lineage of other CC organisms. GYPSY19_AG, GYPSY20_AG, GYPSY21_AG, GYPSY22_AG, CC GYPSY23_AG, GYPSY24_AG, GYPSY25_AG, GYPSY26_AG, GYPSY27_AG, CC GYPSY28_AG, GYPSY55_AG, GYPSY56_AG, GYPSY57_AG, GYPSY58_AG, CC GYPSY59_AG, GYPSY60_AG, GYPSY61_AG, GYPSY62_AG, GYPSY63_AG, CC GYPSY64_AG, GYPSY65_AG, GYPSY66_AG, GYPSY68_AG and GYPSY69_AG, CC are other members of this same lineage in Anopheles gambiae. CC The GYPSY67-I_AG consensus was reconstructed after multiple CC alignment of 5 copies. The consensus encodes the 1529-aa CC GYPSY67_AGP gag-pol like polyprotein (pos. 117-4703). The CC sequence of the LTRs flanking GYPSY67-I is deposited as CC GYPSY67-LTR_AG. XX FH Key Location/Qualifiers FT CDS 117..4703 FT /product="GYPSY67_AGP" FT /note="gag-pol" FT /translation="MNSGDGAMEVPMDEDQRRSSLGRIDYRAMTAQQTGEV FT PSVYDPPGHVQPGSRGYVSQQGQLTPRPVQLGPSPSQPAPSQSAPLPSVQN FT GQMGQQPLDNAVLQQTLHLLQQQLKQQQQLISQMLQQQQSAPPAQLQPAQQ FT YQPAGPSNPELILDALANSIAEFRYEAESGVTFEAWFTRYEDLFAKDASRL FT GDEAKVRLLVRKLGTPEHARYISYILPRSPRDLSFEETVDKLTALFGCRES FT LLSKRYRCLQICKKRTEDLIAFSCRVNRACVEFQFASMNEETFKCLMLVCG FT LKDEADNDLRTRLLARIEERNDVTLEQLSAECQRITSVKGDSAMIAGETSE FT RVFAVHSGEKRSHEKAAQQTNYKRFTPYRTKRPFRAKYAVCSSTSKPAKPC FT WLCGDMHWVRECTYRSHKCLDCARYGHREGHCNTASRKKRFNVRQRNINTR FT VVTVNVRSIRERRRFVSIALNGTAVRLQLDTASDISVIDRRTWRKIGSPPL FT TPSSVTAKTASGATLVLDGEFSCAVSVGSQTRQATLSVCGAANLLLLGADL FT IDVFSLWSVPMDAFCNHVTVAGQQSFQQLFPKVFTGTGLCTKASIKFTLRD FT NVRPVFRPSRPVAYAMEETVSRELDRLEELNVITPVTTAEWAAPVVVVRKA FT NGLVRICGDYSTGLNAALFPHDYPLPVPEDIFARLANCKVFSKIDLSDAFL FT QVEIDPEYRHFLTINTHRGLYTYNRLPPGIKIAPTAFQQLMDIMLSGIQGV FT SVYLDDIIIGGPSEAEHDATVVEVLNRIQNYGFTLRAEKCHFRVNQIKYLG FT HIIDSHGLRPDAQKVEAIRKLPEPTNLTEVRSFLGAINYYGRFVPNMRNLR FT YPLDDLLKAGVEFRWTSECRKAFESFKTILASDLLLTHYDPKQAIIVSADA 
FT SSVGLGATISHKYPDGSIRVVQHASRALTAAEKAYSQIDREGLAIIFAIKK FT FHKMIFGRKFLLQTDHRPLLRIFGSKKGIPNFTANRLQRFALTLLAYDFEI FT EYVRTDQFGNADLLSRLIHTHAKPDEDSVIACVTLEAEVKSLVTSAIQHVP FT VNFIDISRETIADRLLSKVLQYVQHGWPNNAAYEGELSRFHDRKDSLTAID FT GCLLFRERVVIPKVLQRTCLKQVHLGHPGINRMKAMARSYFYWPSMDHDIV FT DWVNSCHSCQIAAKSPTHIKSTSWPEAPGPWYRIHVDYAGPLNGEWYLVIV FT DSFSKWPEVIPTSSTTTTATISILRNIFARFGNPVTLVSDNGPQFTSTDFE FT SFCRQNGVEHIRTAPYHPQSNGLAERFVDTFKCALRKMSADGLTLREALDT FT FLQTYRATPNAQLNNSKTPAEIMLGRRPRTLLELLLPPRQASQPSAGLRVR FT ELNPGESIYAKEYRLNDWKWVTATVVDRQGRYMYLVRTAEGKTFRRHINQL FT RRRISDGSFKDTTRAHSPLPLDLLFDAWHFNPSPVPASSGQLEADKPSSPP FT APVSSCVVPSSNFNIESSRPTLIKCPRRRATSRRTDQTIDSVHEPRRSSRN FT RRPPSRFDVYRTF" XX SQ Sequence 4715 BP; 1280 A; 1152 C; 1170 G; 1113 T; 0 other; gtggcgacga gaatttgttt taaaagttta gtgtctgtaa agtaaaagtc tgtgttttaa 60 cgtgtaagag tgtttaaaaa aaaaaaaact ccagcatcac gtcgaagcgg cgcaggatga 120 acagcggaga tggagcgatg gaagtcccga tggatgaaga ccagcgacga tcatcgttag 180 gccgtatcga ttaccgtgca atgacggccc aacaaaccgg tgaggttcct tccgtgtacg 240 acccgccggg ccatgttcaa cctggatcgc gcggctacgt gagtcagcaa gggcagttaa 300 caccgaggcc agtgcagcta ggaccatcgc catcgcaacc agcgccatca cagtcagcac 360 cattgccatc cgtacaaaat ggccagatgg gtcaacaacc tctggataac gccgtcctgc 420 agcaaacctt gcacttgctg caacagcaat tgaagcaaca gcaacagtta atttcgcaaa 480 tgttgcaaca gcagcaatcc gcgccgccag cccagctaca acccgcacag cagtaccagc 540 ccgccggtcc tagtaaccca gaacttatac tcgatgcttt ggccaatagc atagcagagt 600 tccggtatga agctgaatct ggtgttacgt tcgaagcttg gttcacacgc tacgaggacc 660 tgtttgccaa agacgcttcg cggctaggcg atgaggctaa ggtaaggctt ttagtgcgta 720 agttgggaac accagaacac gcccgttata taagctatat tttaccccgc tcaccacgtg 780 atttatcgtt tgaggaaacc gtcgataagc tgacagcact ttttgggtgt agagaatctc 840 tccttagtaa gcgctacaga tgcctccaga tttgtaaaaa gcgcacggaa gatttgatcg 900 cgttctcttg tcgggtaaac cgagcatgcg tcgagttcca gtttgcgagc atgaacgaag 960 agaccttcaa gtgcctaatg ctggtgtgtg ggctcaaaga tgaggctgac aatgacctgc 1020 gaactagact ccttgcgcgc atagaggagc ggaacgacgt tacgcttgaa caattatccg 
1080 cagagtgtca gcgtattaca agtgtgaaag gagacagcgc tatgattgcc ggagagacga 1140 gtgaacgtgt tttcgcagta cacagcggag agaagagatc gcacgagaaa gcagcgcagc 1200 aaactaatta caagcggttc acgccgtacc gtaccaaacg accgttccgt gcaaagtatg 1260 ctgtgtgttc atcaacaagc aagccagcga agccctgttg gttgtgtggt gacatgcact 1320 gggtgcgtga gtgcacttac cgttcgcaca agtgtctcga ctgtgcaaga tatggtcatc 1380 gtgagggaca ctgcaacaca gcgagcagga agaagcggtt caatgttcgg caacggaata 1440 tcaacactcg tgtagtaacg gtcaacgtcc gaagcatacg agagcgccgc agattcgtgt 1500 ccatcgctct caacggaaca gctgttcgat tgcagctaga cacagcttcg gatatcagtg 1560 tcatcgaccg ccgtacgtgg agaaaaattg gcagcccgcc tttaacaccg tcatccgtca 1620 cagctaaaac tgcatcggga gctacactgg tattggatgg tgaattcagc tgtgctgtta 1680 gtgttggtag ccagacgagg caggcgacac tcagcgtatg cggagcagca aacctactac 1740 tactgggagc cgatttgatt gatgttttct ccctctggtc ggtgccgatg gatgcgttct 1800 gcaatcacgt tacagtagca ggacagcaat cgttccagca gctttttccc aaggtgttta 1860 cgggaacagg actctgtaca aaggcgagta taaaatttac cttgcgagat aatgttcgtc 1920 ctgtttttag acctagccgc ccagtagctt acgcgatgga ggaaaccgtg agtcgtgagc 1980 tcgaccgtct tgaggagttg aatgtcatca ctcctgtaac tactgcagaa tgggcagctc 2040 cagtggtcgt cgtgcgcaaa gccaacggac ttgttcgtat ttgcggcgac tattcgacgg 2100 gacttaatgc tgcgttattc ccccacgact acccgttacc tgtaccagag gacatttttg 2160 caaggctggc aaattgcaag gtctttagca aaatagacct gtcagatgca tttttgcaag 2220 tggaaattga tccagaatac cgtcatttct tgaccataaa tacgcatcga gggctctata 2280 cgtacaaccg acttccacct ggcattaaga tagctcctac agcctttcag caattgatgg 2340 acataatgct atccggtatc caaggcgttt cagtgtactt ggatgacatc atcatcggag 2400 gtccatccga agcggagcac gacgcaaccg tagtagaagt tttgaatcga attcagaact 2460 acgggttcac actgcgagcg gaaaaatgcc acttcagggt taaccaaatt aaatatttgg 2520 gccacattat cgacagccat ggattacggc ctgatgcgca gaaggttgaa gctattcgca 2580 agttaccgga gccaaccaat ttaaccgaag taagatcgtt tttaggggcc ataaactatt 2640 acggtcggtt tgtacccaac atgagaaatt tgcgctatcc attggacgat ttgttaaagg 2700 ccggagtcga atttcgctgg acgtcggaat gcagaaaagc ttttgagagc ttcaagacga 2760 
tattggcttc cgatttacta ctaacccatt acgatcctaa acaagcaatc atcgtatctg 2820 ccgacgcatc atcagttggc ctcggggcga caataagcca taagtatcct gacggatcta 2880 ttcgggtggt ccagcatgcc tcgcgcgccc tcacagcggc agaaaaagca tacagccaga 2940 ttgatcgcga gggcttggcc ataatttttg cgataaaaaa atttcataag atgatattcg 3000 gcagaaagtt tttattacaa acagatcatc ggccgctttt gcgaatattt gggtcgaaaa 3060 aagggatccc caattttaca gccaatcggt tgcaacgttt cgctcttacc ctactggcat 3120 atgattttga aatagaatac gttcgcaccg atcagtttgg caacgctgac cttctttccc 3180 gcctgataca cacacacgcc aagccagacg aagattcagt gatagcatgt gttacattag 3240 aggcggaagt aaaatcattg gtaactagcg ctattcagca tgttccagtt aattttatcg 3300 acataagtag agaaaccata gccgacagat tattatccaa agtccttcaa tacgttcaac 3360 atggatggcc aaataatgca gcttatgaag gagaactgtc gcgcttccat gacagaaaag 3420 attcgcttac agcaatcgac gggtgtctcc tatttagaga gagggtggta atccccaaag 3480 tattgcaacg cacctgcttg aaacaggttc atctcgggca cccaggaata aataggatga 3540 aagccatggc cagaagttat ttttattggc catcaatgga ccatgacatt gtcgattggg 3600 tgaactcgtg tcattcatgc cagatagcag ctaagtctcc aactcacatc aagtcgacca 3660 gctggccaga agcaccaggt ccgtggtacc gcattcatgt cgactacgcg ggaccactca 3720 acggggaatg gtacctggtg attgtcgatt cgttctccaa gtggccagaa gtgattccaa 3780 cgagcagtac aacaacaacg gcaacgataa gtatactgcg taacatattt gctcgttttg 3840 gcaacccagt aacacttgtg tcggataatg gtccacaatt cactagcaca gattttgaat 3900 ctttttgtag gcaaaatggg gttgagcaca tcaggacagc gccgtatcac ccgcaatcca 3960 acgggctggc tgaaaggttt gtcgacacct ttaagtgcgc tttgaggaag atgtcagccg 4020 atggtctcac gttacgagaa gctctggaca ccttcttgca gacctatcgg gccaccccga 4080 acgctcagct gaataattcg aaaactcctg ccgagattat gctaggcagg cgtcccagga 4140 ccttgctaga gctgttattg ccgccacgtc aagcttcaca accgagtgct ggtctacgag 4200 ttcgtgagct gaatcccggt gagtcgatct acgctaagga gtaccgcctg aacgattgga 4260 aatgggtcac tgctacggtg gtcgatcggc aaggcagata catgtacctg gttcggactg 4320 ctgaggggaa gacgttccgt cgacacatta atcagctacg tcggcgtatc agcgacggtt 4380 cgtttaagga cacaacacgg gcccattcac cactaccttt ggatctgtta ttcgatgctt 4440 ggcactttaa 
tccgtcacca gtacctgcat catcgggcca gctagaagct gataaaccgt 4500 cctcaccgcc agcgcctgtg tcgtcctgtg ttgtaccatc tagtaatttt aatattgaga 4560 gcagccgtcc aacactcatc aaatgtccac gaagaagagc tacatcccgc agaacggatc 4620 aaacaatcga ttcggtacat gaaccacgtc gctcttctcg caacagaaga ccgcccagtc 4680 ggttcgatgt gtaccgaacg ttttaaggag ggaga 4715 // ID GYPSY48-I_AG repbase; DNA; ANG; 5426 BP. XX AC . XX DT 16-APR-2004 (Rel. 9.03, Created) DT 16-APR-2004 (Rel. 9.03, Last updated, Version 1) XX DE GYPSY48-I_AG is an internal portion of retrotransposon GYPSY48_AG DE - a consensus sequence. XX KW Gypsy; LTR Retrotransposon; Transposable Element; 4-bp TSD gag; KW AP protease; CsRn1 lineage; GYPSY48-I_AG; GYPSY48-LTR_AG; KW Gypsy clade; RNase-H; reverse transcriptase; KW integrase GYPSY48_AG. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-5426 RA Tubio M.J., Costas C.J. and Naveira F.H.; RT "GYPSY48_AG, a member of the CsRn1 lineage of the Ty3/gypsy group RT of LTR retrotransposons in Anopheles gambiae."; RL Repbase Reports 4(3), 88-88 (2004). XX DR [1] (Consensus) XX CC GYPSY48_AG is a family of Gypsy-like LTR retrotransposons CC that, according to the amino acid sequence of its Reverse CC Transcriptase, CC RNase and Integrase is CC phylogenetically grouped with representatives of the CsRn1 CC lineage of other organisms. CC GYPSY49_AG, GYPSY50_AG, GYPSY51_AG, GYPSY52_AG and GYPSY53_AG CC are other members of this same lineage in Anopheles gambiae. CC The GYPSY48-I_AG consensus was reconstructed after multiple CC alignment of 5 copies. CC The consensus encodes the 297-aa GYPSY48_AG1p gag-like CC polyprotein (pos. 1402-2292) and the 1068-aa GYPSY48_AG2p CC pol-like polyprotein (pos. 2145-5348). CC The sequence of the LTRs flanking GYPSY48-I_AG is deposited as CC GYPSY48-LTR_AG. 
XX FH Key Location/Qualifiers FT CDS 1402..2292 FT /product="GYPSY48_AG1p" FT /translation="MSKNESDDKNAGPSAIPVRAAGPSSVAGTMEQENSHL FT AIEKISTKLPCFWEDAPSAWFVAAEAEFDISNVTRERSKYSHVLSVLPKNV FT LPKIMDIITMPLPNNPYSHLKSQLLERLNASEEQRITQLLYHVQMGDRSPS FT DFYRHMVQLAGDSANLTSELIRKLWLSRLPKTMEVALIAIGNREINELIKI FT ADRVWEATQAGAVSSIAGVSSSESSGRNPSNETAILRQEISELRSMLRRIS FT FQTGNNRGRQRERSQSQKRARTPSQVRKHPMCYFHFRFGEQARRCQKPCNF FT SQQKN" FT CDS 2145..5348 FT /product="GYPSY48_AG2p" FT /translation="RKTARKISEPKTCKNTITGSEAPHVLFSFSVWRTSTT FT VPKTLQFFAAKKLICSTAAAGNPRAILGSSRLFIDDEKTTFTFLVDTGADV FT SVIPYNLFGKVIKQAEHQLVAANGSPIATYGTKLLQLSLGLRRVFSHCFTL FT AEVNRPIIGADFLAKHDLWVDLKRKRLIDHLTGQETIATIALVDTPTPKNW FT MIESSIFGEIFKKYPTLVDPPDYKKPVRHSIVHNIETKGSLPFSKPRRLDP FT KKHKIAQAEFQHMLDLGICRVSSSSASSPLHMVPKPDSDWRPCGDYRRLNA FT ITIPDRYPIPHVQDFTMNLDGCKFFSKLDIIRAYHIIPVAEEDIHKTAITT FT PFGLFEFLRMPFGLRNASQCFQRLMNNIFHDLDFVFVYIDDILIASSTRES FT HLEHLHIVLKRLTDHDIKLKPSKCILGVNQIEFLSHTISAKGITPSPEKIM FT AIRDFPSPCSVKQLQRVLGMVNYYHRFIPNVSDKLRVLHQLVKEHNKNRKI FT KFEWSDACEQALNQIKDELSNATMLVHPKSNATYTLTTDASSVAVGAVLQQ FT TSEGILEPLAFFSKTMTPTEQRYSTFDRELLAITLGIKHFRHFLEGRSFTV FT YTDHKPLIHALNSKTEKSPRQCRQLDFISQFTTDIKYITGESNVVADALSR FT IGETNEITNFTYEAIAKEQQKDEQLKQLFEDSNSQANSKFKLNQLILDKNK FT LVFESSTGQNRLYVPQSLRKKTFANLHNLSHPGIRASRKLLSERFFWPSMN FT KDVGNWTRSCVGCQKSKIGRHTRSPLEALSVPKGRFEHVHMDIVGPLPPSE FT GMSYLLTIVDRFTRWPECYPMPDMSAKTLAETFVRNYIPRFGCPSTITTDR FT GRQFQSKLFEELTKLLGTHHIKTTAYHPQSNGMVERFHRHLKGALKACEDN FT THWISRLPLILLGIRVAYRDALKGAPAEMVYGESLRLPSEVFLPSAAKSVE FT DVPEFVQDFKKALRHIDPCPPKLNDTTPVYLPKHLQTCKHVFVRVDKVRTG FT LQAPYEGPFEVVRRCRKFFTVKIKEKNESISIDRLKPAFEMEGVSKTGKTT FT NSQKNVRFAKN" XX SQ Sequence 5426 BP; 1710 A; 1200 C; 1127 G; 1389 T; 0 other; aattggtgac cccgacgtga tctgaagtgt gtcgtgaaaa gtgtcgtgat tggtgcaaca 60 agtgctaagt gcatggctaa tggatagagg aaacgacaac cgtcgtcgct aacgcacatc 120 acggtcccac ctcttcagct gttcacgtcc ctgttatcat cgtgctgcag cgcaggactg 180 gagtataacg cacacggaca catacactgc gtgtcggaac attaaggaga gtgcaacgcg 240 ttaccgcgtg gacagtgttt 
taccgctact acaaaacctg tcaatccaac gtacggggtg 300 aatgccatcg tgaaatccgc gtttcggacc atgcggtagc ccacattgtg agtgtcgcaa 360 aggaaaccac tcacgggttt tttttttctc gttttcggat gggaagaagc cagccctccg 420 aaaactgacg agtgtcacac gggaccacgc caaagcagca attttcccgt gtgaagaacg 480 aggcatagga agcaacgcga tcttctattt ggcaaaccac gtgacgcagc catcgcgatc 540 ttctatttgg caaaccacgt gaagcagcca tcgcgatcct ctatttggca aaccacgtga 600 cgcagccatc gcgattttct acgtggcaaa cctcgtgaag cagccatcgc gatcctgtat 660 ggtggaaacc acatgcagca gccatctcga tcctctatgt ggaaaaccac gttaacccat 720 cgcgcagcag aaatcgagct gtgtaaaaca acgggaagca gcataagatg acgtgccaat 780 cattcgacgg aacatcatag tgaacgaacg gacgcagatt tcaacgtgaa cgaacgtgcg 840 gaacatcgtg gatgacgggt gcagcttttt acggtgcaac atcgtagatg acgtgtgcag 900 tttttcgtcg gcgcagccat cgcaataacg tgtcgagatt ttcatcggcg cagcatcgtc 960 tttgctgcat gaaccacaac gccgcagccg ctatcgagaa caacgagctg agaacctcgt 1020 ggtgacctac cggctgctga cgctatatgc cacgatatcg gtgagctcgt tcctttattt 1080 tattttatct tttctcttat tataacgtat tttcccgcgt tccatcatta tattatatcg 1140 tatatactat tgcgttaaat attgacatgc attatatttg aatttgccga cctatttata 1200 tatatgcata cttatacata tttcacaatc gtttcgtgat acgtcgtaaa catttcaaac 1260 aaaaattttt ttttcatcat aatacgcaat tttcattaaa aaaaaaaaaa aaaaaaaaaa 1320 aattcaactc aatagtgtaa gcattttaga gagaattagg tacatttttt tattagatag 1380 ggaatattct gtgaacccaa aatgtcaaag aacgaaagtg atgataaaaa tgctggtccc 1440 agtgcaattc cggtgcgagc cgcggggccg tcatcagtgg cgggaacgat ggaacaagaa 1500 aactctcact tggccataga aaaaatttcc acaaaattgc cttgtttttg ggaagacgct 1560 ccaagcgcat ggtttgtggc agcagaagct gagttcgata tttcaaacgt tacaagggaa 1620 cgctcgaaat attctcacgt tctaagcgtg ctcccgaaaa acgtcttgcc gaaaataatg 1680 gacataataa ctatgccatt accaaacaac ccgtactcac atttaaaaag tcaacttcta 1740 gaaaggctta acgcgagtga agaacaacga atcactcagt tgttgtatca cgtacaaatg 1800 ggtgatcgta gtccgtctga tttttacaga cacatggttc agctggcagg tgattctgca 1860 aacctcactt ccgagcttat tcgtaagctt tggttatcaa gattgccaaa aacaatggaa 1920 gtagcgctca ttgccatcgg aaaccgagaa atcaatgagc 
taattaaaat cgcagacagg 1980 gtgtgggaag caacccaagc cggagcggta tcatcaatcg caggcgtttc ttctagtgaa 2040 agctcgggaa gaaatccctc gaatgaaact gcaatattgc gacaggaaat ttctgaactg 2100 cgtagtatgc taaggagaat ttccttccaa acaggaaata atagaggaag acagcgagaa 2160 agatctcaga gccaaaaacg tgcaagaaca ccatcacagg ttcggaagca ccccatgtgt 2220 tattttcatt ttcggtttgg cgaacaagca cgacggtgcc aaaaaccctg caatttttcg 2280 cagcaaaaaa actaatttgc tcaacggcag cggcgggcaa cccgagagca atattgggct 2340 cctctcgtct tttcattgac gatgaaaaaa caacatttac cttccttgta gatacaggag 2400 ccgacgtgtc cgtcataccg tacaatcttt ttggaaaagt catcaaacaa gctgaacacc 2460 aactagtagc agctaatggt agccctattg ctacttacgg aaccaagttg ctacagttaa 2520 gtcttggact taggagagtg ttttcgcact gtttcacact tgctgaagtg aatcgaccaa 2580 tcattggggc tgatttttta gcgaagcacg atttgtgggt cgacttaaaa agaaagaggt 2640 tgatcgatca tcttacagga caagaaacga tcgcaaccat cgctttagtg gatacaccta 2700 cccccaaaaa ctggatgatt gaatcgagca tatttggtga aatttttaaa aaatatccaa 2760 ctttagtgga tccaccagat tataaaaagc cagttaggca ttcgatagta cataacatcg 2820 aaaccaaagg aagcttacca ttttctaaac cacgacgtct ggatccgaaa aagcacaaga 2880 ttgctcaagc agaatttcaa cacatgttgg atctgggaat ctgcagagtg tcctcgtcgt 2940 cagcttcgtc accgctacac atggtaccca aacctgattc agattggaga ccttgcggcg 3000 attataggcg tttaaatgcc atcacaatac ctgatcgtta tccgatcccg catgtgcagg 3060 attttacgat gaacttggat ggttgcaaat ttttttctaa gttggatatc attcgagcgt 3120 accacatcat acctgttgcc gaggaagaca tccacaaaac ggcgataacc acgccgttcg 3180 ggttatttga atttctaagg atgccgtttg gcctcagaaa tgcaagccag tgttttcaac 3240 gactcatgaa taatattttt catgatttag attttgtgtt tgtatacata gatgacattc 3300 tgatagctag ttcgacgcgt gaatcccatt tagaacactt gcacattgtt ctaaaacgtt 3360 taacggatca cgatataaaa cttaagccat caaaatgcat tcttggtgtg aatcaaatag 3420 aattcctcag ccacaccatt tcggcgaaag ggataacacc atcgccagaa aaaataatgg 3480 ccattcgaga ttttccttca ccatgttcgg tgaaacagct tcaaagagtt ttgggaatgg 3540 tgaattatta tcaccgcttc attcctaacg tttcagataa gctaagagtg cttcaccagt 3600 tagtaaaaga gcacaacaaa aataggaaaa tcaaattcga atggtctgac 
gcttgtgaac 3660 aagcactaaa tcaaataaag gatgaacttt caaacgctac catgctcgtg catccgaaaa 3720 gtaatgctac ttatacacta acaacagatg cttctagtgt tgctgtaggg gcagtattac 3780 aacaaactag cgaaggcatt ttggaacctc ttgcgttttt ttctaaaacg atgacgccca 3840 cggaacagag gtattccaca ttcgatagag aattgctcgc tattactctt gggataaaac 3900 atttccgaca ttttctagaa ggtagatcgt tcacggttta tacagatcat aaaccattga 3960 tccatgccct aaattcaaaa acagaaaaat ccccaagaca atgccgacaa ctagacttta 4020 tttcacaatt cacaacggat ataaagtaca ttactggcga gagcaacgtt gtggcagatg 4080 cgttatctcg aataggagaa accaacgaaa tcacaaactt cacatatgaa gcaattgcaa 4140 aggaacaaca aaaagacgaa cagctaaaac aactatttga agactcaaat agtcaggcaa 4200 actccaaatt caaattaaat caattaatat tggataaaaa caaactagta ttcgagtcgt 4260 cgactggaca gaataggttg tatgttcctc aatctctaag aaaaaaaact ttcgccaacc 4320 tacacaacct atcccatccg ggtattcgcg catcacgcaa gctgttatcc gaaagatttt 4380 tctggccgtc aatgaacaag gacgtgggaa actggactcg ttcgtgtgtt ggatgccaaa 4440 aatctaaaat tggtcgtcac actcgctcgc ctttggaagc attatccgtt ccaaaaggga 4500 gatttgaaca tgtgcacatg gacatcgttg gtccattgcc accatccgaa ggaatgagct 4560 acttactgac cattgttgac cgttttacac gttggccgga atgctatccg atgcctgata 4620 tgtcggctaa aacgctagct gaaacatttg taaggaatta cattccaagg tttggttgtc 4680 catctacaat cacaaccgat agaggcagac agttccagtc aaaactgttt gaagaactaa 4740 caaagttgtt aggaacacat cacattaaaa cgactgcgta tcaccctcaa tccaatggaa 4800 tggttgaaag attccatcga cacttaaaag gagcgttaaa agcctgcgaa gacaatacgc 4860 attggatttc tcgattgcca ctgattctgc tgggcattcg agtagcatac agagacgcat 4920 taaaaggcgc tccagcagaa atggtatatg gagaaagttt gcgacttcct tccgaagtat 4980 ttttaccctc tgctgcaaaa tccgtcgagg acgtacccga atttgttcag gattttaaaa 5040 aagcgctaag gcacattgac ccatgcccac caaaactaaa tgatacaaca ccggtttact 5100 taccaaaaca ccttcaaacc tgtaaacatg ttttcgtacg ggtagataaa gtgagaacgg 5160 gtttacaagc accgtatgaa ggaccttttg aggtagttcg acggtgcaga aaatttttca 5220 cggttaaaat caaagaaaaa aatgaatcta tttcgattga ccgcctcaaa cctgctttcg 5280 aaatggaggg tgtctccaaa acaggcaaaa ccacaaatag ccaaaagaat gttcgctttg 
5340 cgaaaaacta aatgaaatct caatttaaga gacaaacact ctcacaattc tttttctatc 5400 gcgactccgt cactggtggg gggttc 5426 // ID GYPSY44-LTR_AG repbase; DNA; ANG; 164 BP. XX AC . XX DT 16-APR-2004 (Rel. 9.03, Created) DT 16-APR-2004 (Rel. 9.03, Last updated, Version 1) XX DE GYPSY44-LTR_AG is an LTR of retrotransposon GYPSY44_AG - a DE consensus sequence. XX KW Gypsy; LTR Retrotransposon; Transposable Element; KW 4-bp TSD GYPSY44_AG; GYPSY lineage; GYPSY44-I_AG; GYPSY44-LTR_AG; KW Gypsy clade. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-164 RA Tubio M.J., Costas C.J. and Naveira F.H.; RT "GYPSY44_AG, a member of the Gypsy lineage of the Ty3/gypsy group RT of LTR retrotransposons in Anopheles gambiae."; RL Repbase Reports 4(3), 81-81 (2004). XX DR [1] (Consensus) XX CC GYPSY44-LTR is a long terminal repeat of GYPSY44_AG (its internal CC portion is deposited as GYPSY44-I_AG). XX SQ Sequence 164 BP; 60 A; 28 C; 26 G; 50 T; 0 other; agttaacacc aatactattt agcaatacta ttttgcttat atgcatgcac ataggattga 60 taggttaaat ctaagcgagc gtttcgaata aagattaatc tgtaagtaga atccaacgag 120 aacagacgtt cttatttgtg aatccgaaac acatacactt aact 164 // ID HATN6_AG repbase; DNA; ANG; 315 BP. XX AC . XX DT 16-JUN-2003 (Rel. 8.05, Created) DT 16-JUN-2003 (Rel. 8.05, Last updated, Version 1) XX DE HATN6_AG is a hAT-like nonautonomous DNA transposon - a consensus DE sequence. XX KW hAT; DNA transposon; Transposable Element; Nonautonomous; KW 8-bp TSD; HATN6_AG; nonautonomous DNA transposon; TC1N-2_AG; KW hAT superfamily. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-315 RA Kapitonov V.V. 
and Jurka J.; RT "HATN6_AG: a family of nonautonomous hAT-like DNA transposons RT from African malaria mosquito."; RL Repbase Reports 3(5), 91-91 (2003). XX DR [1] (Consensus) XX CC HATN6_AG is a family of nonautonomous DNA transposons that CC belongs CC putatively to the hAT superfamily. HATN6_AG copies are ~2% CC divergent CC from the consensus sequence. There are several HATN6_AG-like CC subfamilies present in the genome. CC HATN6_AG has imperfect 36-bp terminal inverted repeats (3 CC mismatches CC and one indel). The genome harbors ~30 HATN6_AG elements. CC A small subset of HATN6_AG elements (~10 copies) was transposed CC as a CC composite TC1N-2_AG-like transposon. XX SQ Sequence 315 BP; 94 A; 62 C; 63 G; 96 T; 0 other; caggggtctc caaactacgg cccgcgggcc gcatgcggcc ctcaagagct taaaatgcgg 60 ccccgatgac tttgccagat ttagaataaa tattaggtta ttttgacttt ttcaaaagaa 120 attgtatttt tttttacatt tcttagacaa aaatcaagct ttagtaacac gaatctgttc 180 aagtatcgtt agtattttgt tataaaaatg agatttaaac gttcactcat caattatagg 240 taacatcaaa tggccctcaa catagtgatt tttgaagcaa tgcggcccgc gggctgaaaa 300 gtttggaggc ccctg 315 // ID GYPSY38-LTR_AG repbase; DNA; ANG; 229 BP. XX AC . XX DT 16-APR-2004 (Rel. 9.03, Created) DT 16-APR-2004 (Rel. 9.03, Last updated, Version 1) XX DE GYPSY38-LTR_AG is an LTR of retrotransposon GYPSY38_AG - a DE consensus sequence. XX KW Gypsy; LTR Retrotransposon; Transposable Element; KW 4-bp TSD GYPSY38_AG; GYPSY38-I_AG; GYPSY38-LTR_AG; Gypsy clade; KW MDG3 lineage. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-229 RA Tubio M.J., Costas C.J. and Naveira F.H.; RT "GYPSY38_AG, a member of the MDG3 lineage of the Ty3/gypsy group RT of LTR retrotransposons in Anopheles gambiae."; RL Repbase Reports 4(3), 69-69 (2004). 
XX DR [1] (Consensus) XX CC GYPSY38-LTR is a long terminal repeat of GYPSY38_AG (its internal CC portion is deposited as GYPSY38-I_AG). XX SQ Sequence 229 BP; 71 A; 44 C; 60 G; 54 T; 0 other; tgtaggaaat aggagacagt tgacagttgt cagacaaggg atgtaaggcg aatgtcaagg 60 aatgaattta ggatcaccct gtgctggtgt caatcgaact gacgacgcag taggtttagc 120 tagagcgaaa cattcacgcg ggagaagcca agcgaaccag acgaaagtga ataaagtgga 180 ttgtaaaacc atcgctttcc gcgtatcctc ttttcttcat cgttctaca 229 // ID GYPSY42-I_AG repbase; DNA; ANG; 5015 BP. XX AC . XX DT 16-APR-2004 (Rel. 9.03, Created) DT 16-APR-2004 (Rel. 9.03, Last updated, Version 1) XX DE GYPSY42-I_AG is an internal portion of retrotransposon GYPSY42_AG DE - a consensus sequence. XX KW Gypsy; LTR Retrotransposon; Transposable Element; 4-bp TSD gag; KW AP protease; GYPSY lineage; GYPSY42-I_AG; GYPSY42-LTR_AG; KW Gypsy clade; RNase-H; reverse transcriptase; KW integrase GYPSY42_AG. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-5015 RA Tubio M.J., Costas C.J. and Naveira F.H.; RT "GYPSY42_AG, a member of the Gypsy lineage of the Ty3/gypsy group RT of LTR retrotransposons in Anopheles gambiae."; RL Repbase Reports 4(3), 76-76 (2004). XX DR [1] (Consensus) XX CC GYPSY42_AG is a family of Gypsy-like LTR retrotransposons CC that, according to the amino acid sequence of its Reverse CC Transcriptase, CC RNase and Integrase is CC phylogenetically grouped with representatives of the GYPSY CC lineage of other organisms. CC GYPSY39_AG, GYPSY40_AG, GYPSY41_AG, GYPSY43_AG, GYPSY44_AG, CC GYPSY45_AG, CC GYPSY46_AG and GYPSY47_AG are other CC members of this same lineage in Anopheles gambiae. CC The GYPSY42-I_AG consensus was reconstructed after multiple CC alignment of 5 copies. CC The consensus encodes the 373-aa GYPSY42_AG1p gag-like CC polyprotein (pos. 
481-1599) and the 1138-aa GYPSY42_AG2p CC pol-like polyprotein (pos. 1560-4973). CC The sequence of the LTRs flanking GYPSY42-I_AG is deposited as CC GYPSY42-LTR_AG. XX FH Key Location/Qualifiers FT CDS 481..1599 FT /product="GYPSY42_AG1p" FT /translation="MHVNSSANVKPRIELSVLFKNLSASEESSDSELSGEY FT SNIPLQDSSLPNLDQLNINTIEMEPTEQLKIMNQTIADLQEKIAVLTIQQS FT QPSIDVASFFRIPDPIKSLPSFDGNRKQLSTWLTTTEETLNLFKDRVTGEV FT FKMYLTAVINKIEGKARDILCLAGSINDFESLKEILFDAFGDRQELSTYKC FT KLWQNKMVDGMTIHKYYQKTKEIIQCIKTIAKQTQAYKDNWAVINQFIDED FT GLAAFISGLKGMYFGHIQAARPKDIEEAYAFLCKFKSHEITADCMVQKPPN FT QQKNSFFQNNRTATTQKSHFAQNQNINREPYPQPMEVDNSMRSRLTLNKRT FT INNFEVASQDDNSANCEQNFHLDSPSTSIT" FT CDS 1560..4973 FT /product="GYPSY42_AG2p" FT /translation="TKFSLGFAINQHNIKYCSNFLPYIKVKDSKTNRQIRM FT LIDTGANKNIIRPGIIKNTIKTEQVSIKNIFGTKIIQEKAICKLLGPNIPA FT QTYYIMEFHDFFDGIIGTEFLSQTNTVIDFKNNVVVINETKIWFEKLFSSK FT KFYHHTISIETDQNGDWCVPTFENLSENIIIEPGLYSSIDNKTFVKVLSTS FT KTTPHIPKLHFTVNNFETLTPIPSACNDIPTKKIIETLIRTDHLSFYEKSK FT LFETVIKNHNVLLKLNEKLTSTTIIKHKINTTDDLPVYTKTYRYPHVYKQD FT VETQIRDMLDSGIIQPSTSPYSSPIWVVQKKMDASGKKKVRVVIDYRKLND FT KTINDKFPMPEIEDILDSLGKSQYFTILDLKSGFHQIEMHPEHQEKTAFST FT SHGHFEFTRMPFGLKNAPATFQRAMNNILAELIGKICYVYLDDIVIVGTNL FT EDHLKNVSTVLGRLAQFNLKIQLDKCEFLKRETEFLGHIISPDGIRPNPEK FT VKKILDWPIPSNEKQIRQFLGLSGYYRRFIKDYSKITKHLTKYLKKDQTIN FT INDPEYIDSFSKLKETIASDQILAYPDFNLPFVLTTDASDFAVGAVLSQIQ FT NKVERPIAFASRTLNKAEINYSTIEKEALAIIWAIRKYKAYLYGNEFKLFT FT DHKPLTFIKTSIKNNRILNWRLELENYQYSVEYKEGRANVVADALSRKTEN FT TNEINSTNTTILATNHSGSTSDDFYIKSSERPLNYYRNQIVFELVAQHEDL FT IEIPFPNYKRTIIRRTDYDESKITDILRKFHNGKQTAILASQNLIQIIQNS FT YKNHFSCSGYIVMTHSQVKDVASVEEQNQLITREHERAHRGIHEIENQMKR FT SYFFPKMHDRIKSAINACPVCNMHKYERKPYNIKISPRAATDKPMERGHMD FT IFSINSKSFLSLADSFSKFAQMIPIDTKNLVDVKNALAKYFSTFGIPLQII FT TDHETTFRSIQLKNFLCNLGCSLTYASSSESNGQVEKTHSTIIEIYNTNKH FT KFVDMDTEALIPIAVSLYNATVHSATGYTPNEILFNQTNEMRPITIHEQAE FT KIFANAKTNIERSRQNQMKGNIRKETPPLIREGQEVYVKPNIRKKLDPRAR FT NTTVNNVTDRTFENSRHIKRHKNKIHRIRS" XX SQ Sequence 5015 BP; 1884 A; 947 C; 827 G; 
1357 T; 0 other; ggcgctgtag acaatccaga gtgcaaaaat agtggcaaat aacgaattaa tcctcgttac 60 gtaagtggtt cccgcattca gcatcaacag ctataggatt acatccgctg ttcgatttga 120 gggccccccg gcacacacct ggtaccggac catcggattt ttttttatat ccaaagcacg 180 tgagtacgtt tcatcttttc atagtgaatg tgtattcggt tagtttcaag cactaccaga 240 taggcttgaa ataactacag tgtatttgtg tccatgtgag tttgcgatcg attttttatt 300 tatcgtcgta aactatctgt attgatacta ccagataggt caacagatta tcgaagaaga 360 atcgtcctcg tgttgtccgc gagtgaactt attgttagta gtgtgcgttg tgtgtattgt 420 gcgaatagca agagcacgag gcatcggttg ccgctgctta taacaattgt atcgacacgc 480 atgcatgtga acagttcagc taacgtaaag cctcgaatcg agcttagcgt tttgttcaaa 540 aatctatcgg cttcagaaga gagttcagat tcggaattat ccggagaata ttcgaatata 600 cctttacaag acagttcact tcctaattta gaccaactca atattaacac tatagaaatg 660 gaaccaacag agcagcttaa aattatgaat caaacgatag cagatttgca agaaaaaatt 720 gcagttttga caatccagca aagccaacca tctatcgatg tagccagttt ttttcgcatc 780 cccgacccta taaaatcatt gccatcattt gacgggaatc gcaaacaatt atccacatgg 840 ctcaccacta cagaagaaac acttaacctt ttcaaagata gggttactgg tgaggttttt 900 aagatgtact taacggcagt aataaataaa atcgaaggta aagctaggga tattctttgt 960 ttagccggaa gtatcaacga ttttgaatca ttaaaagaaa ttctttttga tgcatttggg 1020 gatcggcagg aattgtcgac ttataaatgt aaattgtggc aaaacaaaat ggtcgacgga 1080 atgacaatac ataagtacta ccaaaaaaca aaagaaataa ttcaatgcat taaaactatt 1140 gctaagcaga cccaagctta taaagataac tgggctgtaa ttaatcaatt tattgacgaa 1200 gatggattag ctgcttttat ctcaggattg aaaggaatgt attttggcca tatacaggca 1260 gcccgcccca aagacattga agaggcttat gcttttcttt gtaaatttaa atcacacgaa 1320 ataacagcag actgcatggt ccagaaacca cccaaccaac aaaaaaatag tttctttcaa 1380 aataatagaa cagctaccac tcaaaaaagc cattttgctc aaaatcaaaa catcaatagg 1440 gaaccttatc cacaacctat ggaagtggat aattcaatgc gaagcagact cacattgaat 1500 aaaagaacga ttaataattt tgaggttgct tcacaagatg ataactctgc aaattgtgaa 1560 caaaattttc acttggattc gccatcaacc agcataacat aaaatactgt tctaattttt 1620 taccttacat aaaagtgaag gacagcaaaa ccaacaggca aatccgtatg ctcatagata 1680 caggagcaaa 
taaaaatatt ataagacctg gtattatcaa aaacacaatt aaaacagaac 1740 aagtgagtat taaaaatatt tttggtacta aaattattca agaaaaagct atttgtaagc 1800 ttttaggccc aaacatacca gcacaaactt attacattat ggaatttcac gatttttttg 1860 acggcataat aggtactgaa tttttgagtc aaacaaacac cgtcatagat ttcaaaaaca 1920 atgtcgtagt tatcaacgaa acaaaaattt ggttcgaaaa attattttcc tcaaaaaaat 1980 tttaccacca cacaatatca attgaaaccg atcaaaatgg tgattggtgt gttccaactt 2040 ttgaaaattt atcagaaaat ataattatag aacctgggtt atattcttct atagataata 2100 aaactttcgt taaagtatta tctaccagta aaacaactcc ccacatacct aaattgcatt 2160 tcactgtaaa taatttcgaa acattaacgc ctattccatc ggcttgtaat gacattccaa 2220 caaaaaaaat aatcgaaaca ttaataagaa cagatcatct atcattttac gagaaatcga 2280 aattgttcga aactgtcatc aaaaaccata acgtcctttt gaaattaaat gaaaaactaa 2340 ctagtactac aataattaaa cataaaataa acactaccga tgacttgcct gtttatacta 2400 aaacataccg ctacccacat gtctataagc aagatgttga aacccaaatc agggatatgc 2460 tcgactctgg cattattcaa ccttcaacaa gtccttattc atcgccaatt tgggtggtgc 2520 aaaagaaaat ggatgcatct gggaaaaaga aagtacgggt tgtcatagac tacaggaaat 2580 taaatgacaa aacgataaac gataagtttc caatgccaga aattgaagat attcttgata 2640 gcttaggtaa atcacaatac ttcacgatat tagatctgaa atctgggttt caccagattg 2700 agatgcaccc tgagcatcaa gaaaaaactg ctttctcgac aagccatggc cattttgaat 2760 ttactagaat gccatttggg ctgaaaaatg cgccagctac atttcaacgt gccatgaata 2820 acattttagc tgagcttatt ggcaaaatct gttatgttta cttggacgat atcgtaattg 2880 taggaacaaa tttagaagat cacttgaaaa acgtctcaac agtactcgga agattggcac 2940 aatttaattt aaaaatacag ctagacaaat gcgaattcct aaaaagggag accgaatttc 3000 ttggtcatat aatatctcct gatggtataa gaccaaaccc cgaaaaagtg aaaaaaatat 3060 tggactggcc tataccttcc aatgaaaaac aaatccgaca atttcttgga ttatcaggat 3120 attatcgtcg tttcatcaaa gattattcaa aaattactaa gcatttaact aaatacttaa 3180 aaaaagatca gactattaat atcaatgacc cggagtatat cgattcattt tctaaattaa 3240 aagaaacaat agcatcagac caaatactag catatccgga tttcaattta cctttcgttc 3300 ttaccaccga cgcaagtgat tttgctgttg gtgcggtact ttcgcaaata caaaacaaag 3360 ttgaaaggcc cattgccttt 
gccagtagga cattaaataa ggcagaaatt aattactcta 3420 ctatagaaaa ggaggcacta gcaataattt gggcaatccg caaatacaaa gcttatcttt 3480 atggaaatga gttcaaactt ttcactgatc ataaacctct aacgtttatt aaaacctcta 3540 tcaaaaataa tagaatctta aattggagac tagaattaga aaactatcaa tattctgtag 3600 aatacaaaga gggaagagct aatgttgtag cagatgccct tagcagaaaa actgaaaata 3660 ctaacgaaat taattcaacc aatacaacaa ttttagctac aaatcattca ggaagtacgt 3720 cagacgactt ttatattaaa tctagcgaaa ggccattaaa ttactaccgc aatcaaatag 3780 tttttgaatt agttgctcaa catgaagatt taatagaaat accatttcca aattacaaaa 3840 gaaccattat acgtagaacc gactacgatg aatcaaagat aacagacata ttacgtaaat 3900 tccataacgg gaaacaaacg gccattcttg cttcacaaaa cctaatacaa ataatacaaa 3960 actcttataa aaatcatttt agctgcagcg gttacatcgt tatgacacac tcccaagtca 4020 aagatgtggc atccgttgaa gagcaaaacc agctaataac aagagaacac gagagagctc 4080 accgaggtat acacgaaatc gaaaatcaga tgaagaggtc atactttttc ccaaagatgc 4140 atgatcgaat caaatctgct atcaacgcat gccctgtgtg caacatgcac aaatacgaac 4200 gaaagccgta taacatcaag atctcaccga gggcagccac cgataaaccg atggagcgtg 4260 gtcatatgga tatattttcc ataaactcga aaagcttcct ttcattagca gattcgttct 4320 caaaattcgc gcaaatgatt cctatagata caaaaaacct ggtggacgta aagaatgcgt 4380 tagccaaata ttttagtact tttgggattc cattacaaat aattaccgat cacgaaacaa 4440 cgttcagatc tatccaactt aaaaatttct tatgcaattt aggttgctcg ctgacatacg 4500 catcatcatc agagagcaat ggccaagtag aaaaaacgca ctcaacaata atagaaattt 4560 ataatacaaa caaacataaa ttcgtggaca tggacacaga agcgcttata ccaatagcag 4620 tttcattata caatgcaact gtccactcag ctactgggta tacgccgaac gaaatattgt 4680 tcaatcaaac aaatgaaatg aggcccataa ctatacacga acaagcggaa aaaatatttg 4740 cgaatgcgaa aactaatatt gaacgatcta ggcaaaacca aatgaaagga aatattagaa 4800 aagaaactcc acctttgatt cgagaaggtc aagaagttta tgtaaaacct aacataagaa 4860 aaaaattaga ccctagagca agaaatacaa cagtaaacaa tgtaacagat agaacttttg 4920 agaattcgag acacataaag cgacataaaa ataagattca tcgtattagg tcatagtaat 4980 acgtccgggg acgtccgtct tttccccccg tgacg 5015 // ID Q repbase; DNA; ANG; 4526 BP. 
XX AC U03849; XX DT 03-DEC-2002 (Rel. 7.11, Created) DT 20-MAY-2005 (Rel. 10.06, Last updated, Version 2) XX DE Q is a non-LTR retrotransposon. XX KW CR1; Non-LTR Retrotransposon; Transposable Element; CR1 clade; KW ORF1; ORF2; Q; reverse transcriptase. XX NM Q. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-4526 RA Besansky J.N., Bedell A.J. and Mukabayire O.; RT "Q: a new retrotransposon from the mosquito Anopheles gambiae."; RL Insect Mol. Biol 3(1), 49-56 (1994). XX DR Genbank; U03849; Positions 1 4526. XX CC Q is a CR1-like non-LTR retrotransposon. It encodes a CC DNA/RNA-binding CC protein (Q-ORF1p) and the reverse transcriptase (Q-ORF2p). XX FH Key Location/Qualifiers FT CDS 1264..4413 FT /product="Q-ORF2p" FT /note="reverse transcriptase" FT /translation="MIPWKHPTQHLFLDRPKTTVLHLRTYTSLQSIAFFST FT RRSSAHCTQQTRQASCSDNAAQLTYRLPALSNRHNDNATAEYLSCYYQNVR FT GLRTKTKEFHLAVSEADFDLIALTETWLVDNIPSALLFNNNFSVYRCDRSL FT SGSSSRGGGVLLAVSNAYESIELPSRDRSLEYICVRVACNNAHLYVMVVYI FT PPQLSSEISTLRSLHDCISSFTLRLKPSDLLFVIGDFNQPSISWSTADPSS FT SPAYSSITHYEPTARSLANNTFVDGFKFNGLVQLNHINNSHGRMLDLLYAN FT NAAAKLCSPVFPSVVPLVPLDSYHPALDFNIRINSSTRRNSTTRQNSTTTA FT LYRYNFAKADYVKLNDMISMFNNSFHCSNFISLDEAVCSFSSFMLQAFVVC FT VPVQRPKPNPPWADRTLKRLKRVKRAAYRHYQTRRCQRSRSIYFDTHSLYC FT SYNRFRYRRYLSKIQRNLCRWPDSFWRFYNSKTKSTHTPKSITYKGATSAN FT TNEMCNLFADRFADCFSPAMNDTDTIDAALVNTPAGAINMSTPFIDSEIVL FT SALAQLKPSFAPGPDGIPSTVLKRCQTTVAPILAKLFNASLANGYFPKAWR FT KSWMVPIYKKGDRTDAINYRGITSLCAIAKVFELVIYKNLLHACRSYLSPY FT QHGFVPKKSTTTNLVEFVTYCTSQIDAGAQVDAIYTDLKAAFDSLPHAILL FT AKLDKLGIPSPLVQWLKSYLIHRTYIVKIDKHMSKEIVSSSGVPQGSNIGP FT LLFILFINDVTLALPPDSISLFADDAKIFAPINNTGDCTFLQDCIEIFCSW FT CKRNGLTICIEKCYCVSFSRCRSPVTGTYFMDGTAVNRQNHAKDLGVLLDS FT SLNFKQHIDDVVARGNQLLGVVIRTTNEFRNPMCIKAVYNCIVRSVLEYSC FT VVWSPTTASSIARIEAIQRKLTRYALRLLPWQDRNNLPPYAARCRLLGLEP FT 
LSVRRRNAQCSFIAGLLNGSIDSSPLLHRVDIYAPSRTLRSRETLRLAQPR FT SSAGRSDPMFRMSAVFNTVSDCFDFDISTQCFKERLRLLPWPQ*" FT CDS 210..1376 FT /product="Q-ORF1p" FT /note="DNA/RNA-binding protein" FT /translation="QCCAVITRDFAMAAICFSCAEPLEATGCIISCAYCDA FT TFHRGCCKLPPELIDAVLSNVDLHWSCIGCTNMLKNPRCRSVKEIGAQVGF FT QAALNSAVAAIGKLVEPIVAEVRSGFTLLQTASTPHNRNSDPRPATGRKRR FT RIIEDSASPGVNKIVNSRGNTLCAASSPNAYTNTTIAVQPAPTQPHELVGT FT DPLSSPLQAAPREPFTDRIWIRLSAYQRPSLWNKWSLSVKRRLATDDVIAY FT CLLRRGVSVDSMNWLSFKVRVPAILRDAALTPSTWPVGIGVREFFQSRQHD FT HQTSSPIATRNRFTTRTPATSTEHRYTTRTPTTTHRLAARTSTPPDPETTS FT SQQCHPPVNDTLEAPNSTLVSGPPQNHRASSPHLHQSTIDRFFLN*" XX SQ Sequence 4526 BP; 1117 A; 1250 C; 917 G; 1242 T; 0 other; acgtttgacg tttgacgttt aattgttcgg ccgcatttcc gtcgcgctcc gtgatttttc 60 aataattctc ttgtgttttc tgcgtcgcat cgcattgttg ccttttctac ggactattaa 120 actgtttaac atcatcgctt cgtcttggcg aattgtttag tcctgcatca caactgtggt 180 ttcctgctgt tttttcctgc tcggcttagc agtgttgtgc tgtgattacg cgtgatttcg 240 cgatggcggc tatttgcttc tcgtgtgctg aaccgctaga ggccaccggc tgcatcatca 300 gttgtgcata ttgtgacgct acgttccacc gcggctgttg caaattgccc cccgagctga 360 ttgacgcagt attgtccaat gtcgacctgc actggagctg cattggctgc actaacatgc 420 ttaaaaatcc gcgctgccgt tccgtaaaag aaataggcgc ccaggtcggt ttccaagccg 480 ccctcaactc agctgtagca gctattggca agctcgttga gcctattgtc gccgaagttc 540 gcagtggttt taccctcctt caaactgcat ccacgcctca caatcggaac tctgatcctc 600 gaccagctac tggtagaaag cggaggcgta taatcgagga ttcggcatct cctggtgtaa 660 acaaaattgt aaacagtcgc ggcaacaccc tttgtgccgc gtcatcgcca aacgcataca 720 ccaacaccac gattgccgtc cagccggcac ctacacaacc gcatgaactg gtgggaaccg 780 atccgttatc gtcaccgctt caagctgcac ctcgtgagcc attcacagat aggatctgga 840 tccgcctatc cgcttatcaa cggccgtcac tgtggaacaa gtggtcgctt tctgtaaagc 900 gtcgcttagc caccgatgac gttatagcgt attgcctgct gagaagaggg gttagtgtgg 960 acagcatgaa ttggctttca ttcaaagtga gggtcccggc tatccttcga gatgcggcac 1020 tcacaccatc cacctggccc gtcggtatcg gtgtacgtga gttttttcaa tcccgtcaac 1080 acgaccacca aacctcatct cctatagcca cccgaaaccg ctttactaca 
cgcacaccag 1140 ctactagtac tgaacaccgc tacaccacac gcacaccaac aacgactcac cgtttagccg 1200 cacgcacgtc tactccacct gatcctgaaa caacatcatc acaacagtgt caccccccgg 1260 tgaatgatac cttggaagca cccaactcaa cacttgtttc tggaccgccc caaaaccacc 1320 gtgcttcatc tccgcaccta caccagtcta caatcgatcg cttttttctc aactagacga 1380 agttccgctc attgcacgca acaaacacgc caagcctcat gctctgacaa cgcagctcaa 1440 ttgacttacc gtttacccgc tttatccaac cgccacaatg acaacgctac cgctgagtat 1500 ttaagctgct attatcaaaa cgttagaggt ttgcgtacta aaacaaaaga atttcactta 1560 gctgtatcag aggctgactt cgatctcatc gcactcacgg aaacttggct tgttgacaac 1620 attccatctg ctcttctctt caacaacaac ttctctgttt accgctgtga tcgctctctc 1680 agcggctcta gctctcgcgg tggtggtgtt ttgctagccg tttctaacgc atacgagtcg 1740 atagaattac cttcccgtga tcgttctctc gagtatattt gtgtgcgcgt cgcctgtaat 1800 aacgcacatc tgtacgtcat ggtagtatac attccaccgc agcttagctc cgagatatcg 1860 actcttcgct ctctacatga ttgtatcagc agcttcactc tcagactgaa gccttcggat 1920 ctgctgtttg ttattggcga tttcaatcag cctagcataa gctggtccac agctgatcct 1980 tcgtcctcgc cagcatactc atcaatcacg cactatgaac caactgcgcg ctcactggcc 2040 aacaacacct ttgtggatgg atttaaattt aacggattag tgcaacttaa tcacatcaat 2100 aattcacacg gacgcatgct tgacctgctc tacgctaaca atgctgcagc taaattgtgt 2160 tcgcctgtct ttccaagtgt tgttccgctt gtacctcttg actcctacca tccggcgtta 2220 gacttcaata tacgtattaa ctcgtcgacc cgacgcaatt cgacgactcg acaaaactcg 2280 acgacaaccg cattgtatcg ttataacttt gctaaagctg actatgtgaa actgaacgac 2340 atgatatcaa tgtttaacaa tagctttcat tgttccaatt ttatttcact tgacgaagca 2400 gtatgttcat tctcgtcctt catgctgcaa gcttttgttg tatgcgtccc agttcagcgt 2460 cctaaaccta accccccctg ggcggaccgc acactcaaac gactcaaacg tgtaaaaaga 2520 gctgcttatc gtcactacca aacgcgccgc tgccaaagat ctcgctcgat ctactttgac 2580 acgcactctt tatactgcag ctataacagg tttcgatatc gcagatattt gagtaaaatt 2640 caacgtaacc tctgcaggtg gcctgactcg ttttggcgct tctacaacag caaaacaaaa 2700 tccacgcata caccgaaatc cattacgtac aaaggagcaa caagtgccaa cactaatgaa 2760 atgtgcaatc tcttcgcgga tcgcttcgca gattgcttct caccggccat gaatgatacc 
2820 gataccattg atgctgctct cgtcaacact ccggctggag caattaacat gagcactcct 2880 ttcatcgaca gtgagatcgt tttatctgcc ctagcgcaac taaagccttc cttcgctcct 2940 ggacccgacg gaattccttc taccgtgctg aaacgctgtc aaacgacggt agcacctatc 3000 cttgcaaaat tgttcaatgc atcgctagcc aatggctact ttcccaaagc gtggaggaaa 3060 tcttggatgg ttcctattta caaaaagggc gacaggacag atgccatcaa ctaccggggt 3120 attacatcct tgtgtgccat tgccaaggtg ttcgaactag tgatatacaa aaatctgcta 3180 catgcatgcc gcagctacct aagtccgtat caacatgggt tcgtgccaaa aaagtcgact 3240 accacgaacc tggttgaatt tgtaacctat tgcactagtc aaattgatgc cggagctcaa 3300 gtcgatgcaa tttataccga tctaaaagca gcattcgact ctcttccgca cgcaattctt 3360 ctcgctaaac tcgataagct aggaattccc agcccgctcg tacagtggct taagtcgtac 3420 ctaattcatc gcacatacat cgtgaaaatt gataagcaca tgtccaaaga aatagtcagc 3480 agctcgggtg tgccacaagg aagcaacatt ggcccgcttc tcttcatatt gtttatcaac 3540 gatgttaccc tcgctttacc tcccgacagt atcagtctgt ttgccgacga cgcaaaaatt 3600 tttgcgccta ttaacaacac aggtgattgt acattcctgc aagactgcat cgaaattttc 3660 tgttcgtggt gcaagcgtaa tggactgact atctgcatcg agaaatgcta ctgtgtgtct 3720 tttagtcgat gcaggagccc agtgactggg acctacttca tggacggcac tgcagttaat 3780 cgacagaatc atgccaaaga cctgggcgtt ctgcttgact ctagtttgaa ctttaaacag 3840 catatcgatg acgttgtagc cagaggaaat caattacttg gcgtggttat ccggacaact 3900 aatgaattcc gcaaccccat gtgcatcaaa gctgtgtaca actgtatcgt tcgttcggtt 3960 ctggaatatt cgtgcgtagt ctggagccca actaccgctt cttcaattgc tcgaattgag 4020 gcgattcaac gtaagctcac gagatatgcc ctacgcctac ttccctggca ggatcgcaat 4080 aatcttcctc cgtatgctgc gcggtgccgt cttctaggcc ttgaacctct ttcggtcaga 4140 agacgcaatg cacagtgttc tttcatcgct ggattgctaa atggctctat cgactcatcg 4200 ccattgttgc atcgagtcga catctatgca ccatcccgaa cacttaggtc tagagaaact 4260 ctacggctcg ctcaaccccg ttccagtgct ggtcggtcag accctatgtt ccgcatgtcg 4320 gctgtcttca acactgtctc ggattgcttc gacttcgaca tctcaactca gtgcttcaag 4380 gaacgtctcc ggcttttgcc gtggccgcag tgaattgcga tgcaaatctt gtttttgcta 4440 tgtatttttt tttttttatt gaactgttac atcttaatta ggccatacgg ccgttgaaga 4500 
ttaataaata ataataataa taataa 4526 // ID GYPSY35-I_AG repbase; DNA; ANG; 4409 BP. XX AC . XX DT 16-APR-2004 (Rel. 9.03, Created) DT 16-APR-2004 (Rel. 9.03, Last updated, Version 1) XX DE GYPSY35-I_AG is an internal portion of retrotransposon GYPSY35_AG DE - a consensus sequence. XX KW Gypsy; LTR Retrotransposon; Transposable Element; 4-bp TSD gag; KW AP protease; GYPSY35-I_AG; GYPSY35-LTR_AG; Gypsy clade; KW MDG3 lineage; RNase-H; reverse transcriptase; KW integrase GYPSY35_AG. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-4409 RA Tubio M.J., Costas C.J. and Naveira F.H.; RT "GYPSY35_AG, a member of the MDG3 lineage of the Ty3/gypsy group RT of LTR retrotransposons in Anopheles gambiae."; RL Repbase Reports 4(3), 62-62 (2004). XX DR [1] (Consensus) XX CC GYPSY35_AG is a family of Gypsy-like LTR retrotransposons CC that, according to the amino acid sequence of its Reverse CC Transcriptase CC RNase and Integrase is CC phylogenetically grouped with representatives of the MDG3 CC lineage of other organisms. CC GYPSY29_AG, GYPSY30_AG, GYPSY31_AG, GYPSY32_AG, GYPSY33_AG, CC GYPSY34_AG, CC GYPSY36_AG, GYPSY37_AG and GYPSY38_AG are other CC members of this same lineage in Anopheles gambiae. CC The GYPSY35-I_AG consensus was reconstructed after multiple CC alignment of 3 copies. CC The consensus encodes the 1364-aa GYPSY31_AGp gag-pol like CC protein CC (pos. 262-4353). CC The sequence of the LTRs flanking GYPSY35-I_AG is deposited as CC GYPSY35-LTR_AG. 
XX FH Key Location/Qualifiers FT CDS 262..4353 FT /product="GYPSY35_AGp" FT /translation="MNELEQRILDLCEEFTKTGLAKRCQQKGLPSTGTKEE FT MARSLVESEEMSIADDSTMDQFHDMEEKSEAAASSAAAKVEPSIVQQPYSF FT RDVEEGIEAFCADGSVDFAAWLTDFEDVATMAGWSDEQKYVMCRKKLVGTA FT RSFLLTMRGVSSFGALRKALVAEFGEKVRPMDIHRQLASRRRKKGESALDF FT IYSMQRIAKQIDLDEESVCEYIVDGIAESEAQRAALYEARTVRELKEKIAW FT QERAAQKQTRAVRSAWHERAGQKENKSLPVQDTIPVKNATKARCLNCGVMG FT HMVRNCPEQRSGPRCFQCNEFGHRANQCGQRNKQQPGTSTNLINHVQAELP FT TKMVQLGGKNIKAVIDTGSEVSIVRQDTLQDLGCVQQRMEQSIQQLRGFGG FT IIQKPIGELVTNIGIDDVEYEIGLLVVSPNSMKVPMLVGMNFLRSVCYAIT FT NDGVQIFKKKSDNEESGSDHESAVYKIDCEQIQELEVPEKFRDKVQALKSQ FT YEASVVDNYEDNCPIQMKIVLEENASPFRHTPRRLPLMEEEAVENHVGEWL FT RKGIVRPSTSDFASRVVVVKKKDGTSRICIDFRKLNTMVLKDGFPVPLLEE FT VLEQLQSAKVFTVMDLENGFFHVPIEEASRKYTAFVTKSGLFEFNRVPFGL FT CNSPAVFIRFVNYVFQNLLRENVLHIYMDDIVVCGSTAEECLEKMGKVFEV FT AAQNGLKIKWKKCRFLQTSIEFLGHHIENGSIWPGQEKVSAVKNFPIPKNI FT RAVQAFLGLTGFFRKFIKGYSIMAKPLTDLLRKEVDFKMGSIEIEAFNDLK FT SALVKEPVLKIYDRNAKTEVHTDASIKGFGATMLQWFDGELHPVYFWSKKA FT TDAEAKHHSYVLEAKAIYLALKKFRHYLLGIKFKLVTDCNAFKQTLKKADV FT PREVVQWVICLEEFSFEVEHRAGDRLKHVDCLSRYPQLTMMVTCEVTARVK FT RSQQRDDSIKAIVEILNTRPYEDYKIKGELLYKCVEGQDLLVIPRDMEKQI FT ISDIHSEGHFGLCKTMHAIKQRYFIPHLERKVKLILNSCVKCIIHNKKLGR FT KEGFLHPIDKGDQPLQTLHLDHVGPMDATGKQYKYVLTVVDGFSKFVWLYP FT TKTTGAEETLRKLECWSTIFGYPARIVTDRGSAFTANAFGEYANRHGIQHV FT VCTTGVPRGNGQAERVNRTMLSVLTKLSSEDPGKWFKVVPSVQRAINSHIH FT VAAGKSPFELMFGTKMRTSEVEDLSRLLEEEAYDRFDCERQEMRKGAKNDI FT CNAQNVYKKHYDLSRKPEYAYVVGDLVAIKRTQFVAGRKLASEFLGPYEVI FT KINRNGRYKVKRAANCEGPNITTTSCDNMKLWSFAISNDKVFSEDEAESEN FT E" XX SQ Sequence 4409 BP; 1390 A; 734 C; 1202 G; 1083 T; 0 other; tttggtgtca gaagtgggat gaaaatccag caaacagtgg ttttacgtgt agcttgtgga 60 atgaacaaga gtgattagaa cgttgagtag attgatctaa gaaaatcaat tgaagtgagg 120 gaaaagtgat cccgtgacta attgatttga ggtgatcaga gttagactca aatttctgtg 180 gagtgaaagt cagtcaaagt gaaggaaagt gttcgttcct gtgaccgatt gattttcgcc 240 aaaaaagttg actgatcaaa aatgaatgag ctggagcagc gcattcttga tctgtgtgaa 300 gaattcacga agactggcct 
agcgaaaagg tgccagcaaa aagggttgcc ctcaacggga 360 accaaagaag aaatggcaag atcactagtg gaaagtgaag aaatgtccat cgcggatgat 420 tctacaatgg atcaatttca tgacatggag gaaaaatctg aagcggcagc tagcagtgca 480 gcagcgaaag tggagccgag catagtacag cagccctact cctttaggga cgtggaagag 540 ggaattgagg cgttttgtgc agatggctcg gtggattttg ctgcgtggct gaccgatttt 600 gaggatgttg cgaccatggc gggatggtca gatgagcaaa aatatgtgat gtgtcgaaaa 660 aagttggtgg gcacagcgag aagttttttg ctaacgatgc gtggggtctc atcgttcggg 720 gctcttcgca aggcgctagt agcagaattc ggagaaaaag tgcgtccgat ggacattcat 780 cgtcaacttg catcacgacg acggaaaaaa ggggaatctg ctttggattt catttactct 840 atgcagcgaa ttgccaaaca gattgacctt gacgaagaaa gtgtgtgtga gtacattgtt 900 gatggtatcg cggagagcga agcgcaacgc gcggcattgt acgaagcgcg cacggtgaga 960 gagttgaaag aaaaaattgc ctggcaagag cgtgcagcgc agaaacaaac acgtgcggtg 1020 agatctgcat ggcacgagcg cgcggggcag aaagagaaca aaagtttgcc ggttcaagac 1080 acaattcctg tgaaaaatgc aacaaaagca cgttgtctta attgtggtgt gatgggacat 1140 atggtacgga attgtccaga acaaagaagt ggaccacggt gttttcagtg caacgagttt 1200 gggcatcgtg cgaatcagtg tgggcaaagg aataaacaac aaccgggtac aagtacaaac 1260 ctgataaacc atgtccaagc tgagctaccg acaaaaatgg tccagttggg gggaaaaaat 1320 ataaaagcgg ttattgacac aggcagcgaa gtgtcgattg tccgtcaaga cacattgcaa 1380 gatttagggt gtgtacaaca aaggatggaa caatccatac aacaattgcg tggttttggt 1440 ggaatcatac aaaagccgat aggcgaattg gtaacaaaca ttggcatcga tgacgttgaa 1500 tacgaaatcg gtttgctggt ggtctctcca aattcgatga aggttccaat gctggtaggc 1560 atgaattttc ttcgaagtgt ttgttacgct ataaccaacg atggagtgca gatttttaag 1620 aaaaaaagtg acaatgagga atcgggtagc gatcacgaaa gtgctgtgta taagattgat 1680 tgtgaacaaa tccaagaact ggaagtgccg gagaagtttc gagacaaggt acaagcgctg 1740 aaaagtcaat acgaagcttc agtggtggat aattacgaag acaattgtcc catacaaatg 1800 aagatagtgt tggaagaaaa cgcgtcacct tttcggcata cgccccgtcg cttgccttta 1860 atggaggaag aagcagtcga aaaccatgta ggggaatggt tgaggaaggg aattgttcgt 1920 ccatcgacgt cagactttgc aagtcgcgtt gtggtagtga aaaagaaaga cggcactagt 1980 cggatttgta tcgactttcg gaaactgaat acaatggttc 
taaaggatgg gtttccagtt 2040 ccattgttag aagaagtttt ggagcagctt cagagtgcaa aggtgtttac tgttatggat 2100 ttggaaaacg ggttttttca tgtgccgata gaagaagcta gccgcaagta taccgcattc 2160 gttacaaaat cgggattatt tgaatttaat cgtgtgccat ttggactatg caactcgcca 2220 gcggtgttca taaggtttgt gaattatgtt tttcaaaatt tgttaagaga gaatgtgttg 2280 catatttaca tggacgatat tgtggtttgt ggtagtacgg ctgaggagtg tttagaaaaa 2340 atggggaaag tgtttgaagt agcagcccaa aatggattaa aaataaagtg gaaaaagtgt 2400 cgtttccttc aaacatcgat tgagtttttg ggacatcata ttgaaaacgg atcgatttgg 2460 cctggacaag aaaaagttag tgcggtgaaa aatttcccta taccgaaaaa cataagagca 2520 gtgcaagcgt ttttgggttt gacgggattt tttaggaaat ttataaaagg atattctatt 2580 atggccaagc ctcttactga cttgttgcgg aaggaggtag attttaaaat gggtagtata 2640 gagatcgaag cctttaatga tttgaaaagt gcgttagtga aagaaccggt gctcaaaata 2700 tacgatcgaa atgctaaaac agaagtgcac acggatgctt ccatcaaagg gtttggggca 2760 acaatgttgc aatggtttga tggagagctg catcccgtat atttttggag caagaaggca 2820 accgatgcag aggccaaaca tcacagctat gtgttggaag caaaggctat ttatcttgcg 2880 ttgaaaaaat tcaggcatta tctcctgggg ataaagttca agcttgtgac cgactgtaat 2940 gcattcaaac aaacattgaa gaaagcagac gttccgcgag aggtggtgca gtgggttatt 3000 tgcttggagg agtttagttt tgaggtggaa catcgtgccg gagaccgatt gaaacacgtc 3060 gattgtctca gtcgatatcc acagttaaca atgatggtga catgtgaagt gacggctcgt 3120 gttaaaagaa gtcaacaaag ggacgattca ataaaagcga tagttgaaat tcttaacacc 3180 agaccgtacg aagattacaa aataaaagga gaacttttat acaagtgtgt cgaaggacaa 3240 gaccttttag tgattccgcg tgatatggaa aaacaaataa tttcggacat tcatagtgag 3300 ggtcattttg gactatgcaa aaccatgcac gccataaaac aacggtattt tataccgcat 3360 ttagagcgaa aagtgaaact gattctcaac agctgtgtga agtgtatcat ccacaataag 3420 aagttaggac gaaaggaagg gttcctgcac cccattgaca agggagatca gccactgcaa 3480 acccttcacc tcgatcatgt agggccgatg gacgccacgg gaaagcagta caaatacgtt 3540 ttgactgtag tggatggttt ttccaagttc gtctggttgt acccgacaaa aacgactgga 3600 gcggaggaaa cgttgaggaa gctggaatgc tggtcgacaa tttttggcta tccggctcgt 3660 atcgtaaccg acaggggctc agctttcaca gccaacgcat ttggtgagta 
cgctaatcgc 3720 catggtatac agcacgtcgt ttgtacaaca ggagtgccga ggggcaacgg tcaagcggag 3780 agagtgaacc gcaccatgtt atcggttttg acgaagttgt catcggaaga tcctggaaag 3840 tggttcaagg tggtgccttc tgtgcagcga gcgattaact ctcatattca cgtagcagct 3900 ggaaagtcgc cattcgagtt gatgttcggg acaaagatga ggacgtcgga ggtggaagac 3960 ttgtcgcggt tactggagga ggaagcgtat gatcgttttg attgtgaacg gcaggaaatg 4020 cgaaaaggag cgaagaatga catttgtaat gcgcaaaatg tatataaaaa acattatgat 4080 ttatcacgaa agccggaata tgcgtacgtg gtaggagatt tggtagccat caaaagaaca 4140 caatttgtag ccggaagaaa gcttgctagc gaatttttgg gaccgtatga agttattaaa 4200 attaatcgta acggacgata caaggttaaa agggcagcca attgcgaggg accgaacata 4260 acaacaacaa gttgcgacaa catgaagtta tggtcgtttg ccatcagtaa cgacaaagta 4320 ttttcggaag atgaagcaga atcggaaaat gaataaaatg aattagaaga aaaatcttca 4380 ggggctgaag atagggtagg acggccgaa 4409 // ID HATN3_AG repbase; DNA; ANG; 348 BP. XX AC . XX DT 12-MAR-2003 (Rel. 8.02, Created) DT 12-MAR-2003 (Rel. 8.02, Last updated, Version 1) XX DE HATN3_AG is a hAT-like nonautonomous DNA transposon - a consensus DE sequence. XX KW hAT; DNA transposon; Transposable Element; Nonautonomous; KW 8-bp TSD; HATN3_AG; nonautonomous DNA transposon; KW hAT superfamily. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-348 RA Kapitonov V.V. and Jurka J.; RT "HATN3_AG: a family of nonautonomous hAT-like DNA transposons RT from African malaria mosquito."; RL Repbase Reports 3(2), 20-20 (2003). XX DR [1] (Consensus) XX CC HATN3_AG is a family of nonautonomous DNA transposons that CC belongs CC to the hAT superfamily. HATN3_AG copies are ~3% divergent from CC the CC consensus sequence. HATN3_AG has 15-bp terminal inverted repeats. 
XX SQ Sequence 348 BP; 61 A; 88 C; 113 G; 84 T; 2 other; cagtggcgga tttagggtgt tgggggccct aagcgttaaa aggagtggag gccccccggt 60 tggtgtcgag cggggggggg gggggggtgc atattgttgc attaggtttt gaatcaaacc 120 cacttcaatg gtttgctggc gggggggggg ggkgtstgct cagaatccct gtactgggcc 180 cctatagttt atgagtccct tcatccatgg ggcccctaac tagcacagaa gctcttgttg 240 agactactgt acacattgtt ctatatgctg gacttcctta cacggggccc ctacattgac 300 ggggcccccc gcggccgctc agtccgcgca ccgttaaatc cgccactg 348 // ID GYPSY17-LTR_AG repbase; DNA; ANG; 869 BP. XX AC . XX DT 03-OCT-2003 (Rel. 8.09, Created) DT 03-OCT-2003 (Rel. 8.09, Last updated, Version 1) XX DE GYPSY17-LTR_AG is an LTR of retrotransposon GYPSY17_AG - a DE consensus sequence. XX KW Gypsy; LTR Retrotransposon; Transposable Element; KW 4-bp TSD GYPSY17_AG; GYPSY17-I_AG; GYPSY17-LTR_AG; Gypsy clade; KW mdg1 lineage. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-869 RA Tubio M.J., Costas C.J. and Naveira F.H.; RT "GYPSY17_AG, a member of the mdg1 lineage of the Ty3/gypsy group RT of LTR retrotransposons in Anopheles gambiae."; RL Repbase Reports 3(9), 176-176 (2003). XX DR [1] (Consensus) XX CC GYPSY17-LTR_AG is a long terminal repeat of GYPSY17_AG CC (its internal portion is deposited as GYPSY17-I_AG). 
XX SQ Sequence 869 BP; 261 A; 232 C; 170 G; 206 T; 0 other; tgtagcagat ttactgctaa cagtaggtca catctaactt agcattgcca aaagccaggt 60 aaaaagacgt taaccaccag aagcgcaagc gcatagccag aagccaggta aagaaacgtt 120 aaacaccaga agcgtaaacg ttctcgtgtt ttaccccaga acataaataa ggaaaacgcg 180 ccacagataa gcaattaaac ttccgtaaaa cccaggcata aacaggaaaa cgttcgctac 240 atccacaatg cacggatagt tgataaaaca caggcacgct aggtataatt taagaaacac 300 ctgtaaaaac ccaaacccgg aaaaccaaat gaaagctcac atttgacaaa catctggttg 360 ccaaatgtgt cacaactttc acaccatctt tcttataaat aaagctcgaa ttgatggcga 420 agtcagttca ccttccatag acacgtagat cactcgtgtt ctggtgcatc tcaccaattt 480 gcagattccg ccgaaggctc tgctcatatt tgataatcat ttttggttat cagctgttca 540 gtcggttgac cgatcagcat taccctcatc gaaataaagt gcacgtttgc actagttcca 600 cccaacttct cccacggtgt ccgattccgc ctcggtcatc aactcgacgc gcaaggtcga 660 tccagtccgc gattccgccg acgtaagcaa gtgcttcttt tcggacttag cgtaagtgtg 720 aacaggttga cgtaaccgat acgcgacgaa tgtgtttaca ttttggtgtc gcgttccgta 780 cgatcccctt cctcgtccca caccctttca ctgtgctccg cgcattgact caccgaaacg 840 gtttgagaag cgcggccgaa cctaccaca 869 // ID DNA-1_AG repbase; DNA; ANG; 310 BP. XX AC . XX DT 03-SEP-2010 (Rel. 15.09, Created) DT 03-SEP-2010 (Rel. 15.09, Last updated, Version 1) XX DE Putative non-autonomous DNA transposon: consensus. XX KW DNA transposon; Transposable Element; DNA-1_AG. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-310 RA Jurka J.; RT "Non-autonomous DNA transposons from mosquito."; RL Repbase Reports 10(9), 1426-1426 (2010). XX DR [1] (Consensus) XX CC 3bp tsd (taa/tta). >97% identical to consensus. 
XX SQ Sequence 310 BP; 91 A; 60 C; 67 G; 92 T; 0 other; ggcccgtgtc tgtgctccat cggatgcgat tttatcggaa ggcctgtcaa aattttatat 60 gggatttgac agataacgtc ggacgtgcga tttcgtcgta cgacggaatc aaaaaatttt 120 gatttcgtcg gacgccgcat ccgattttat ctgtcaaact aaatgtgtgg tgttttgtag 180 gaacaatctc aacttaattt ttcataatcg ttatattatt aaaatcatcc aaataatcgc 240 aatattatac gaaaaagcgt ttgacagcac gtgcgataaa atcgcatgcg atggagcaca 300 gacacgggcc 310 // ID GYPSY53-I_AG repbase; DNA; ANG; 4127 BP. XX AC . XX DT 16-APR-2004 (Rel. 9.03, Created) DT 16-APR-2004 (Rel. 9.03, Last updated, Version 1) XX DE GYPSY53-I_AG is an internal portion of retrotransposon GYPSY53_AG DE - a consensus sequence. XX KW Gypsy; LTR Retrotransposon; Transposable Element; 4-bp TSD gag; KW AP protease; CsRn1 lineage; GYPSY53-I_AG; GYPSY53-LTR_AG; KW Gypsy clade; RNase-H; reverse transcriptase; KW integrase GYPSY53_AG. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-4127 RA Tubio M.J., Costas C.J. and Naveira F.H.; RT "GYPSY53_AG, a member of the CsRn1 lineage of the Ty3/gypsy group RT of LTR retrotransposons in Anopheles gambiae."; RL Repbase Reports 4(3), 98-98 (2004). XX DR [1] (Consensus) XX CC GYPSY53_AG is a family of Gypsy-like LTR retrotransposons CC that, according to the amino acid sequence of its Reverse CC Transcriptase, CC RNase and Integrase is CC phylogenetically grouped with representatives of the CsRn1 CC lineage of other organisms. CC GYPSY48_AG, GYPSY49_AG, GYPSY50_AG, GYPSY51_AG and GYPSY52_AG CC are other members of this same lineage in Anopheles gambiae. CC The GYPSY53-I_AG consensus was reconstructed after multiple CC alignment of 5 copies. CC The consensus encodes the 295-aa GYPSY53_AG1p gag-like CC polyprotein (pos. 115-999) and the 1034-aa GYPSY53_AG2p CC pol-like polyprotein (pos. 1003-4104). 
CC The sequence of the LTRs flanking GYPSY53-I_AG is deposited as CC GYPSY53-LTR_AG. XX FH Key Location/Qualifiers FT CDS 1003..4104 FT /product="GYPSY53_AG2p" FT /translation="RHSADAMGESGEHSSTVTIFRLKITDSTSCMQFLIDT FT GADVSVVPRGLHSAKVKPTSLQLFAANGTPIKVYGEVLLKVNLGLRREFLW FT SFLIADVTSGIIGADFLRNFDLLIDLKRNRLIDNTTHLESNGSLAKSSKYS FT IKTFNSVSPYADLLAQFPTITRLAPPGTISQSTICHRIETTGQPVYARPRR FT LPPDKLEAARTEFEHLMKLGICRPSSSNWASPLHMVKKADGSWRPCGDYRA FT LNAQTVPDRYPLPYLQDFTTILQGKTIFSKVDLQKAFHQVPIHPDDVPKTA FT ITTPFGLFEFSFMTFGLRNAAQTFQRLIHEVVRGLDFIFPYIDDLFIASSS FT PDEHREHLRMLFERLKQHNLAINVAKCEFGRDTLDFLGHSVSAEGIRPLLE FT RVEAVKNFKQPSTVKDLKSFLAMINFYRRFVPNAIQVQIPLLTMIPGNKRN FT DRTPLTWTDETMAAFNRCKQQLAEAVLLAHPAKSAELSLWVDASDIAAGAV FT LHQIVDGNAQPLGFFSKKFDKAQLRYSTYDRELTAIFLAVRHFKFMLEGRN FT CHIYTDHKPIIFAFHQKLDKASDRQARQLDFIGQITTDIRHIVGKENIVAD FT LLSRIAAIQTVPTLDYDALADDQKIDEEIQDIQQGKLKLSLNLKSFSIPGS FT KKQLLCDCSGDRIRPFFTRRFRNTALQATHHLSHPGTRATATMMTERFVSP FT NIRKESIAFARNCLQCQRSKVTRHTQSPVSRYSAPDSRFAHINIDIVGPFP FT PSRGNRYCLTIIDRFTRWPEAFPIPDMTATTIADALVSGWVSRFGVPACIT FT SDQGRQFESSLFSELARLLGSSHFRTTAYHPQSNGIIERWHRTLKSAILCH FT DADHWSEHLPIILLGLRTVYKEDIKASPAELVYGTTLRIPSEFFVDVSQST FT NEAEFVTKLRRAMRDLRPQETARHGNRSVFMHQDLQTCRNVFVRNDSIRPS FT LSLPYEGPFEVMNRTDKFFKLNIRGRTTNVSIDRLKPAYALEEQQTPIITK FT HAATTPTLTPSAKVTRSGRQVIIPSRYR" FT CDS 115..999 FT /product="GYPSY53_AG1p" FT /translation="MMNSEDKKTDAAADGTSRDAAQSLVGEIKAPRLNPPS FT MTDTNIETYFMSLEFWFAASGISSAYDVRRYNIVMAQVPPNKLTELRSIID FT GTPAYDKYKYIKRELIVHFADSQQRRLQRVLSDMPLGDMKPSRLFNEMKRV FT AGDALSEEVLLDLWAARLPPHAQVAVIASRGDAADKTTIADAIVDSMGLRK FT IDAIDHMQNFAMAPSHIREQEVVDPIQELHREIAELSKRLERVLPNRRNER FT GRSLSRSRRSDWNDNRSLREPSTGPCWYHRTFGNDARRCRKPCTTGARFAS FT NQQ" XX SQ Sequence 4127 BP; 1107 A; 1086 C; 948 G; 986 T; 0 other; gtggtgaccc cgacgtgatc caccatcatt taatccgtaa tttgctaatt cattaccgca 60 tcgaaccgtg catcgcaaat taagtgtaaa ttcggttgat tacgttttcg cgcaatgatg 120 aatagtgaag ataaaaaaac cgacgccgcc gccgacggca cgagtaggga tgcagcacaa 180 tcgctcgtag gagaaataaa agcaccgaga ctaaatccgc 
ccagcatgac cgacacaaac 240 atcgaaacct atttcatgtc gctggaattt tggtttgcgg catcggggat cagctcagca 300 tacgatgtgc gccgatataa tatcgtgatg gctcaggtgc cacccaataa gctgacggaa 360 ctacgctcga tcatcgatgg cactccggca tacgataagt acaaatacat caaaagagag 420 ttgatagtgc attttgccga tagccagcag cgccgtcttc aacgtgtcct ctcggacatg 480 ccgctaggag acatgaaacc tagtcgttta tttaacgaga tgaagcgagt agcgggcgat 540 gcattgagtg aggaagtgct gcttgatttg tgggcagctc gattgccgcc acatgctcaa 600 gtagcggtga tcgcttctcg aggagatgct gcggataaaa cgaccatcgc cgacgccatc 660 gttgattcga tgggactacg aaaaatcgac gccattgacc acatgcaaaa tttcgcgatg 720 gcgccttccc acatcagaga acaggaggtt gttgacccta tacaggaact gcatcgtgaa 780 atcgccgaat tatctaaacg gctcgaaaga gttttgccca accgaagaaa tgaacgcgga 840 cgatcgcttt cccgttctcg tcgtagtgat tggaacgata accgtagctt aagggagcct 900 tctaccggac cgtgttggta tcatcgcacg ttcggcaacg acgcccggag gtgtcgaaaa 960 ccgtgtacga ccggtgcacg atttgcatct aaccagcaat gaagacactc tgctgatgct 1020 atgggggaat ccggcgaaca ttcctccaca gtcacaatct ttcgcttaaa aattacggac 1080 tcaaccagct gcatgcagtt tttaatcgac acgggtgccg atgtgtccgt agtgccgcga 1140 ggattgcatt cagcaaaagt gaagccaacc tctctacaac tgttcgccgc aaatgggaca 1200 ccaattaaag tttatggaga agttctacta aaggtcaatc ttggcctccg acgagaattt 1260 ttgtggagtt tcctgatcgc cgacgtcaca tcaggaataa ttggtgccga ttttttacga 1320 aatttcgatt tattaatcga tttaaaaaga aatcgcctca tcgacaacac cacccatctc 1380 gagtctaacg gatcgcttgc caaatccagt aagtactcca ttaaaacgtt taattcagtg 1440 tcaccatacg cagatttact ggctcagttc ccgacgatca cccgtctcgc acccccgggg 1500 acgatcagcc aatcgacgat ctgtcatcga attgaaacta caggacaacc ggtatacgct 1560 cgaccaagac gcctgcctcc cgacaaacta gaagcagctc gaactgaatt cgaacacctg 1620 atgaaactcg gcatatgccg accctcaagc agtaactggg ccagccctct tcatatggta 1680 aaaaaggcgg atgggtcctg gcgaccatgc ggggactacc gtgccttaaa cgcccagacg 1740 gttccagatc gataccctct gccctacctt caggacttca cgaccatttt gcaaggtaaa 1800 actatatttt ccaaagttga cttgcagaag gcgtttcacc aggttcccat ccatcctgac 1860 gacgtaccta aaacggccat cacgacacct tttggcctct ttgagttctc ctttatgacg 
1920 tttggattaa ggaacgcagc tcaaaccttc caacgcctga ttcacgaggt cgttcgggga 1980 ctcgatttca tctttccgta tatcgacgat ctgttcatcg cttcgtcttc gcctgatgag 2040 catcgggagc atttacgcat gctgtttgag aggctgaagc aacacaactt ggcaataaat 2100 gtagccaaat gcgaattcgg gcgtgacaca ctcgattttt tgggacattc tgtctccgct 2160 gaggggatcc gtccgctttt ggagcgcgtc gaggccgtca agaatttcaa acagcccagt 2220 acagttaaag atttgaaaag tttcttggcc atgatcaatt tttatcggcg gttcgtccca 2280 aacgcgatcc aggtgcagat tcctctgctg acgatgattc ccggcaataa gcgaaacgat 2340 cgtacgccgc tgacctggac cgatgagacg atggccgcat ttaaccgttg caagcaacag 2400 ctagccgaag cagttttgct agctcacccc gccaaatcag ccgaactatc tctctgggtc 2460 gacgcttcag atatcgctgc aggcgcggta ttgcaccaaa ttgtagatgg aaacgcacaa 2520 ccgctaggat ttttttcgaa aaaattcgac aaggctcaac ttcgctacag cacgtatgat 2580 cgcgagctta cagcgatttt cctggctgtg cggcatttca agttcatgct agaaggacga 2640 aattgccaca tctacacgga tcacaagccg atcatcttcg cctttcatca aaagctcgac 2700 aaagcttccg atcgccaagc tcgtcaattg gacttcatcg gacagataac gacggacata 2760 aggcatatag tgggcaagga gaacatcgtt gctgatttgc tctcccgcat cgctgccatc 2820 caaactgtgc ctacgctcga ttacgacgcc cttgctgatg accaaaaaat cgacgaggaa 2880 atccaggaca ttcagcaagg taaattaaaa ttgtcattga atctaaagtc tttttcgatt 2940 ccaggcagta agaaacaact gctatgtgat tgttcaggcg atcgcattcg accctttttt 3000 acgagacgtt ttcgtaacac agcgctgcag gcaacacatc atctttctca ccctggcact 3060 cgtgcaacgg caacaatgat gacggagaga tttgtatcgc ccaacattcg caaggaaagc 3120 atcgcttttg ctagaaactg cctacagtgc caacgatcga aagtcactcg ccacacgcaa 3180 tcgcctgtgt ctcggtactc agcacccgac agtcgatttg cgcacatcaa tatagacatc 3240 gtggggcctt ttcctccgag tagaggaaac cgttattgcc tgacaatcat cgacagattc 3300 actcggtggc cggaagcatt ccccattcca gacatgaccg cgacaacaat cgccgacgct 3360 cttgtttccg ggtgggtatc cagatttggc gttcctgcat gcataacgtc cgatcaaggg 3420 cgccagttcg agtcatcgct gttttccgaa ctagctcggt tgctcggatc cagccatttc 3480 cggactactg cgtaccatcc ccaatcgaat ggtatcatcg aacgatggca taggacactt 3540 aaatcagcaa tcctctgtca cgatgccgac cactggagcg agcatttacc gatcatattg 3600 
ctcggtttgc ggacagtgta caaggaggac attaaagctt cacctgctga gttggtatac 3660 ggaacgaccc tccgaatacc atcggagttt tttgttgacg tatcgcaaag cacgaatgaa 3720 gctgaatttg tcacaaaact acgaagggct atgcgtgatc ttcgccccca ggaaaccgcg 3780 cgacacggaa atcgaagtgt attcatgcat caagacctgc aaacgtgcag gaacgtattt 3840 gtgcgtaacg attcaattcg accatcgtta tcacttccgt acgaaggacc gttcgaggtg 3900 atgaaccgaa ccgacaagtt ttttaaactg aacatccgtg ggcgcacaac taacgtttcg 3960 atagatcgtc taaaacccgc ttacgcgctg gaagaacagc aaacgccaat tattacgaaa 4020 catgctgcca caactcctac tcttactcct tcggctaaag tcacgcgttc gggtcgacaa 4080 gttataatcc cttctcgtta ccgttaactt gtctaggagg ggagtac 4127 // ID COPIA5-I_AG repbase; DNA; ANG; 4021 BP. XX AC . XX DT 03-APR-2003 (Rel. 8.03, Created) DT 03-APR-2003 (Rel. 8.03, Last updated, Version 1) XX DE COPIA5-I_AG is an internal portion of the COPIA5_AG LTR DE retrotransposon - a consensus sequence. XX KW Copia; LTR Retrotransposon; Transposable Element; 5-bp TSD; KW COPIA5-I_AG; COPIA5-LTR_AG; COPIA5_AG; Copia clade; integrase; KW reverse transcriptase. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-4021 RA Pavlicek A., Kapitonov V.V. and Jurka J.; RT "COPIA5_AG, a family of autonomous, copia-like LTR RT retrotransposons from African malaria mosquito."; RL Repbase Reports 3(3), 55-55 (2003). XX DR [1] (Consensus) XX CC COPIA5_AG is a young family of Copia-like LTR retrotransposons. CC COPIA5-I_AG, an internal portion of COPIA5_AG is flanked by 100% CC identical COPIA5-LTR_AG LTRs. The consensus was reconstructed CC from 5, CC >99% identical, copies of COPIA5-AG internal sequence. CC The consensus sequence encodes the 1312-aa COPIA5_AGp protein CC (positions 76-4011). 
XX FH Key Location/Qualifiers FT CDS 76..4011 FT /product="COPIA5_AGp" FT /translation="MALPSSSNSSSSFVGSTSIPSIERLLGRENWTSWKFA FT AKTFLQLEGLWEVVKPVKKEDGTFETVDEKKDLQARLKLILLLDPTIYVHI FT EDAESARSAWDKLEMAFEDKGLSRQIGLLHKLIKSDLDTCGSMNSYVNQVI FT STANQLNAIGFKLPDLWVGMILLAGLPEEYRPMILAMENSGVAITGDYVKT FT KLLQERPLLRNENVQALATNKRREVFKPKQKPSSSKGPQCRKCGRYGHIAK FT FCKDDRKGGTTLCTVLSTFGNSEANEWILDSAAYAHMTSNKDLLSNLQNAT FT GKVVAANGGTLDIVARGTAIIQPKCMEEIVTISDVKLIPGLTSNLLSVSRM FT VEKGYTVQFNTKGCKVYNPSGKLVLTGIHNNNQFKVEQVANNMQEALSCNT FT AESFELWHKRMGHLGAVNLKKLAGGLATGITLKNMDGADCRVCPLGKHSKL FT PFPKKGSRAENVLDLVHSDINGPMETHSLGGHRYYITFIDDKTRRIFVYFL FT KTKSEVEVFEAFKRFHAMAERQSGRKLKTLRTDNGKEYMNKSLTSFLQKEG FT IRHETSNGYTPQQNGLAERANRTIVEMARCLLFEGNMTKGFWAEAVSTAVY FT LINRSPTRGHNLTPEEAWNGRKPDLSHLRVFGTKAMVMIPKEKRRKWDPKS FT HECVLTGFDEETKGYRLYDHKKKQTIISREVIFLDEGCSSNVTVTAMQEPR FT RTFVRLDIEETTSIQPVQIPVPNFAPDAGSDSTLEENDAETDGELDESTTD FT DENNTSTETMVQSEDDSSDFLGFSGSDVDGFVAVASGMYSGTYADPVSHQE FT ALARDDSKEWGTAMQEEYNALMENKTWTLTSLPKGRQAIKCKWVYRTKCDS FT SGNLTRYKARLVVKGFSQRKGEDYDETYAPVVRYCSLRYLFALAIKHDLMI FT DQMDAVTAFLQGDLDEDIYMEQPPCFVDGQRKTLVCKLNKAIYGLKQASRV FT WNNKLDAALQRFGLIPTQYDPCVYVGSEGGKIIIVAIYVDDMMIFSNDVAW FT KKQLKKHLCSCFRMKDLGAAQHCLGIRIQRTKETIKLDQEIYIESILKRFN FT MDKCKPVAVPMNNSEKLTKEESPKSNNETAAMKDVPYQEAVGCLMYLAQST FT RPDILYAVNMLSRFNKNPGQKHWNGVKHVMRYLRGTSNFKLVYKKNVDSKI FT IGYCDADWGSDPDERKSTTGNIFMAQGGAISWMCKKQPTVALSTCEAEYMS FT VSAAVQEASWWRGLSAKLANADEVIEIRCDNQSCIAIAKNGGYHPRTKHID FT IRHHFIKDALSRGIVTLEYVSTEDQIADGLTKPLQRTKFEISRELMGISEA FT " XX SQ Sequence 4021 BP; 1253 A; 755 C; 1017 G; 996 T; 0 other; ggttatgggc ccagactcga aaattgatta aagatccaga atatcgaaga cttgaatcta 60 gaaagtttag tgaaaatggc gcttcctagc tcaagcaatt cctcgtcaag tttcgtcggt 120 tcgacaagta ttccctccat tgaacgtctg ctcggtcggg aaaattggac atcctggaag 180 tttgctgcta aaacgtttct acaactcgaa ggactttggg aagtagtaaa acctgtgaaa 240 aaagaggatg gaactttcga aacggtggac gaaaaaaagg atttgcaagc cagattgaag 300 ctcatccttt tactcgaccc aacaatttat gtccatattg aagacgcgga atcggctcga 360 tctgcttggg 
acaaattgga aatggcgttt gaggataaag ggctttctag gcaaatcggc 420 ttgcttcaca agctgattaa gtctgatttg gatacatgtg gctcaatgaa ttcatacgtg 480 aatcaggtaa tatccacggc aaaccaacta aatgccattg gtttcaaatt gcccgacttg 540 tgggtaggaa tgattctttt ggctggttta ccggaagagt atcgaccgat gattttggct 600 atggagaatt ctggtgttgc aatcacaggc gactatgtga agacaaaact tcttcaagag 660 aggccgttgc tacgtaacga aaatgttcaa gcgttagcca ctaacaaacg tcgtgaagtt 720 ttcaagccaa aacagaagcc ttcttcgtcg aaaggtccgc aatgtaggaa atgtggccgg 780 tatggtcata ttgcgaaatt ctgcaaggat gatcgaaaag gaggaacaac gttgtgtacg 840 gtgctatcga catttggaaa cagtgaagct aacgagtgga ttttggattc ggcagcgtat 900 gcgcacatga ctagcaacaa ggatttattg agcaacctgc aaaacgcaac aggtaaagta 960 gttgctgcta acggaggaac tctggacatt gtcgctcgtg gaactgctat aatacaaccg 1020 aaatgcatgg aagagattgt aacgatcagc gatgtaaagc tgattccagg tttaacatcg 1080 aatttacttt cggttagcag gatggtagaa aagggatata ctgttcaatt caacaccaaa 1140 ggttgcaaag tatataaccc tagcggcaag ttggtgctta ctggtattca caacaataat 1200 cagttcaagg tggaacaggt agcgaacaac atgcaggagg ctctgtcgtg taacacagca 1260 gaaagtttcg aattgtggca taaacggatg gggcatctgg gtgctgtgaa tctcaagaaa 1320 cttgctggtg gtttggcaac tggcatcaca ttgaaaaata tggacggtgc ggattgcaga 1380 gtttgcccgt taggcaaaca ttcgaagtta ccgtttccga agaaaggttc tcgagctgaa 1440 aatgtcttgg atttggttca ttcggacatt aatggaccga tggaaacgca ctcgcttggt 1500 ggacatcgat actacatcac gttcatcgat gacaagacga ggcgtatatt cgtttatttc 1560 cttaagacga aatctgaagt agaagttttc gaagccttca agagattcca cgcgatggcg 1620 gagcgtcaaa gcggaaggaa gcttaaaaca ctacgtacag ataacggtaa ggaatatatg 1680 aacaaatccc taacatcgtt cttgcagaaa gagggcatcc gccatgaaac ttcgaacgga 1740 tacacaccac agcaaaatgg actggcggaa cgtgccaaca gaactattgt ggaaatggcg 1800 cggtgtttac tgttcgaagg aaacatgacc aaaggttttt gggcggaggc ggtttcaaca 1860 gcagtttatc ttatcaatcg ttctcctaca cgtggacaca atttaactcc tgaagaagcg 1920 tggaatggaa gaaaacctga tctttcacat ctgcgtgtgt ttggaacgaa agcgatggta 1980 atgattccga aggagaagcg acgtaaatgg gatccgaaat ctcatgagtg cgttctgact 2040 ggctttgatg aagaaacgaa aggatatcgt 
ctgtacgacc ataaaaagaa gcaaacgatc 2100 attagccgcg aagtaatttt tttagatgaa ggttgttcat cgaacgtgac agttacagca 2160 atgcaagagc caagaagaac attcgttaga ctggacatcg aggaaacaac ttcaatccaa 2220 cctgtgcaaa tcccggttcc caattttgca cccgatgctg gatcagacag cacgttggaa 2280 gaaaatgatg cagaaacgga tggtgaactc gatgaaagta cgacagatga cgaaaacaac 2340 actagtacag aaacgatggt acaatctgaa gacgattcga gtgattttct tgggttttcc 2400 ggaagcgatg tcgatggctt tgtagcagtt gcgtctggta tgtacagtgg aacatatgct 2460 gatcctgttt cacaccaaga agcactcgct agggatgata gcaaagaatg ggggactgcc 2520 atgcaggagg agtataacgc tctgatggag aacaaaacct ggacgctaac ttcgctcccg 2580 aaaggaagac aagccataaa atgcaaatgg gtgtatcgga ctaaatgcga ttcatctgga 2640 aatttaactc gatacaaggc tcgtttggtc gtcaaaggat tctcacagcg aaagggggaa 2700 gattatgatg aaacgtacgc tcccgtggta cggtattgtt cgctgcgata cctttttgcc 2760 ctggctatca agcatgatct gatgatagat caaatggatg cagtaacagc atttctgcaa 2820 ggagatcttg atgaagatat ttacatggag caaccacctt gcttcgttga tggacagcgg 2880 aaaacgttgg tatgtaaact taacaaagct atttatggtt tgaagcaggc gagccgagtt 2940 tggaacaaca aactggatgc agcattacaa cggttcggcc tgataccaac tcagtacgat 3000 ccttgcgtgt atgtaggtag cgaaggaggt aagatcatta tcgttgctat atacgttgac 3060 gacatgatga tttttagcaa cgatgtggca tggaaaaagc agctgaagaa acatctttgt 3120 agttgtttcc gtatgaagga tttaggagca gcacagcact gcctaggtat aaggattcaa 3180 cggacaaaag aaaccatcaa gttggatcaa gaaatctaca tagaatctat tttaaagagg 3240 ttcaatatgg ataaatgtaa accagtggca gttccaatga acaacagtga gaagctgacc 3300 aaggaagaaa gtccgaagag taacaatgaa actgctgcga tgaaagatgt gccgtatcaa 3360 gaagccgttg ggtgtttgat gtatttggca caaagtactc gaccagacat tctgtacgcg 3420 gtgaacatgc tgagtcgatt caacaaaaat ccaggacaaa aacactggaa cggagtgaag 3480 catgttatgc gctatcttcg agggacttct aattttaaat tggtttacaa aaaaaatgta 3540 gattcgaaaa ttatcggtta ctgtgatgct gattggggat ctgatccaga tgaacgaaaa 3600 tccaccactg gcaacatctt tatggcgcaa ggaggagcaa tttcatggat gtgcaagaag 3660 caaccgacgg tagccttatc tacgtgtgag gctgaataca tgtctgtatc ggcggcggta 3720 caggaagcct catggtggcg cggactttct gcgaaactgg 
cgaatgcgga tgaagtgatt 3780 gaaattcgtt gtgacaatca aagctgcatt gcgatcgcga agaatggtgg gtatcatcca 3840 cgaacgaaac acattgatat tcgtcaccat ttcatcaaag atgctctcag ccgtgggatt 3900 gtcactctag aatatgttag cacggaggac cagattgcag atggacttac taaaccattg 3960 caacggacta aattcgagat tagtcgcgag ttaatgggta tctctgaggc ttgaggagga 4020 g 4021 // ID Clu-47B_AG repbase; DNA; ANG; 1195 BP. XX AC . XX DT 03-SEP-2010 (Rel. 15.09, Created) DT 03-SEP-2010 (Rel. 15.09, Last updated, Version -1) XX DE Putative non-autonomous DNA transposon: consensus. XX KW DNA transposon; Transposable Element; Clu-47B_AG. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-1195 RA Fernandez Medina R.D., Struchiner C.J. and Ribeiro J.M.C.; RT "Transposable elements from Anopheles gambiae."; RL Repbase Reports 10(9), 1444-1444 (2010). XX DR [1] (Consensus) XX CC 2bp TSD. >96% identical to consensus. 
XX SQ Sequence 1195 BP; 309 A; 263 C; 319 G; 304 T; 0 other; ccctactagc aatgagaatg taaaagtgtg cgtctttcag gcgaggggca tgtagaagca 60 gggggagacg gttggaaata cgcggaatta cgcggcggcg agagcgtgtt ttgctttcta 120 cacagttttc gcactcacac aatcacacat gtgtgaatgt gtgcgttcac tttcagccag 180 cgcgctcagt ttggttttct cgcagcggct tgctggtgtc ggtgtctcgc tcgcactagt 240 gcagcgagtg ttcctcgtgc gttccctttc ttgctaccac acaaaatcgt cagtacagtt 300 caagccttcg cagtgtgagg ccgcgcgcgt tttagcctaa gtcccgggcg ctcgatgtaa 360 tttgaagtga agtgttgaag tgttatttat ttcaattcac attattacgg cctcggccgt 420 attttcaagt gttgaagtgt tgtgcgcatt gaaataaagc agcgttaatc tttcataatt 480 caacatgaat atgtaaatta tgctgcttta tttcaatgct ttttttacag tattcaacat 540 acacaacata caggcagtaa ggcgggtgct tgaagtgacg atttttttaa ttatttataa 600 taaaaaatat gtgtggtttt gtggcaagat tgtggttagc gatgtgtgga actgtgcctt 660 aagtttcaat aaagtgttaa aatatcaaac gagtttgatg tgtttaagtt ttatttcgat 720 gcatgcgatt ttattacgca gcaaatcaaa gtgaacgaaa gcgcacagcg cacaacgcgc 780 aacgaaagaa acgaggggga gggggtgtcc agcgcacgcg caaagtgcgg caacagcgca 840 gcgcaaaccg aaaagaacgc gcgggagggg gtgttcagcg cagcgcgcaa gtgcggcaac 900 agcgcagcgc aaaccgaaag aaacgcgaga gagcaagaac agctgatccg cgcatccagc 960 gcagcgcgag aggaaaaaaa cgggagggag ggggtatcag ctgatcagct gttttgtatc 1020 gcgagagcgc cccgtgtaat ttgaaggcag cgagagagcg ctgcgcgcgc tcactctcaa 1080 ttcggaaaaa aatcactgcc gctcgactat ggcggccaat acaaatcgac cgaacccctt 1140 ccccctgttt ctacatgccc ctcgcctgta aatttcgctc tctttgtctg tcggg 1195 // ID Mariner-N17_AG repbase; DNA; ANG; 669 BP. XX AC . XX DT 28-FEB-2009 (Rel. 14.02, Created) DT 01-MAR-2009 (Rel. 14.02, Last updated, Version 1) XX DE Putative nonautonomous Mariner DNA transposon - a consensus DE sequence. XX KW Mariner/Tc1; DNA transposon; Transposable Element; Nonautonomous; KW Mariner-N17_AG. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. 
XX RN [1] RP 1-669 RA Jurka J.; RT "Putative mariner/Tc1-like DNA transposons from African malaria RT mosquito."; RL Repbase Reports 9(2), 642-642 (2009). XX DR [1] (Consensus) XX CC TA TSD. XX SQ Sequence 669 BP; 194 A; 118 C; 133 G; 220 T; 4 other; tacagggttt tccattccaa ttaacaactg tccacagtct actagcaatc ctcgatttga 60 aaaacacgat tgtctaccac ttacaaacaa gtcaagccgt ttttggtaca ctgtccattt 120 caatttcaca attgtcgtct tcgaaaacac acgtcagatg cggcacggtt tgtttgggtt 180 gttttggttt tggctggaac gtgtatcact tgtctgtaag aattgaacca tgttgtaaac 240 aggtcttgat tgaatatcag aatatgggaa agctgagcaa twtggacatg ttaaacagtt 300 ttatcatttg rttttagtta ttttgaatcg ttacagtttt taagttattt tgggacgttg 360 taattttttt tcagctacaa gtaaaagcac cgataaatgt caacatttct gcattgtttg 420 aaagattaaa gtatacataa aatgcctaca tgtccatttg ctcagcgtcc taacaacaag 480 tgatacacgt tccagccaaa cccgtgccgc atctgacgtg tgttttcgaa gacgacaatt 540 gtgaaattga aatggacagt gtaccaaaaa cggcttgact tgtttgtaag tggtagacaa 600 tcgtgttttt caaatcgarg attgctagtw gactgtggac agttgttaat tggaatggaa 660 aaccctgta 669 // ID COPIA4-LTR_AG repbase; DNA; ANG; 198 BP. XX AC . XX DT 03-APR-2003 (Rel. 8.03, Created) DT 03-APR-2003 (Rel. 8.03, Last updated, Version 1) XX DE COPIA4-LTR_AG is a long terminal repeat of the COPIA4_AG LTR DE retrotransposon - a consensus sequence. XX KW Copia; LTR Retrotransposon; Transposable Element; 5-bp TSD; KW COPIA4-I_AG; COPIA4-LTR_AG; COPIA4_AG; Copia clade; KW reverse transcriptase. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-198 RA Pavlicek A., Kapitonov V.V. and Jurka J.; RT "COPIA4_AG, a family of copia-like LTR retrotransposons from RT African malaria mosquito."; RL Repbase Reports 3(3), 54-54 (2003). XX DR [1] (Consensus) XX CC COPIA4-LTR_AG is a long terminal repeat of the COPIA4_AG LTR CC retrotransposon. There are ~20 copies of COPIA4-LTR_AG in CC the genome. 
XX SQ Sequence 198 BP; 76 A; 38 C; 31 G; 53 T; 0 other; tgttgttatg aaccaacgca agatgattct catgagaact ggatttagta atgaaaacac 60 agaacgaaat tttatagcaa atagagagaa ttatattgtc aaaattagga ataaagaaaa 120 ccctcttccg ttactgcatc caaccaaaca agacgtgttt tctctcagct ccgaacataa 180 ccctagtagt ttacaaca 198 // ID Ag-Outcast-5 repbase; DNA; ANG; 4989 BP. XX AC . XX DT 29-OCT-2010 (Rel. 15.1, Created) DT 29-OCT-2010 (Rel. 15.1, Last updated, Version 2) XX DE An Outcast clade non-LTR retrotransposon family from Anopheles DE gambiae. XX KW Outcast; Non-LTR Retrotransposon; Transposable Element; KW Ag-Outcast-5. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-4989 RA Biedler J. and Tu Z.; RT "Non-LTR Retrotransposons in the African Malaria Mosquito, RT Anopheles gambiae: Unprecedented Diversity and Evidence of Recent RT Activity."; RL Molecular Biology and Evolution 20(11), 1811-1825 (2003). XX RN [2] RP 1-4989 RA Kojima K.K. and Jurka J.; RT "Outcast clade non-LTR retrotransposons from Anopheles gambiae."; RL Direct Submission to Repbase Update (24-SEP-2010). XX DR [2] (Consensus) XX CC [2] Consensus update. This consensus is generated from 3 CC sequences with >99% identity. XX FH Key Location/Qualifiers FT CDS 84..1037 FT /product="Ag-Outcast-5_1p" FT /translation="MVTKDNTFKVDFNPMPQRPSPGYVIQFILEKIGVKKE FT SLLKVQLMTTTAIAVVQIKESGEAIKIVDNHDGKHAVDYNGEKYFIPIKME FT DNSTRVIIKNCTDQITNEEIKKAMMKFGEVKKVIDCVWEAPSPIAGIRDGN FT REVTIVIDTPIPSFVSIKGEKAHVQYNGQQRTCKYCELNMHLGMTCLQYKK FT TVTNNNEDEGTYANMVRKEGVKSSVQNRLQLERAVEGLNSGEKKQDNTLFK FT KPVNTIKERRVYESVEEAPSTSKANAQKRTGIVDSHTTTETEDESCMETET FT ELKKLKGRPKKTKTLTQETKQDNNTN" FT CDS 1067..4708 FT /product="Ag-Outcast-5_2p" FT /note="apurinic-like endonuclease, reverse FT transcriptase and ribonuclease H."
FT /translation="MKIKMATINQKEISLLQTNITSLKRNKEELERMMNAN FT KITIACITETWMKEEEVNKVNVSNYNLVTSNREDGYGGSAIYIRKNLAYKV FT IKHDIKDKEVQITEIKLTREKINIVSVYMSPQLKINRFRENVSKLLHNRSG FT NQKNIITGDLNCHHKIWGNPTNDSKGNFLLNELNDSNLCLLHNKEKTLIPT FT DRSKRETAVDLTIISEDIIDKCQRTVTEHHMGATNHKTIITKINEIIQEHT FT VTYTNKSKLFKEIEAIEVKETTTAYQVTKQIKKLIKSNKITSKKKPKIYWN FT IETEKAWKNKNELRKIFNNDKSTENKINMNKGIAEFRTKKRLNKIKKFEDK FT LEEVDFNRDPIKLYEFLQKMEGKFKSKTKAGNIIETDRKAAKIFLDKHFFK FT NKKGNKKPYRNFKPNKDIIDIHKWNRLLKKKKNTAPGHDGVSYEILKKINE FT KTRLVMVNEINEMWHKGIIKKHYKKMKIITFPKANQDKNDPLNYRGIALIP FT AIIKMANSAVLEEMNRYWNINKVIPGTTFGFIKNRSINNCTNYITNRIGQN FT KREGKATGIIFLDLTNAYDKVRTDILISKLEKYQTPKEIIRWIGAFLNNRT FT IELNTEDTIINKLISDGLPQGDVLSPSLFNIYTKDIHDKINRLGIQIIQYA FT DDFAIITQSHSHHILQTKMQEIMKQFSDITDELKLEINTEKTKFMIFDTKN FT NNIKIRIKGKSIEKVNNYKYLGLWLDNKLTFKKQIDAMKTKTQKRLMTIKR FT ICNYRNKLNPNKTIQIHRSFIRSNLEIGTASIRNAKKELLKQLDTIQNQSL FT RKVTGCTKTTPIPSLMAIAAEIPLGIRAKYLASNEQAKEIAHSKIHRNLLC FT DRNQNNNTRNNNNKHKTYYEKLYEENKNIINKMDNIVINDNTKRVSIISNA FT PDNITKNQHTNKQILKRHYQQEIEFFRKTNTIIYTDGSRTDDGCGIGIYIK FT PHNKNMYWSYKYKLKNNTPITSVELTAIDKALQIAVENNIINPILYTDSKA FT ACSIIKREQYNNNIEESTYNIIKNSLKINAQIRWLPGHIDIEGNEKADALA FT KKGVTADNIMENKLRKSDIKLMFQQRMIFETQEWYTTEIQHKGKKFAEYQK FT EFKRDNWHKTININANEIKTMNRILTGHDFSKYWLKILKIEENDICDVCQV FT QETGKHQIFHCTKYNNERQKYPNIQEDLFRKHWKNNDGKIIKEIIRFIQDT FT KISL" XX SQ Sequence 4989 BP; 2377 A; 763 C; 826 G; 1023 T; 0 other; cattccaagt tggtacgtca aacggacaac acagtttttc aaagcaaacg ttgaacgagt 60 gatacaaaaa acgattttgc aaaatggtga cgaaggacaa taccttcaaa gtggatttca 120 atccaatgcc acaaagaccg agtccgggct atgtgataca atttattttg gagaaaattg 180 gagtaaaaaa ggagagtttg ctaaaggtgc aactgatgac gacaacggct attgcagtgg 240 tgcaaataaa agaatcggga gaggccataa aaatagtgga taaccacgat ggcaaacacg 300 cggtggacta caatggggag aaatatttca tacccataaa aatggaagat aactcgacaa 360 gagtaataat aaaaaactgt acggatcaga taactaacga ggagattaaa aaagccatga 420 tgaaatttgg agaagtaaag aaagtcattg actgtgtatg ggaagcgcca tccccgatag 480 cgggaataag agatggtaac cgagaggtaa ccatcgtaat 
agatacacca ataccatcat 540 tcgtatctat taaaggagaa aaggcgcatg tacaatacaa tggacagcag agaacatgta 600 agtactgtga actcaacatg catttaggca tgacctgtct ccagtacaaa aaaacggtga 660 cgaataacaa cgaggatgaa ggtacgtatg ccaatatggt acggaaggaa ggggtaaaga 720 gttcggtaca gaatagacta cagttagaga gagcagtgga gggattgaac tcaggggaaa 780 aaaaacaaga taacactttg ttcaaaaaac cggttaacac aataaaagag cgacgagtgt 840 atgagagcgt agaagaagct cctagcacta gcaaagctaa tgctcagaag cgcacaggaa 900 tagtggacag ccacactaca acagaaactg aggatgaaag ttgcatggaa acagaaaccg 960 aattaaaaaa actaaaagga agacccaaaa aaacaaaaac tcttacacag gaaacaaaac 1020 aagacaataa cacaaactaa taaactaaac aataaagaaa aattaaatga aaataaaaat 1080 ggcgacgata aatcaaaagg aaattagcct attacagact aatataacaa gtttaaaacg 1140 aaataaagaa gaattagaga ggatgatgaa cgcaaacaaa attactatcg cgtgcattac 1200 cgaaacatgg atgaaagaag aggaagttaa taaagtaaac gtatcaaatt acaatttagt 1260 taccagcaac agggaagatg gatatggagg atcagccata tacattagaa aaaatttagc 1320 atacaaagtt atcaaacatg acataaaaga taaagaagtt caaattacag aaataaaatt 1380 aaccagggaa aaaataaata tagtaagtgt atacatgtca cctcaattaa aaattaatag 1440 attcagagaa aatgttagca aattactaca caacagatca ggaaatcaaa aaaatataat 1500 aacaggagat ttaaactgtc atcacaaaat atgggggaac ccaacaaatg acagtaaagg 1560 gaacttttta ttaaatgaat taaatgactc aaatttatgc ttattacaca acaaggaaaa 1620 aacactaata cccaccgata gaagcaaaag agaaacagct gttgacctta ctatcatatc 1680 agaagatata atagataagt gccaaagaac agttacagaa catcatatgg gtgcaacaaa 1740 ccacaaaact ataataacaa aaataaatga aataatacag gaacacacag taacatacac 1800 aaataaatca aaattgttca aagaaataga agcaatcgag gtaaaagaaa ccacgacagc 1860 ataccaggta acaaaacaaa ttaaaaaatt aataaaatct aacaaaatca catcgaaaaa 1920 aaaaccaaaa atatactgga atatagaaac agaaaaagca tggaaaaaca aaaatgaact 1980 tagaaaaatt ttcaacaatg acaaaagcac ggaaaataaa atcaacatga ataaaggaat 2040 agctgaattt agaacgaaaa aaagacttaa taaaataaaa aaatttgagg acaagctaga 2100 ggaagtagat ttcaacagag atccaataaa actgtatgaa ttcttacaaa aaatggaagg 2160 gaaattcaaa agtaaaacaa aagcaggaaa cataatagaa actgacagga 
aagcagcaaa 2220 aattttctta gacaaacatt tttttaaaaa taaaaaagga aacaagaagc cttatagaaa 2280 ctttaaacct aacaaagata tcatagatat acataaatgg aatcgcttat taaaaaagaa 2340 aaaaaacaca gcaccaggtc acgatggagt atcatatgaa atactaaaga aaataaatga 2400 aaaaacaaga cttgtaatgg ttaatgaaat aaatgaaatg tggcacaaag gaattataaa 2460 aaaacattac aaaaaaatga aaataataac ttttccaaaa gcaaaccaag ataaaaacga 2520 tccactgaat tatagaggta tagctcttat accagcgatt atcaaaatgg ccaatagtgc 2580 agtgttggaa gaaatgaaca ggtattggaa tattaataaa gtaattccag gaacaacatt 2640 cggatttatc aaaaacaggt cgataaacaa ttgcacaaat tacataacga atcgaatagg 2700 tcaaaataaa agagaaggaa aagcaacagg gataattttc ttagacctaa cgaatgcata 2760 tgataaagta cggacagata tacttatctc taaattagaa aaatatcaaa caccaaaaga 2820 aattattaga tggataggag cttttttgaa caacagaact atagaattaa acacagaaga 2880 tacaataatc aataaactaa tatcagacgg acttcctcaa ggagatgtgc tatcgccaag 2940 cctatttaac atatatacta aagacataca tgacaaaata aataggttgg gaatacaaat 3000 aattcaatat gcggatgatt ttgcaattat tacacaaagt catagtcatc acatactaca 3060 aacaaaaatg caggagatca tgaaacagtt ttcagacata acagatgaat taaaactaga 3120 aattaacaca gaaaaaacta aattcatgat ttttgatacg aaaaataata atataaaaat 3180 aaggataaaa ggaaaatcaa tagagaaagt taataactac aaatacttag gattatggtt 3240 agataataaa ctaacattca aaaaacaaat agatgctatg aaaacaaaaa cacaaaaaag 3300 actaatgaca ataaaaagaa tttgtaatta tcgcaataaa ttaaatccta acaaaacaat 3360 acaaattcac agaagtttta ttagaagcaa cttagaaata ggaacagcat caataagaaa 3420 tgcaaaaaaa gaactactga aacagttaga tacaatacaa aatcaatctc taagaaaagt 3480 tacaggatgt accaaaacaa ctcctatacc ctcactaatg gccatagcgg cagaaattcc 3540 tttaggcata agagcaaaat acctcgcgag caatgagcaa gcaaaagaga ttgcacattc 3600 aaaaattcac agaaatctac tatgtgacag aaaccaaaat aacaacacca gaaacaacaa 3660 caacaagcac aaaacatatt acgaaaaatt gtatgaagaa aacaaaaata tcataaacaa 3720 aatggataat atagtaataa atgataacac taaaagagta tcgataatat ccaatgcccc 3780 ggataatata acgaaaaatc aacatacaaa caaacaaatc ctcaaaagac attaccaaca 3840 agaaatagag ttttttagaa aaacaaacac aattatttac acggatggaa gcagaacaga 
3900 tgacggttgt ggtataggta tatacatcaa gccacacaac aaaaatatgt actggagtta 3960 caaatataaa ttaaaaaaca acacacccat tacctcagta gaactgactg caatagataa 4020 ggcactacaa atagcagtgg aaaacaacat aataaatcct atactatata cggacagtaa 4080 agcggcttgc agcatcatca aaagggaaca atataacaac aatatagagg aaagtacata 4140 caatattatt aaaaacagtt taaaaataaa tgcacaaatc agatggttac cgggacatat 4200 cgatatcgaa ggtaatgaaa aggcggatgc cttagcaaaa aaaggagtga cggcagacaa 4260 cattatggaa aacaaattaa gaaaaagtga catcaaactt atgtttcaac aaagaatgat 4320 ttttgaaacg caggaatggt acacaacgga aatacaacat aaagggaaaa aattcgcaga 4380 ataccagaaa gaattcaaaa gagataactg gcataaaacg ataaatatca atgcaaatga 4440 aataaagact atgaatcgga tactaacagg acatgacttc tccaaatact ggctaaaaat 4500 acttaaaata gaagaaaatg atatatgcga tgtttgccaa gtacaggaaa cgggcaaaca 4560 ccaaatattt cactgtacaa aatacaacaa tgaaagacag aaatatccca acatccaaga 4620 agacctattt cgcaaacact ggaaaaataa tgatggaaaa ataataaagg aaataattag 4680 attcatacaa gacacaaaaa tctccctata atggaaaaaa accgcaagtt tgaacagcac 4740 acaaaaacat agataacatg atcatatgaa ctttgaaaga caaatcataa acacaacata 4800 gaaatgtcaa aaccatcatc aagtaaacaa cacaaataca acggtaccaa cattggagag 4860 gtatatgact caggcagtcg aaaacctaag agcggtacgt ctttgaatgt ggctttatta 4920 agcataggct ttataaaaca tagaagacta catcctgctc cacaaaagaa gaagaagaag 4980 aagaagaaa 4989 // ID BEL3-LTR_AG repbase; DNA; ANG; 350 BP. XX AC . XX DT 12-MAR-2003 (Rel. 8.02, Created) DT 12-MAR-2003 (Rel. 8.02, Last updated, Version 1) XX DE BEL3-LTR_AG is a long terminal repeat from the BEL3_AG LTR DE retrotransposon - a consensus. XX KW BEL; LTR Retrotransposon; Transposable Element; BEL3-I_AG; KW BEL3-LTR_AG; Bel/Pao; Long terminal repeat; RETRO935_AG_I; KW RETRO935_AG_LTR; retrotransposon. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-350 RA Jurka J. 
and Drazkiewicz A.; RT "RETRO935_AG_LTR: LTR retrotransposon from Anopheles gambiae."; RL Repbase Reports 2(12), 21-21 (2002). XX RN [2] RP 1-350 RA Kapitonov V.V. and Jurka J.; RT "BEL3_AG, a family of Bel/Pao-like LTR retrotransposons from RT African malaria mosquito."; RL Direct Submission to Repbase Update (19-FEB-2003). XX DR [1] (Consensus) XX CC 5 bp target site duplication [1]. This long terminal repeat CC is from the BEL3_AG LTR retrotransposon [2]. XX SQ Sequence 350 BP; 88 A; 72 C; 93 G; 97 T; 0 other; tgtcacttat ggtgacgacg cgatcgttcg aggatcgaca aggagcgtcc gtattttagc 60 tagcggccgc tagaggtctc ggactggacg ataaacgttc ggactgaacg aaagggatcg 120 accgagcgtt tgacggatag ccggtgaccg catggactgc gaaatagggc tttctttttg 180 atcccggacc tgcgtggaag cggacgaatt tttagttttt tcttcactct ctcggctact 240 aaaattttgt gctgttaaag gcaaaataaa ataatcctga attacttaaa aaagccgttt 300 gtttggtaaa ttatttatct gcaggccggg cgttgtgcta acgcgcaaca 350 // ID GYPSY6-LTR_AG repbase; DNA; ANG; 181 BP. XX AC . XX DT 08-MAY-2003 (Rel. 8.04, Created) DT 19-SEP-2005 (Rel. 10.1, Last updated, Version 2) XX DE GYPSY6-LTR_AG is an LTR of the GYPSY6_AG LTR retrotransposon - a DE consensus sequence. XX KW Gypsy; LTR Retrotransposon; Transposable Element; 4-bp TSD; KW GYPSY6-I_AG; GYPSY6-LTR_AG; GYPSY6_AG; Gypsy clade; KW GYPSY31-LTR_AG. XX NM GYPSY6-LTR_AG. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-181 RA Pavlicek A., Kapitonov V.V. and Jurka J.; RT "GYPSY6_AG, a family of LTR retrotransposons from African malaria RT mosquito."; RL Repbase Reports 3(4), 82-82 (2003). XX DR [1] (Consensus) XX CC GYPSY6-LTR is a long terminal repeat of GYPSY6_AG (its internal CC portion is deposited as GYPSY6-I_AG).
XX SQ Sequence 181 BP; 76 A; 33 C; 31 G; 41 T; 0 other; tgtaaacaat ttaaaattag gttacttata cttatctcgg gtgcaataat ctaatctaca 60 agaactgcaa taaaatagga atgacagcga gcagaacggc acttcctacg acacacgaag 120 aaacgataga aagccctcgt gtacgaaaaa agcaacaatt gtaaaaataa agctgtttac 180 a 181 // ID GYPSY7-LTR_AG repbase; DNA; ANG; 327 BP. XX AC . XX DT 16-JUN-2003 (Rel. 8.05, Created) DT 20-SEP-2005 (Rel. 10.1, Last updated, Version 2) XX DE GYPSY7-LTR_AG is an LTR of the GYPSY7_AG LTR retrotransposon - a DE consensus sequence. XX KW Gypsy; LTR Retrotransposon; Transposable Element; 4-bp TSD; KW GYPSY7-I_AG; GYPSY7-LTR_AG; GYPSY7_AG; Gypsy clade; Gypsy group; KW GYPSY46-I_AG; GYPSY46-LTR_AG. XX NM GYPSY7-LTR_AG. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-327 RA Kapitonov V.V. and Jurka J.; RT "GYPSY7_AG, a family of LTR retrotransposons from African malaria RT mosquito."; RL Repbase Reports 3(5), 88-88 (2003). XX DR [1] (Consensus) XX CC GYPSY7-LTR is a long terminal repeat of GYPSY7_AG (its internal CC portion is deposited as GYPSY7-I_AG). The A. gambiae genome harbors CC ~30 copies of GYPSY7-LTR_AG. XX SQ Sequence 327 BP; 101 A; 79 C; 78 G; 69 T; 0 other; agttagaccg acccgctaga aaccagactg caagctggca ttccacaccg gctacgagca 60 gacgaagatg taacgctaca ccggccacga gcggacaacg gcgacatgcg caagtaatgc 120 gacacgcaga ccgatcgtga gaacggacca atgcagccag caaccagcgg cctcgttaga 180 acattagctc gttaggttta gtcagtcgaa gtcgaagtct agaatcagcc agatatagtt 240 catagtttag ctttagtcag gagtaatcct gtttgtgtaa aataaaaatc tttttttatg 300 gccaaccggc ctagataaag attaact 327 // ID GYPSY10-LTR_AG repbase; DNA; ANG; 982 BP. XX AC . XX DT 03-OCT-2003 (Rel. 8.09, Created) DT 03-OCT-2003 (Rel. 8.09, Last updated, Version 1) XX DE GYPSY10-LTR_AG is an LTR of retrotransposon GYPSY10_AG - a DE consensus sequence.
XX KW Gypsy; LTR Retrotransposon; Transposable Element; KW 4-bp TSD GYPSY10_AG; GYPSY10-I_AG; GYPSY10-LTR_AG; Gypsy clade; KW mdg1 lineage. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-982 RA Tubio M.J., Costas C.J. and Naveira F.H.; RT "GYPSY10_AG, a member of the mdg1 lineage of the Ty3/gypsy group RT of LTR retrotransposons in Anopheles gambiae."; RL Repbase Reports 3(9), 163-163 (2003). XX DR [1] (Consensus) XX CC GYPSY10-LTR is a long terminal repeat of GYPSY10_AG CC (its internal portion is deposited as GYPSY10-I_AG). XX SQ Sequence 982 BP; 333 A; 255 C; 172 G; 222 T; 0 other; tgtagcagat ttactgctaa cagttcaacc aacctataga gccacctaag cagttataca 60 atcgttacta accaaaaacc ggatgacata acattaagag ttgaaccaac tcaaaaagcc 120 acctcaacat aatcgttgct aaccaaaacc atccaactaa aaacacattc aaattaacca 180 gcaatcaaat caggtgagac agaaaaccta ccatgcaata aatgcttagc acacacaaac 240 cgtaggacag gaagcgcgaa ccgacaactt ataaaaacat aacgcgacgg taacatgaaa 300 cgcgcctcgc aactcagaac ggctgcaaaa cgattgcgca acaataacac caagtgacaa 360 acaaacaacc ggacattgac tttatatttt gatgaacaaa cctcaaaaca ggaaggttag 420 caaccacata atgaaagtag gtcaatatta gtcgtaaaca taaaataagt gtaaaaagac 480 attgtgtcat cactgtttaa gaatgcgtta taaataaagc ttattcgaaa cagtgagaca 540 gttcactagt ttcagacacg tagatcactc gtgtttagtg tttctcccca ttagcagaag 600 ccgccgaagg ctctgcccaa ttggtaacca cgtttggtta tcagctgttc agtcggttga 660 ccgatcagca ataccccttc tacggaagaa gtgcaagttt gcaccttaat cggttaatac 720 tcccgcagcg ttttaacgcc gcctggaatc cgatccatcg cgcaagtaaa gttccgtagt 780 tccacccgtc cgttgcataa gtgtaacttg accatctctt cctaacgaag tgaacgaatg 840 atacaattca ttcgcaatta aaccaactac ggtacgttct cgcctcgata ccccgcctcg 900 ttctgcactc ttcctccaca ttccgcgcat cggcttaccg cacacggttc tggtaagcaa 960 agtgcgcgcc gaacctacca ca 982 // ID DNA-3_AG repbase; DNA; ANG; 369 BP. XX AC . XX DT 04-SEP-2010 (Rel. 
15.09, Created) DT 04-SEP-2010 (Rel. 15.09, Last updated, Version 1) XX DE Putative non-autonomous DNA transposon: consensus. XX KW DNA transposon; Transposable Element; DNA-3_AG. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-369 RA Jurka J.; RT "Non-autonomous DNA transposons from mosquito."; RL Repbase Reports 10(9), 1428-1428 (2010). XX DR [1] (Consensus) XX CC TA tsd. >97% identical to consensus. Likely Mariner/Tc1. XX SQ Sequence 369 BP; 114 A; 76 C; 69 G; 110 T; 0 other; tacaggcggt ccccgagata cacggtacct cttatacgcg gattcggaga tacgcggttt 60 tctaaatttg acaattcttt gagcaaattg tactgatttg acacatcaat tgcaaattgc 120 caaataattt ccgttttgat cgaatgttaa aaactatttc aaaaggttta aaacagttat 180 attcagtcag aatcatatca aataattcat aaagtgacta aaaccgcccc ctacttgcaa 240 aattacacga aaattagtga tattttagct ggaaatcacg agattcgact tacgcggaaa 300 ttcgagatac gcggtatttt gcggccgttt tcggtcccca ttaaccgtgt atctcgggga 360 ccgcctgta 369 // ID AgaP13MITE592 repbase; DNA; ANG; 592 BP. XX AC DQ301490; XX DT 22-AUG-2006 (Rel. 13.07, Created) DT 31-JUL-2008 (Rel. 13.07, Last updated, Version 1) XX DE Anopheles gambiae str. PEST clone AgaP13MITE592 P MITE, complete DE sequence. XX KW P; DNA transposon; Transposable Element; Nonautonomous; KW AgaP13MITE592. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-592 RA Quesneville H., Nouaud D. and Anxolabehere D.; RT "P elements and MITE relatives in the whole genome sequence of RT Anopheles gambiae."; RL BMC Genomics 7(1), 214-214 (2006). XX RN [2] RP 1-592 RA Quesneville H., Nouaud D. 
and Anxolabehere D.; RT "Direct Submission."; RL Direct Submission to Genbank (30-NOV-2004)Dynamique du Genome et RL Evolution, Institut Jacques Monod - CNRS - Universites Paris 6 RL Paris 7, 2 place Jussieu, Paris 75252, France. XX DR EMBL/GenBank/DDBJ; DQ301490; Positions 1 592. XX SQ Sequence 592 BP; 173 A; 114 C; 109 G; 196 T; 0 other; caaagtgagt tgatatatca ggtgggcata tccagtgcaa tttgacgtct aaaaacgaaa 60 gtaggaagaa aacctggttt gacagttaga ttcataacac actggatgct ttgatgattt 120 ttctatggac tcactcctaa ctctatccaa acttttcaca attttactcg atatttgaga 180 tcaatcggtg gtctgttgaa cagaaccaaa ggatatatgc gcaggcactc tgataaattc 240 ctcactttca gtataacatc atattcggtt cggttggtac actgttcctt gcttgctggc 300 agaaaggatt caataattgt tgcctatgta tcagatttgg ccaaaaaaat tgtttttttt 360 tgaatcttta gccttcaaat cttcgtttta atccaagaac tggctaccac gtaggttttt 420 ttggcttctt tcttcgtgct aagccagggt aaaaccacat acaatttcta cgagatttcg 480 tttaaacttc aaggcatact agattttgtt atggaccaaa ccgtcaaatt gtgtgtcgat 540 gtattggtga gaaagaacac cataagccca cctgatatat acactcactt tg 592 // ID Clu-39_AG repbase; DNA; ANG; 1080 BP. XX AC . XX DT 03-SEP-2010 (Rel. 15.09, Created) DT 03-SEP-2010 (Rel. 15.09, Last updated, Version -1) XX DE Putative non-autonomous DNA transposon: consensus. XX KW DNA transposon; Transposable Element; nonautonomous; Clu-39_AG. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-1080 RA Fernandez Medina R.D., Struchiner C.J. and Ribeiro J.M.C.; RT "Transposable elements from Anopheles gambiae."; RL Repbase Reports 10(9), 1442-1442 (2010). XX DR [1] (Consensus) XX CC TA TSD. 
XX SQ Sequence 1080 BP; 386 A; 189 C; 176 G; 329 T; 0 other; tagggtaact gtaccagttt tcggcagtgt acctattttc ggcagggtag ctaaaaacgt 60 gaaattctgc acaaatgcgg taaaaaattt cacaaaatac aattttgtag tgaaagtgta 120 cttatttgat attcacatac gaagtttcac accatttcat tacgtatttt ccaaaatatc 180 aaataaattg tgcttttttt cggcagcttt gctgtcagcc aatcgttcgg caggttttgt 240 gattaagaga atcagaacag taaaacaatg aaaaagtggt gtatattaaa gattttatgc 300 ttatttaatt tattctaact tgttcggaat tttaaaatca cgataaacag gttatttttc 360 actttaaatg ctgccgaaaa taggtacaca gtaaatgttc atttttgaca gctcaatgtt 420 gtgggctcct gccgaaactc ggaacaataa cacactttga atttggctga atattccttt 480 taaaattgat cagaacacta caatgcgtat gtcgacacat aatatgacct ttcttgtacc 540 aaatatcacc aattttacga caaacaatgc actaacttaa gagaaaaata aatatttgta 600 agccagttcg caccgtacgc tataaccatc aaactgtcaa tggctgctct gtttaaacgt 660 cgcatcaaaa tggctgtcaa aacgggccaa aataagcaac ataaaattaa taattaaagt 720 gctttttaac accaatctta ggaaagtaag agtgttaaat tgctttaaac atttttttgt 780 acatattcac attgatacaa acattttctg acccgtattc gcgctgccga aaataggaac 840 acagcctgcc gaaaatagga acaaactgcc gaaaatagga gcaaaatcaa tgtttgcatt 900 ttcacgaata tttatgaaaa agggctttga accggcaaat aaaaaatatt gtaacatact 960 atgatagttt atcaaccaga ataacaacac tttcaagaaa aataatgaaa aatattgatg 1020 tgtgagcaat ttttgtgaaa ctgctgcact agcctgccga aaactggtac agttacccta 1080 // ID BEL10-LTR_AG repbase; DNA; ANG; 551 BP. XX AC . XX DT 03-APR-2003 (Rel. 8.03, Created) DT 03-APR-2003 (Rel. 8.03, Last updated, Version 1) XX DE BEL10-LTR_AG is a long terminal repeat of the BEL10_AG LTR DE retrotransposon - a consensus sequence. XX KW BEL; LTR Retrotransposon; Transposable Element; 5-bp TSD; KW BEL10-I_AG; BEL10-LTR_AG; BEL10_AG; Bel clade; KW reverse transcriptase. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-551 RA Kapitonov V.V., Pavlicek A. 
and Jurka J.; RT "BEL10_AG, a nonautonomous family of Bel/Pao-like LTR RT retrotransposons from African malaria mosquito."; RL Repbase Reports 3(3), 28-28 (2003). XX DR [1] (Consensus) XX CC BEL10-LTR_AG flank an internal portion of BEL10_AG (deposited as CC BEL10-I_AG). XX SQ Sequence 551 BP; 163 A; 127 C; 156 G; 105 T; 0 other; tgttgcgtgc gaagtgcatg cagaataaaa aataaaaagt ggaaagacac aacatcggtg 60 aatttccacg aaacgccagc cgtacccccc acgatgaacg tcgctaccag cacgcgcagc 120 cccggtaccg tctacacaaa ggattgctcg atggattgat cgatgcggaa tttctggaat 180 cgccagtgca aacagatgtc gccgggagtg acaatggaaa ccgggagcgc gatgggagtt 240 ggggacaaac gggaatggga gtgggaggaa aggcgtagaa ttttgggtta gtatcgtgca 300 gaaaaattgt gaaaattttg gtatataaag gcggctcaag ctggagccaa atcagattcg 360 aacgataagc caaagtgtta agagcttcat ttcagaacat ccgaaataat ccgaatctca 420 aggaacgatc cccctttctg ttgcatagac tcccgaggca gcgaggagca ctagctgaaa 480 ggcttggaga attccccgtt gtagccgtag gagccggaac cttcgcccgg cggagcccct 540 cggccgcaac a 551 // ID CR1-2_AG repbase; DNA; ANG; 4665 BP. XX AC . XX DT 13-DEC-2002 (Rel. 7.11, Created) DT 19-MAY-2005 (Rel. 10.06, Last updated, Version 2) XX DE CR1-2_AG is a CR1-like non-LTR retrotransposon - a consensus DE sequence. XX KW CR1; Non-LTR Retrotransposon; Transposable Element; CR1 clade; KW CR1-2_AG; DNA/RNA-binding; PHD finger; endonuclease; KW reverse transcriptase. XX NM CR1-2_AG. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-4665 RA Kapitonov V.V. and Jurka J.; RT "CR1-2_AG, a family of CR1-like non-LTR retrotransposons from RT African malaria mosquito."; RL Repbase Reports 2(11), 2-2 (2002). XX DR [1] (Consensus) XX CC CR1-2_AG is a family of CR1-like non-LTR retrotransposons. CC The CR1-2_AG consensus sequence was reconstructed based on CC multiple alignment of ~100 copies identified in the CC sequenced portion of the genome. 
Given the ~2% divergence CC of these copies from the consensus sequence, transposition of CC CR1-2_AG occurred less than 1 million years ago. CC Integrations of CR1-2_AG have not produced target site CC duplications. CC The consensus sequence encodes two proteins: a 416-aa CC CR1-2_AG-ORF1p CC (positions 256-1503) and a 996-aa CR1-2_AG-ORF2p (positions CC 1559-4546). CR1-2_AG-ORF1p is a putative DNA/RNA binding protein, CC which includes the PHD domain. CR1-2_AG-ORF2p is composed of CC the AP endonuclease and reverse transcriptase domains. The 3' CC terminus CC is composed of the AATA microsatellite. XX FH Key Location/Qualifiers FT CDS 1559..4546 FT /product="CR1-2_AG-ORF2p" FT /translation="MEANPASNTHNPSDSLPPLITCSESTPGASRSLIPSI FT DRLNIYYQNVRGLRTKLDELRLSLSELDMDVLVLTETWLDGSIPSSLISED FT AYVIYRCDRNSLNSNRCRGGGVLIACSSVLNTSTLSLPFDSLESVWTIVKL FT QNLAIYIGAVYIPPDLRSSEVVLDDLHESVSFVAGKLKPNDLMVLLGDFNS FT PSLSWQPSASCVNQFIPTGVSRENVSLLDGMSVNGLLQLSGIKNIRGRQLD FT LLFANAAFLECCSPVIASPVPLVALDNHHPALETSVLLTHSRPASSQRIPT FT ARMFNFRKLDYQKLHRILADTDWSFIDADCDINQAVAAFTNVITSAFPSCC FT PLLKPAPNPKWSNRALRLLKSDKNRAQRAYRLNNTLHNLCVYKYAAKAYRL FT LNRHLYRRYVRRLQMRFTIDPGSFFRFANSRRGSASLPSTLFLDLSSATSN FT PDICNLFAKHFSSVFVDPSTFKVPLDVGLSYTPSDVISCNSVVVSESLVKS FT ALSKLKTSFSPGPDGIPACVLKKCGNTLTPILTRLFSRSLSVGIFPSQWKL FT AWLVPIYKKGDRTLASNYRGISIICACSKILESIVHLSVMPCVKNYISTEQ FT HGFMPNRSVSTNLMCFLSSLYHYLSSGKQVDTIYTDFKAAFDSIPLSLLVA FT KLRKLGFGGSILPWFNSYLENRSYAVKICGSFSECFLSSSGVPQGSVLSPL FT LFILFLNDCTSILPPNGFLLYADDVKIFLPVSSTADCLVLQSWLCKFSTWC FT ASNGLVLCPEKCSVLSFFRSSTSITHAYSVCDAPIPRASLSKDLGVFFDPS FT LSFKEHTDYVINKANKSLGYICRMSTEIRDPFCLKSLYCCWVRSVLEYACV FT IWSPVQLSLLQRIERIQRRFTRIVFRRSLGHHSIPLPSYDDRCTLLGLAKL FT EHRLSVAQASFVAGILLNTIDTPSLLSRLHLYAPCRTLRYRFRLQLPICRT FT RFARNEPFVRAMSSFNSTSDLFDFNISYPVYRSRLRSFSVP" FT CDS 256..1503 FT /product="CR1-2_AG-ORF1p" FT /translation="MASVICKKCEGAISNDPIPCFGLCEHYYHDKCIGLST FT PLLRDFKKSQNLFWACADCAQRLRAVDTLRFSHGLSRDAAYLLESLQSDFR FT DTSRSVQAASAGLRLELSSSLDCFRNEIALMKQESASSIRSVKDFIDSLTA FT 
SHSMERNYSQAPLLTTLDEVKHGIKELDLMHRELLTSFNSLMNKLNSHLAT FT HTTTSSAHHSAIPATHSTTTIPVAASKLTHQAVGENPSKRRLLDRSPDPSP FT TNTVTRAMLSSGTGLSCNNITTVPERPPRTWVFISRIAPDTPIEAIREMAC FT SNIGTDDILVYSLVRRDRDLSTLSYVSFKIGVPDSHRAIALAASTWPRGIS FT FKEFIDLNPRSVNVWRPTTAASHAPSAPVTRESDHHSSPPSINHTTQLRNA FT DFTISPDHGPMSLPYTQYFQQA" XX SQ Sequence 4665 BP; 1046 A; 1223 C; 888 G; 1508 T; 0 other; tgtcacttgt cacttgtcac tcatagcggc tggttgtgct ttctcactct gctatttcca 60 agttgatttt ttaccgcgat tttcttgttg attaaggctt gaactgtttt agttcggtta 120 atttgcactc gcgcgttcac gtatcacaaa tttgtgcggc acctgtgaat tgtacggttc 180 aacatataaa cattgcaatc cgcctgcggt tgttgtgacc actcactgtt cagtcgaact 240 ttgcgctatt ttgcgatggc gtctgtgatc tgtaagaaat gtgaaggtgc tattagcaac 300 gatccaattc cgtgcttcgg tctttgtgaa cactattacc acgataagtg cattggactt 360 tcaaccccgc tcctgcgtga ttttaagaag tcacaaaatt tattctgggc ttgcgcggat 420 tgtgctcagc gtttgcgggc cgttgacact ttacgcttct cgcacggcct ttctcgtgac 480 gctgcctacc tgttggaatc gttgcagtcc gatttccgcg atacctcacg ttctgtgcag 540 gcggcatcag ctggcttgcg acttgaactt tcctcttcac tggattgttt tagaaacgag 600 atagctttga tgaaacagga atcagcatcg tcaattcgtt ccgtgaaaga tttcatcgac 660 tcacttactg cttctcactc aatggaacgc aactactcac aggctccact actcacaacg 720 cttgatgaag ttaagcatgg catcaaggag cttgatctca tgcaccgtga gcttctcact 780 tccttcaact cactaatgaa caagctcaac tctcatcttg ccacgcatac cactacatcg 840 agtgctcacc attctgcgat tcctgctacg cactcaacga ccacgattcc agttgcggcc 900 tctaagctca cccatcaagc tgttggtgag aatccttcta aacgtcgatt gttggatcgc 960 tctcccgacc catcgcctac caataccgtt acacgcgcta tgctttcatc gggcacaggg 1020 ttgtcctgca ataatattac gaccgttcct gaacgcccac cccgtacttg ggtctttatc 1080 tcccgtattg ctcctgatac tccgattgaa gcgatccgcg aaatggcctg ttctaacata 1140 gggacagacg acatcttggt atacagcctt gtacgacgcg accgggatct ttctacgctc 1200 tcctacgtat cttttaaaat tggtgtaccg gattcgcacc gggctattgc tttggctgct 1260 tcaacctggc ctcgcgggat ctcttttaag gagttcatcg accttaatcc ccgctccgtc 1320 aatgtttggc gacccactac tgcagcatcc catgcacctt ctgcgcctgt tactcgtgaa 1380 tcggatcatc 
attcatcgcc accttctatc aaccacacga cacaactgcg aaacgctgat 1440 ttcactattt cgcccgatca tggtcctatg agcttgcctt atacgcagta ctttcagcaa 1500 gcgtaaaccg gctgatccgg ctgaagattt tcatgaacta ccgactttac ccgaattgat 1560 ggaagctaac cctgcatcta atacacacaa tccgtccgac tctcttccgc cgttaataac 1620 ttgcagcgag agcactcccg gcgcatctcg ctctctcatt ccttcgatcg atcgactaaa 1680 catctactat caaaacgtaa gaggattgcg cacaaaacta gacgagttac gcctctctct 1740 atctgagctt gatatggatg tgttggtgtt aactgaaaca tggctcgatg gctcaattcc 1800 gtcctctctt atctcggagg atgcgtatgt catctatcgc tgcgaccgaa attctctcaa 1860 cagtaaccgt tgccgtggtg gtggtgtact cattgcctgc tcttccgtgc tgaacacgtc 1920 aactttatca ctgccgttcg actcgctgga gtctgtttgg acaattgtta agcttcaaaa 1980 ccttgcaatc tatatcggcg ccgtttacat cccacccgat ttgcgttctt ccgaagtagt 2040 ccttgacgat ttacatgaga gtgttagttt cgttgctggc aaacttaaac caaatgatct 2100 tatggtgtta cttggtgatt ttaattcacc atcactctcc tggcaaccgt cggcatcgtg 2160 cgttaatcag ttcattccta ctggtgtatc tcgtgaaaat gtttctctgc ttgatggaat 2220 gtcggtaaac ggtttgctgc aattatccgg cataaaaaat atacgcggaa gacaattaga 2280 tcttctattt gccaatgctg ccttcttgga gtgctgctcc cccgtcatag cttcacctgt 2340 tccactcgtc gcgctagaca atcatcatcc tgctcttgag acaagcgtgc tccttacgca 2400 ctctcgccct gcatcctccc aacgaatacc tacggctcgt atgttcaatt ttaggaaatt 2460 ggactaccaa aagcttcatc gtatattagc cgataccgat tggtccttta ttgatgctga 2520 ctgtgacata aatcaagctg tagcagcgtt tactaatgta atcacttccg ccttcccttc 2580 ctgctgtcct ctccttaaac ccgctcctaa tcctaagtgg tcgaacagag ctttacgtct 2640 attaaaatcg gacaaaaacc gcgctcaacg cgcctaccgt ttaaataata ctttacacaa 2700 tctttgtgta tataaatatg cggctaaggc ttatcgtctt cttaatcgcc atctgtatcg 2760 tcgctacgtt cggcgtcttc aaatgcgctt cactattgat cctgggtcat tctttcgttt 2820 tgcgaactct cggcgaggct ctgctagcct tccatccacc ctgtttcttg atctgtcctc 2880 tgctacgtcc aatcctgata tatgtaacct gtttgccaaa catttctcta gtgtatttgt 2940 cgatcctagt acttttaagg ttcctttgga cgtaggcttg tcttacacgc cctctgatgt 3000 gatatcatgt aattcagtcg ttgtaagtga aagcctagta aaatccgctt tatccaaact 3060 taaaacttcg ttttcaccag 
gccccgatgg catccctgcc tgtgttttga aaaaatgtgg 3120 caataccctc actcctattc tcactcgtct tttttctcgc tcgcttagtg tagggatttt 3180 tcctagtcaa tggaaactag cttggcttgt tcccatttat aagaaaggtg accgcacact 3240 agcatcaaac tacagaggca tttctattat ttgtgcgtgt tctaaaatcc tcgagtcaat 3300 tgtccattta tctgtaatgc cttgtgtaaa aaattatatt tctacggaac aacacggctt 3360 tatgcccaat cgctcagtgt ccaccaacct aatgtgtttt ttgtcttctc tatatcacta 3420 tctgtctagt ggtaagcaag tagacactat ctataccgat ttcaaagcgg catttgacag 3480 tatacctcta tcattacttg ttgctaagct tcgaaaacta ggtttcggtg gctctatatt 3540 gccgtggttc aactcctatc ttgagaatcg ttcatatgca gttaaaatct gtggctcttt 3600 ctctgaatgt tttcttagtt cttcgggtgt ccctcaaggt agtgtcctta gtcccttact 3660 gttcattctc ttcctcaatg actgtacttc gatccttcct cctaacggct tcttgctata 3720 tgcggatgac gttaaaattt ttcttcctgt atcttctaca gctgattgtc tagtccttca 3780 atcctggctc tgtaaattct ctacatggtg tgcttctaac ggtttagtcc tgtgtcctga 3840 aaaatgttcc gtcttgtctt tcttccgatc ttctacaagt ataactcatg cttatagtgt 3900 ctgtgatgcc cctattcccc gtgcgtcctt gtctaaggat cttggcgtct tctttgaccc 3960 gagtctttct ttcaaggagc acacggacta tgtcatcaac aaggccaaca aaagtcttgg 4020 ttatatttgt cgcatgtcca ctgaaattcg tgatcccttt tgtcttaagt ccctttattg 4080 ttgttgggtc cgttccgtac tggaatatgc ctgtgtcatc tggtctcctg tccaactatc 4140 tctgctccaa aggattgaga ggatccagag acgttttaca aggatcgtat ttcgtaggtc 4200 gctgggtcat cactctattc cgcttccttc gtatgacgat agatgcactc tattaggcct 4260 tgccaagttg gagcaccgcc tctcggtcgc tcaggcttct ttcgtcgctg gcatattgct 4320 caatacgatt gatactcctt cacttctgtc gcgcttacat ttgtatgcac cttgtcgcac 4380 cttacgttat cgttttcgtc tccagttacc tatatgtcgt acacgttttg ctcgcaatga 4440 gccttttgta agagctatgt cgtcttttaa tagtacttct gatctgttcg atttcaacat 4500 atcctatcct gtctaccgat cccgtcttcg ctccttttcc gtaccataaa ctccttccaa 4560 ctccttcgtg aataattgta tactatagtc agtaagcact attgctgtaa cccaacgtgg 4620 ccgagcaatt aataaaataa aaataaataa ataaataaat aaata 4665 // ID GYPSY59-LTR_AG repbase; DNA; ANG; 288 BP. XX AC . XX DT 09-JUL-2004 (Rel. 9.06, Created) DT 09-JUL-2004 (Rel. 
9.06, Last updated, Version 1) XX DE GYPSY59-LTR_AG is an LTR of retrotransposon GYPSY59_AG - a DE consensus sequence. XX KW Gypsy; LTR Retrotransposon; Transposable Element; KW 5-bp TSD GYPSY59_AG; GYPSY59-I_AG; GYPSY59-LTR_AG; Gypsy clade; KW mag lineage. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-288 RA Tubio M.J., Costas C.J. and Naveira F.H.; RT "GYPSY59_AG, a member of the mag lineage of the Ty3/gypsy group RT of LTR retrotransposons in Anopheles gambiae."; RL Repbase Reports 4(6), 158-158 (2004). XX DR [1] (Consensus) XX CC GYPSY59-LTR is a long terminal repeat of GYPSY59_AG (its CC internal portion is deposited as GYPSY59-I_AG). XX SQ Sequence 288 BP; 67 A; 62 C; 78 G; 81 T; 0 other; tgttgggtgc gtacgtgcag ggtactcacg cgccctctct gcgagatgtt ttctctcgtc 60 ggttgatggg cacgagtgtc ctagcgagag attcgtcccg atccgcgatc ccgattgtta 120 ctcgttacac gatcgggatg cgagcggcag tcgggataca gctagcaacg aatgtggtgt 180 ggtgtcagta acaaggcgga aataaattaa gtttatttat tgtagtttat ctaatttacg 240 ttgtgtcgta aacacttatt cggccacata tcacacggca acgtaaaa 288 // ID TransibN2_AG repbase; DNA; ANG; 1750 BP. XX AC . XX DT 21-MAR-2005 (Rel. 10.03, Created) DT 01-JUL-2005 (Rel. 10.08, Last updated, Version 2) XX DE TransibN2_AG is a family of nonautonomous DNA transposons - a DE consensus sequence. XX KW Transib; DNA transposon; Transposable Element; Nonautonomous; KW TransibN2_AG. XX NM TransibN2_AG. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-1750 RA Kapitonov V.V. and Jurka J.; RT "RAG1 core and V(D)J recombination signal sequences were derived RT from Transib transposons."; RL PLoS Biol 3(6), (2005). 
XX DR [1] (Consensus) XX CC TransibN2_AG is a family of nonautonomous DNA transposons that CC belongs to the Transib superfamily. TransibN2_AG elements are CC characterized by 14-bp terminal inverted repeats (1 mismatch) and CC 5-bp target site duplications. This family was active less than 1 CC million years ago (some copies are less than 1% divergent from CC the consensus sequence). XX SQ Sequence 1750 BP; 633 A; 266 C; 261 G; 589 T; 1 other; cacaatgggc atttgccggg atgaaattca aaaaatcaac tttgtaatag cgcaattcca 60 aataataggt tttaatacta agggttaaac taagaaaaat accaaatatg agctctttat 120 ctgttctggt tctctagaaa acacctctca aagtcgagat ttgttaaaaa aacgcagaaa 180 aatcttcact tttttggaaa actttaaacc tttgtaactt tttcaaatat gaaccgattt 240 tgataaattt agacatttta taaaggatat ttagtcagct ttataaacac atgaaaaatt 300 ttgagctaag ttaattttat gcaaaaatat acgataataa ctgaagaaac ctcctgaaaa 360 ttttcatmat tttaattttt gacttgatgc cattataaaa gtggcgtaga gtggcgttgt 420 ctcatatatg tcacatcata gtgtaatata tatactatca atctgtgaag aaatattctg 480 cgaaagtttg cgaacgcgaa ttttattgaa gttttttcac atcaactttt tctaaaaaca 540 gacacatgcg taactaggcg gctccatagg tgcctagtgt agcgtatttc gtgtattcta 600 caaaaatata ttctacgtgc ttttgatcaa ctttcatttc aattacgaag ctgtgtatct 660 tagtttcagc tcatgtactt atcttatatg taaatttcta ttctaaccta actttatcat 720 taaatataat gaagcaaatc agaaccaaat accttttggt cctactgcag attatagaag 780 gatataagga aagtaattct taaattatac gcataacagg aaaacgttat gaatgcagta 840 gcatctattg gcgttttggg agaagaaact cccaaaactc atgacaaact aaatataagg 900 atagagggga gatgctacaa agaatatttt gcaagaaaga aaatgcatcc agaaggacat 960 tttgtataat gagctgtatt cctctgattc gttaaccagt tcaatatcag tagctacacg 1020 cactaaaaca aagacagggt ttaactttac aaatgaagta aatacttagt ttgtagctat 1080 cgtgcaattt tgattttttc agatagttgt aaaaccaaga aggattttta gttttattac 1140 cccttctgaa ggcagtacag cacttggaaa ggtattggat gaatattcaa agacaaacaa 1200 tatgtgctgc ttttcattac gaatttggga agtactttat ttcaatgcac ctgctctatc 1260 taattattta aatgccttta ttcatgtctt tagaaaagtg aactagacgt caggagcttg 1320 
taatagcatc cacctaagtt gtgtatgtaa atttttaata tttgtactac ctgtgagtaa 1380 aagtgacgag aattagctac aattgtccta tacaaagcag aaaaaattat cttattcaaa 1440 attattatgt tttgtttata tattattatt attattattt atttaatctt cattatggaa 1500 cattacagta tattttaaaa tggtttatac tattagcaac aaataataca agaaaaaaaa 1560 cacacaaaat taataaaact aagcaaagat cccttaaaac actattttat gtagcgccac 1620 ctggtggtat ggatgcgaac tacaaataaa atgttattta ctatcataat actttactac 1680 aaacgataaa aactaacaat ttctgcaaaa ataattttat cccgcaattt gttctaaact 1740 gcccattgtg 1750 // ID GYPSY33-I_AG repbase; DNA; ANG; 4577 BP. XX AC . XX DT 16-APR-2004 (Rel. 9.03, Created) DT 16-APR-2004 (Rel. 9.03, Last updated, Version 1) XX DE GYPSY33-I_AG is an internal portion of retrotransposon GYPSY33_AG DE - a consensus sequence. XX KW Gypsy; LTR Retrotransposon; Transposable Element; 4-bp TSD gag; KW AP protease; GYPSY33-I_AG; GYPSY33-LTR_AG; Gypsy clade; KW MDG3 lineage; RNase-H; reverse transcriptase; KW integrase GYPSY33_AG. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-4577 RA Tubio M.J., Costas C.J. and Naveira F.H.; RT "GYPSY33_AG, a member of the MDG3 lineage of the Ty3/gypsy group RT of LTR retrotransposons in Anopheles gambiae."; RL Repbase Reports 4(3), 58-58 (2004). XX DR [1] (Consensus) XX CC GYPSY33_AG is a family of Gypsy-like LTR retrotransposons CC that, according to the amino acid sequence of its Reverse CC Transcriptase, CC RNase and Integrase is CC phylogenetically grouped with representatives of the MDG3 CC lineage of other organisms. CC GYPSY29_AG, GYPSY30_AG, GYPSY31_AG, GYPSY32_AG, GYPSY34_AG, CC GYPSY35_AG, CC GYPSY36_AG, GYPSY37_AG and GYPSY38_AG are other CC members of this same lineage in Anopheles gambiae. CC The GYPSY33-I_AG consensus was reconstructed after multiple CC alignment of 5-7 copies. 
CC The consensus encodes the 1466-aa GYPSY33_AGp gag-pol like CC polyprotein CC (pos. 82-4479). CC The sequence of the LTRs flanking GYPSY33-I_AG is deposited as CC GYPSY33-LTR_AG. XX FH Key Location/Qualifiers FT CDS 82..4479 FT /product="GYPSY33_AGp" FT /translation="MLTKDQMLCALECANVPVSPNATIAQIRKMYETTFES FT VQNMNNPYDVPSVSGKPKEETALALQSETHVSENNATTMTHAENDVTAILR FT NEHNTATHDSTIAGNPQSDTPNLNDEIEMLKKKLEILELKQRITALEAPTV FT PFPGPQPPIFNFEEIVDKFSGDNNDHIERWFKELERAFLPYNMDNTMKLHY FT TRRLLTGTAAKFAKSLDFTAYHELKDSLIETFSETKSLENVYKQLRSRQLG FT RNESITRYVLDMQALAWGTPVPEEDLVNIIIDGINDPINTASIRFAARSLT FT DLKRLLKRYEQIRPQHIATPSSSGTIRQPSNVSKTSDNQQAPAVVRCYNCS FT QFGHYQNSCPMPRRPPGSCFKCHQVGHAARNCPIKIIIPSAAAHYKDQGNE FT DQGTTLDEFEQVSVAFQNQNDPSKVLTCVRSLLDTGSPVSFIDSSIVPKTL FT VKPPLTSRYQGLGNQMLVVCGEVSCKINLRKDEAVHTFIILPSDSIAWPMI FT IGRDLLKKFNIFLYKNKSKIELNKGPTNLNRETCMSSLAQSIPLPQDTLDL FT VPFKLERLCPAEYHDTQNEAFEEICAIDLSDDVSELHIGKHLTGHENSALL FT CSINQNYLNYPSEKIIPPDHSMKISLTHDTPLFTKPRRLSYGERNQVREIV FT NDLLEKQIIRPSNSPYASPLVLVRKKSGEIRMCIDYRPLNKITVRDNFPLP FT LIETCLEHLSNKRIFTLLDLKSGFHQVKMHDDSIKYTSFVTPDGQYEYVKM FT PFGLKNAPSEFQRFINNILREFIEADKLVVYLDDIIIASVDFNSHLSTLSA FT VLTKIRQNGLELRLDKCRFGQQELDYLGYKANSCGIRPSDRHIAVIKNYPI FT PTNTKQVRRCLGLFSYFRRFVPSFSHIAKPLTNLLQNEKRFDFDSTCENAF FT KSLREKLILSPVLAIFDPKRETELHCDASSSGFGAVLLQKQDDGRFHPIAY FT FSKSTTTDESKLHSYELETLSIIYALKRFHVYVHGIPVKIVTDCNSLVETL FT KNRNTSAKIARWSLFLENYEYSIQHRAGSSMNHVDALSRLESSCAVNEIDL FT DFQLQVTQARDSVIEEIKKNLELGPVAGFTLQDGLVYRVSPSKGLQLYVPR FT EMSENIIRHVHEKIGHLAVDKTYDKIGTHYWFPYMKSKVEHFIRNCLKCII FT YSAPTRINNKNLHSIPKEPVPFHTLHIDHLGPLPSIRSQKKYILVVIDAFT FT KFMKMYATRSTNAQEVCNILNQYMSYYSRPKRIITDRATCFTSNQFENFLE FT HNGIQHILNATSSPQANGQVERVNRVLRPMLSKLSDFHDHSDWSSQLRSAE FT YALNNTKHASTNFTPSILLFGIEQRSCEVDELEEFLDEKNILNVNRPLIEI FT RHKASENIKRSQEINENYFNKNHKPATKFQKGDFVVIRNVDTTTNTNKKLI FT PKYKGPYVIHKVLPSDRYVIRDIDGCQVTQMPYDGVLEANKLKKWIEPCLR FT V" XX SQ Sequence 4577 BP; 1509 A; 945 C; 848 G; 1275 T; 0 other; tctcagaagt gggatacgac aaattataga gtgaaaatta atagcgttgt caagcgtgtg 60 ttgtttagcg 
tgtcggtcag aatgttgaca aaagaccaaa tgctgtgcgc cctggaatgc 120 gctaatgttc ctgtgtcgcc gaatgccacc attgcgcaaa tacgtaagat gtacgaaact 180 accttcgaaa gcgtacaaaa tatgaacaat ccatatgatg ttccatcggt tagtggaaag 240 cctaaggaag aaactgcact tgctctacaa agcgagacac atgttagtga gaacaatgcc 300 accaccatga cgcatgctga gaacgacgta accgccattt tgcgcaatga acacaatacc 360 gctacgcatg attcaactat agctggaaat ccgcaaagcg acacacccaa tttaaacgat 420 gagattgaaa tgctaaaaaa aaagcttgaa attcttgagc tcaaacaacg aattacagcc 480 ctagaagctc caaccgttcc tttcccgggg ccacaaccac cgatattcaa ttttgaagaa 540 attgtcgata aatttagcgg tgacaataat gatcacatcg agcgttggtt caaggaactg 600 gagcgtgcgt ttctcccata caatatggat aatacaatga agctccacta tactcgtcga 660 ttgctcactg gtacggctgc aaaatttgca aaatcactag attttacagc ctatcatgaa 720 ttgaaagata gtttgataga aactttcagc gaaaccaaat cgcttgaaaa tgtatacaag 780 cagctcagat cacgacaact tggaagaaac gagtccatca cacggtacgt tctagatatg 840 caagccttag catggggaac tccagtacca gaagaagatc tggtaaatat catcatcgat 900 ggaatcaatg atcctatcaa cacagcatct attcgattcg ctgctcgttc tttaactgat 960 ttgaagcgac tattgaaacg gtacgagcaa atccgccctc aacacattgc aacaccatca 1020 tcatcaggga ccatccgcca accaagtaat gtgtcaaaaa catcggataa ccagcaggca 1080 cccgctgtag tgagatgcta caactgttct caatttggac actatcagaa ttcctgccct 1140 atgccacgtc gaccaccagg atcgtgcttt aagtgtcatc aggttgggca tgctgctcga 1200 aactgcccaa ttaaaataat catcccatcg gctgccgcgc attataaaga tcagggtaat 1260 gaagatcagg gtacaacttt agacgaattt gaacaggtga gtgttgcctt ccaaaatcaa 1320 aatgacccga gcaaagtgtt gacatgcgtg cgttctctcc ttgatacagg aagcccagtg 1380 agctttatcg actcttctat agttccaaaa actttggtta aaccacctct gacatcaaga 1440 tatcagggtc tcggaaacca gatgcttgtt gtctgtggag aggtgtcttg caaaataaat 1500 ctgcgaaagg atgaggcagt acatacattt atcatcctcc ctagcgatag tattgcctgg 1560 cctatgatta ttggtcgtga cctattaaaa aaattcaata tctttctgta taaaaacaaa 1620 tcaaaaattg aacttaataa aggacctacc aatcttaaca gagaaacatg catgtcatca 1680 ttggctcaga gtattccgtt gcctcaagac acattagatt tagtaccatt caagttagaa 1740 agactatgcc ctgcagaata tcacgacacc 
caaaatgagg catttgagga aatatgtgcg 1800 attgatttat ctgatgatgt ttcagaactc catataggta aacatttaac tggccatgaa 1860 aattcagctc ttttatgttc tataaatcaa aattatttaa attatccatc tgaaaaaatt 1920 attccacccg atcacagcat gaaaataagt cttactcatg acacaccttt gttcacaaaa 1980 ccccgtcgac tatcatatgg agagcgaaat caagtacgtg aaatagttaa tgacttatta 2040 gaaaaacaaa taattcgtcc aagtaattcg ccgtatgctt cgcctctagt tttggtcaga 2100 aagaagagtg gggaaattcg catgtgtatt gactataggc ctctcaataa aatcactgtt 2160 cgagataatt ttccgctgcc gttaatagaa acatgtttgg aacatttaag taataagcga 2220 atttttacat tactagattt aaaaagtggc tttcatcaag tcaaaatgca cgatgattcg 2280 attaaatata cctcgtttgt tacccctgat ggtcaatacg agtatgtcaa aatgccattc 2340 ggattaaaaa acgctccttc tgaattccaa cgatttatca ataacatatt acgtgaattt 2400 attgaggcag acaaattagt agtttattta gacgatatca tcattgcttc agttgacttc 2460 aattctcatt tgagcactct tagtgctgtc cttacgaaaa tacgccaaaa tgggttagag 2520 ctccgtttag ataaatgtag gtttggtcaa caagaattag attatctggg gtataaagca 2580 aattcttgtg ggatacgtcc cagtgacagg cacatagcgg ttattaaaaa ttatccgata 2640 ccgacaaaca ccaaacaggt acgcagatgt ctagggcttt tctcatattt ccgtcgtttc 2700 gtcccttcat tttcacacat agcgaaacca ttaactaatt tgttacaaaa tgaaaagcgg 2760 tttgatttcg attctacgtg tgaaaatgca ttcaaatcct tacgagaaaa actcatttta 2820 tctcccgtat tggcgatatt tgatccaaaa cgtgagacag aactgcactg tgatgccagt 2880 tcttctggtt tcggtgcggt tttgttacaa aaacaagatg atggcagatt ccatcccata 2940 gcatattttt ctaagagcac aaccacggat gaatctaagc tccatagtta tgaactagaa 3000 acattgtcga tcatatatgc tttgaagcgc tttcatgttt atgttcatgg aatacccgta 3060 aaaattgtta cagactgtaa ctctctcgtg gaaaccctca aaaaccgtaa cacctcagcc 3120 aaaattgctc gatggtcgct tttccttgaa aattacgaat attccattca acatcgtgcg 3180 ggttcctcta tgaatcatgt tgacgcttta agcagattgg aatcaagttg cgctgtcaat 3240 gaaattgatt tggactttca actacaagta actcaggccc gagattctgt cattgaggaa 3300 ataaaaaaaa acttagagtt ggggcctgtt gcaggtttta cattgcagga cgggttagta 3360 tatcgcgtat caccttccaa gggactacaa ctatacgttc cacgtgaaat gtcagaaaac 3420 atcattcgtc atgttcatga aaaaataggt catctagctg 
tcgacaagac ttatgacaaa 3480 attggtacac attattggtt tccgtatatg aagtcaaaag ttgaacattt cattcgaaat 3540 tgtttaaaat gcataattta ttcggctcct actcgtatta ataacaaaaa cctacatagc 3600 atcccaaaag aaccagtacc gttccataca cttcatatag accatttggg tcccttacct 3660 tcaattagat cacaaaagaa atacatatta gtagtcatcg atgccttcac taaatttatg 3720 aaaatgtatg caactcgctc aacaaatgcc caagaggtct gtaatatcct caaccaatac 3780 atgtcatact atagccgccc aaagcgaatc attaccgatc gtgctacgtg tttcacttct 3840 aaccaatttg aaaatttcct cgaacataat ggtattcaac acatcttaaa tgccacaagt 3900 tccccccaag ctaatgggca ggttgaaaga gtcaatcgcg ttcttcgtcc tatgcttagt 3960 aagctctcgg actttcatga tcatagcgat tggagttctc aattacgatc agcagaatat 4020 gctttgaaca acactaaaca tgcgtctaca aactttactc cttctattct tctttttggc 4080 atcgaacagc gaagttgcga agtggatgaa ttggaggaat ttctggatga aaagaatatt 4140 ctaaatgtaa atcgaccact gatcgagatc cgtcataaag catcagaaaa tataaaacga 4200 tcacaggaga taaacgaaaa ttacttcaat aaaaatcata aacctgctac caaattccaa 4260 aaaggtgatt ttgtagttat acgtaacgtt gacactacca cgaacactaa caaaaaactg 4320 attccaaaat ataaagggcc atatgtcatt cataaagttt taccaagcga tcggtacgtt 4380 atcagagata tagatggttg tcaggtaact caaatgccat acgatggtgt gttagaagcg 4440 aataaactaa agaaatggat agaaccatgc ttgcgtgttt aggctaatga tgattaacgg 4500 caacatttac cttagattag tttaagaact aggctaggct tgaattgagg gcaattaaat 4560 gttcaggatg gccgagc 4577 // ID GYPSY47-I_AG repbase; DNA; ANG; 6625 BP. XX AC . XX DT 16-APR-2004 (Rel. 9.03, Created) DT 16-APR-2004 (Rel. 9.03, Last updated, Version 1) XX DE GYPSY47-I_AG is an internal portion of retrotransposon GYPSY47_AG DE - a consensus sequence. XX KW Gypsy; LTR Retrotransposon; Transposable Element; 4-bp TSD gag; KW AP protease; GYPSY lineage; GYPSY47-I_AG; GYPSY47-LTR_AG; KW Gypsy clade; RNase-H; reverse transcriptase; KW integrase GYPSY47_AG. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. 
XX RN [1] RP 1-6625 RA Tubio M.J., Costas C.J. and Naveira F.H.; RT "GYPSY47_AG, a member of the Gypsy lineage of the Ty3/gypsy group RT of LTR retrotransposons in Anopheles gambiae."; RL Repbase Reports 4(3), 86-86 (2004). XX DR [1] (Consensus) XX CC GYPSY47_AG is a family of Gypsy-like LTR retrotransposons CC that, according to the amino acid sequence of its Reverse CC Transcriptase, CC RNase and Integrase is CC phylogenetically grouped with representatives of the GYPSY CC lineage of other organisms. CC GYPSY39_AG, GYPSY40_AG, GYPSY41_AG, GYPSY42_AG, GYPSY43_AG, CC GYPSY44_AG, CC GYPSY45_AG and GYPSY46_AG are other CC members of this same lineage in Anopheles gambiae. CC The GYPSY47-I_AG consensus was reconstructed after multiple CC alignment of 6 copies. CC The consensus encodes the 429-aa GYPSY47_AG1p gag-like CC polyprotein (pos. 640-1926), the 1064-aa GYPSY47_AG2p CC pol-like polyprotein (pos. 1866-5057) and the 503-aa GYPSY47_AG3p CC env-like polyprotein (pos. 5061-6569). CC The sequence of the LTRs flanking GYPSY47-I_AG is deposited as CC GYPSY47-LTR_AG. 
XX FH Key Location/Qualifiers FT CDS 640..1926 FT /product="GYPSY47_AG1p" FT /translation="MSQQEVVDRILSRLAALEAQQQSIPGVIDYSDPPLHF FT TKTDGTAVSPESFDKIPDLVKDLPVFSGEPSELNSWIDDVDGIVKLYQTNS FT TNTVEQQNRFHMVCKFIRRKIRGEANDSLVASNVGINWNLIKKTLITYYGE FT KRDLETLDFQIMSVQQKGRSLEVYYDEVNRLLSLIANQIHTDDRFSHPEAS FT KALIETYNRKAIDSFIRGLDGDVYKFIRNYEPTSLAAAYSYCISFQNIECR FT KMLTTPKSYIPPSAPRNLIPIVPRIPTKPNMYRPPTTPNHPNSPIYYRPPT FT YLTPQNRPNQNFYNPHLRIRPPVIPQRNPFKPPVAQNPPQPEPMEVDSSIR FT SKQVNYGNRPNFSGTYKPPLKRTRAFNIETNTGNEQVEYEEEMAKTLEECD FT DDVLSAHDRYLACINRQDGSETSENGEEAELNFLE" FT CDS 1866..5057 FT /product="GYPSY47_AG2p" FT /translation="QTRWIRNIRKWRRSRIEFFRMSSTLPYFLYYGNAKEP FT LRILVDTGSNKNYIHPKYAKISHDLEKPFFISTVAGDVKITKYSQARLLKP FT YSDKMIKFFHLEQLKSFDAILGWDSLKENGSWINTVQDTLIVNGKYTIPLL FT LYKLQEVNQINIRDEHLQTQEKNDLRNMLHDFQDIFQPPDKKLPFTTKVKA FT EIRTIDQSPVYSKTYPYPQALKSEVHSQINKLLNDGIIRTSRSPYNSPVWI FT VPKKLDASNEKKYRLVIDYRKLNSKTISDRYPIPDTASVLANLGSNKYFTT FT LDLASGFHRIPMHERDIEKTAFSINNGKYEFVRMPFGLKNAPSIFQRVMDD FT TLREHIGKICYVYIDDIVVFGKTVEEHLKNLKTVLETLREANFKIQPDKSE FT FLKSEVEFLGFIISAEGLKPNMKKVECIRKYPEPKTLKDLRAFLGLSGYYR FT RFVKNYAELAKPLTKLLRGEDGHRQIPKNQSKNFTITLDESAKKSFQYLKE FT VLCSNDVLAFPDFEKPFILTTDASNIALGAVLSQQTSDGERPITFISRTLS FT RTEENYATNEKEMLAVVWALQTLRNFIYGAKLKIFTDHLPLTFTLSPKNNN FT AKLKRWKSFLEEHDYELCYKPGKANIVADALSRVQINSLTPTQHSAEDDDN FT DYILSTEAPINVFRNQLIFRLTPNSSYELTIPFQGYKRHTFCESEFSSDFI FT KQTFKKFLNPKVRNGIYCDQSIMGNIQEIFKSMYNSKVIKIRYSQLKVEDL FT QTEEQQLEKIKEIHNFAHRNVKENSIQFIKKYYFPGMRKSIQNFIQNCEIC FT KLEKYDRKPQKFIPIKTPIPTYPGEIVHIDIFAYNANNLFITSLDKFSKYL FT KIRPIKSKSIADIKDVLLQLLYDWNLPAEIVIDNESSFVSNVVEQSILNLG FT VKIFKTPVNRSETNGQVERCHSTIREIARCIKSLNPDMSITCLVQQATYKY FT NNSIHSFIKDTPRNIYIGEFSNDHSFAERSKIREEKDRKIAQLFKEKEEKI FT VDQEYQTYEPGSIAYEKNKTNNKRESRYYPIKIKENHSTYIIDSNNRKVHK FT VNLRKTN" FT CDS 5061..6569 FT /product="GYPSY47_AG3p" FT /translation="QKNLLTFNIFSLFRLAICVSFTIVYANIGIHDITNNP FT LAVIPLGQARVKTGNIRIIHPIKIDHLEQVLINNDAELKHHVNDNPLYNII FT KLKVNKLHETFNKIRPHTNRERRWDTIGTIWKWIAGTPDAEDLRVINSTIN FT 
SLIEQNNQQVLINDEIDKRLYDITRIANSVLQLEEERFHQRSNEIIQLIII FT SNLDTLQNYLETLEDAILLAKHGIPSSKLLSLDDLNQMVMFLAEHNIHVSS FT TEEMLIRSTAQVTMNKTHIIYMLKYPFESKHTFEYNYIDSIIKNDKRILLH FT HNYILKNNTHVFESSQPCEQIDNGNYLCDSILLEPSSKCIQKLVHGYHSDC FT TYEKVYSDGIIKRIHDAVILINNATVRISSSCNNDTQFLTGSYIIHFEKCN FT IYINGEEFPNLEITIPNKAFQPTLGLIATETDVIDIPPSKYLNNLTIEHRE FT ILKKIKLENKSLSWKVNLFGSISGSIVFIVFIVIAVLINIYRYQSLRFILP FT SASRETT" XX SQ Sequence 6625 BP; 2624 A; 1216 C; 994 G; 1791 T; 0 other; ggcgcccgaa tagggacccg atattgtgaa gtttttataa aattgttttg tgaagtgact 60 acgtaagagt aatgttgcaa cagaacagtt tgttaaaaac gagtgttacc tatcgccccg 120 caaaaaagtg acatttttca gtggcagcag ttcgtgtgat accgtagtat caacaagtgg 180 ttaaaagctc atcttgttga aaaaaggagc aattatctac gaatcgagaa gtaccgcaac 240 attatcgggt caacgccacg actgaatgac ttccaccgta agaggacagg agcaacagtc 300 gtcataagaa tttcagaaac atcgccgtaa gaggatagat actgattttc ttcgcagaga 360 attccagacc atcaccgtaa gaggcacgag gcggaatttt cataatcatt ttgaggtatc 420 ccctaaaatg aaacttatgt aagtaaaagt gaaataagcg tgagtatata aaatcaaaag 480 caaagtgaat cgaaaatata aagtgaaagt aataagagtt tccaaattga ttttcaaaaa 540 ttttcaaaaa aaaaaaacat aaaattaata aataatcaaa atcaatttat aataatcctg 600 gacaaattaa tagaatcact aaataatttt tcattcaaaa tgtctcaaca agaagtagtg 660 gatagaattc taagcagact agctgcttta gaagcacaac aacaatcgat tcctggagta 720 atagattact cggacccccc attacatttt acgaaaactg acggaacagc cgtaagccct 780 gaatcattcg ataaaatacc tgatcttgta aaagatttgc ctgttttctc tggtgaacca 840 agtgagttaa atagctggat tgatgatgtt gatggtatag taaaactata ccaaaccaac 900 agtaccaata ccgtggaaca acaaaacaga ttccacatgg tttgtaaatt cattcgaaga 960 aaaattagag gtgaggctaa cgattcctta gtggcctcaa atgttggaat aaattggaat 1020 ttaatcaaga aaaccttaat aacatattac ggagagaaac gtgatcttga aacacttgac 1080 ttccagataa tgagcgtaca acaaaagggt cggtcgttag aagtgtatta tgacgaagtc 1140 aacagactat tgtcattaat agcaaaccaa attcatactg atgatcgatt ctctcatcca 1200 gaagcatcta aagctctcat tgaaacttac aataggaaag ccattgattc cttcattaga 1260 ggacttgatg gagatgtgta caaattcata agaaattatg aacccacatc attggccgct 
1320 gcttacagtt attgtatttc attccaaaac atcgagtgtc ggaaaatgct taccacacca 1380 aaatcataca taccaccttc cgcacccagg aatcttatcc caatagtacc aagaatacca 1440 acaaaaccaa atatgtacag accaccaacc actccaaacc accccaattc accaatttac 1500 tacagacctc ctacctatct aactccgcaa aatagaccaa atcaaaactt ttacaaccca 1560 catttacgca ttagaccacc tgtaattcca cagagaaatc cgtttaaacc accagtcgca 1620 caaaacccgc cacaaccaga acctatggaa gttgatagtt ccatccgttc taaacaggta 1680 aattacggta atcgtcctaa tttttcagga acatataaac caccattaaa aagaacaaga 1740 gcatttaata tagaaacaaa cacaggaaac gagcaagtag aatacgaaga ggaaatggca 1800 aaaacattag aagaatgtga cgatgatgtg ctctccgccc atgatcgata cttggcatgc 1860 attaacagac aagatggatc agaaacatcc gaaaatggag aagaagccga attgaatttt 1920 ttagaatgag ttccacatta ccatatttcc tttactatgg taatgcaaaa gaaccattac 1980 gaatcttagt agacacagga tctaacaaaa actatataca tccaaaatat gcaaaaattt 2040 ctcacgactt agaaaaacca ttttttatct caactgtagc aggagacgtc aaaataacca 2100 aatattcaca agctagatta cttaaaccat actcggacaa gatgataaaa ttttttcatc 2160 ttgaacaact gaaatcgttt gatgctattt tgggttggga ttctttaaag gaaaatggtt 2220 catggataaa tacagtacaa gacacattaa tagtaaatgg aaaatatact attccattat 2280 tgctatacaa gcttcaagaa gttaatcaaa taaatattcg cgatgaacat ttacaaactc 2340 aagaaaaaaa tgatttaaga aatatgctcc acgattttca agatatattt cagcctccag 2400 ataaaaaact accttttaca acaaaggtta aagctgaaat aagaactatc gatcaatcac 2460 cagtttacag taaaacatac ccatatccac aagcactaaa atcggaagtg cattcccaga 2520 taaacaaact tttaaacgat ggaattatac gaacatctcg atctccatac aattcacctg 2580 tatggattgt ccctaaaaaa cttgatgctt ctaacgaaaa aaaatataga cttgtcatag 2640 attacaggaa attaaactca aaaaccataa gtgatagata ccctattcca gatactgcat 2700 ccgttttggc caatttaggt tcaaataaat attttactac attagactta gcttcaggat 2760 ttcaccgaat tcctatgcat gaaagagaca tagaaaaaac tgcgttttca attaacaatg 2820 gcaaatacga atttgtacgt atgccttttg gtttaaaaaa tgcaccatcg atttttcaaa 2880 gagtaatgga tgacacatta cgtgaacata ttggaaaaat ttgctatgtt tacatagatg 2940 acatagtagt ctttggaaaa actgttgaag aacatttaaa aaatttaaaa actgtgttag 3000 
aaacacttag ggaagctaat tttaaaatac aaccagataa atcagaattt cttaaatcag 3060 aagttgaatt tcttggcttc attatttctg cagaaggtct taaaccaaac atgaaaaaag 3120 tagaatgcat ccgaaaatat cccgaaccga agacattaaa agatttaaga gcatttctag 3180 gactctccgg atattataga cgttttgtaa agaattacgc agaacttgcg aaacccctta 3240 cgaaactttt aagaggagag gatggccatc gccaaattcc caaaaaccag tcaaaaaatt 3300 tcaccataac tttagatgaa tcagcaaaaa aatcattcca ataccttaaa gaagtgctat 3360 gttcaaatga tgttttagct tttccggatt ttgaaaaacc cttcattctt actacagatg 3420 catcaaatat agcattggga gcagtacttt cgcaacaaac ctcagatgga gagagaccaa 3480 ttacatttat ttccaggact ctttccagaa cggaagaaaa ttacgcgact aacgaaaaag 3540 aaatgctagc agtagtatgg gcattacaaa ctctaagaaa ttttatctac ggagcaaaac 3600 ttaaaatttt taccgatcac ctcccactaa catttacact ttccccaaaa aataataatg 3660 caaaacttaa aagatggaaa tcatttttag aagaacatga ttatgaatta tgttataaac 3720 caggtaaagc aaacattgtt gcagatgctt tatcacgtgt tcaaattaac tccttaacac 3780 ctactcaaca ctcagcagaa gatgatgata acgattatat tctttcaaca gaggcaccca 3840 ttaatgtttt ccgaaatcaa cttatattcc gattaacccc taactcctcc tatgaactaa 3900 caataccatt tcaagggtat aaaagacaca ccttctgtga atctgaattt tcaagtgact 3960 tcattaaaca aacttttaaa aaattcctca atcctaaagt acgtaatggt atatattgcg 4020 atcaatccat aatgggaaac attcaagaaa ttttcaaatc aatgtataat tcaaaagtaa 4080 taaaaattcg ttattctcaa ttgaaagtag aagatttgca aacagaggaa caacaattag 4140 aaaaaattaa agaaatacat aactttgcac atagaaatgt taaagaaaat tctatacaat 4200 tcataaagaa atattatttc cctggaatgc gtaaatccat tcaaaacttc attcaaaatt 4260 gtgaaatttg caaactagaa aaatatgata gaaaaccaca gaagttcatt ccaattaaaa 4320 ccccaatacc tacctatcca ggagaaatag ttcacataga tatttttgcc tataatgcaa 4380 ataatctttt tattacatcc ttggacaaat tttctaaata tttaaaaatc agacctataa 4440 aatcaaaatc aatagccgat attaaagatg tcttgttaca gcttttatat gactggaatc 4500 ttccagcaga aatagtaatt gataatgagt cttcttttgt atcaaatgtt gtagaacaat 4560 caatattgaa tttaggcgtg aaaatattta aaactccggt taataggtca gaaacaaatg 4620 gtcaagtaga aagatgtcat tcaactatac gagaaatagc cagatgcatt aaatctctta 4680 atccagatat 
gagtatcact tgccttgtac aacaagccac ttacaaatac aacaactcta 4740 tacatagttt cattaaagat acaccgagaa acatttatat tggagaattt tcaaatgacc 4800 attcatttgc agaaagatca aaaatacgtg aagaaaaaga taggaaaatc gctcaacttt 4860 tcaaagaaaa agaagagaaa atagtcgatc aggaatatca gacttatgaa ccaggaagca 4920 ttgcatatga aaaaaataaa acaaataaca aaagagagag tagatactat cccataaaaa 4980 ttaaagaaaa tcattcaact tatataatcg attctaataa cagaaaggta cacaaagtta 5040 accttaggaa aaccaactaa caaaaaaatt tactaacatt taacattttt tctcttttca 5100 gactcgccat ttgcgtgtca tttacaattg tttatgctaa cataggaata catgatataa 5160 ctaacaaccc cttagcagta attccattag gtcaagcacg agtcaagaca ggaaacataa 5220 gaatcataca cccaatcaag atagatcatt tagaacaagt tctcataaac aatgatgcag 5280 aattaaaaca ccatgtgaac gacaatcctc tttacaatat tataaaatta aaggttaata 5340 aattgcatga aacatttaac aagattaggc cacacacgaa cagagaacga cgatgggata 5400 ccattggcac catatggaag tggatcgcag gaacacctga tgctgaagat ttacgtgtta 5460 taaactccac tattaactct ttgattgagc aaaataatca acaagtactg attaacgacg 5520 aaattgataa acgcttgtat gatataacta gaattgcaaa cagcgtactc caactggaag 5580 aagaaagatt tcatcaaaga tcgaatgaaa tcatccaact aattatcatc tcaaatctcg 5640 atactctgca aaactattta gaaacactgg aagatgctat tttactagca aaacatggaa 5700 tacccagcag caaactatta tctcttgacg acctgaacca aatggttatg tttttagctg 5760 agcacaacat acatgtatca tctaccgaag aaatgttgat aagatccaca gctcaggtaa 5820 caatgaacaa aacacatatc atatatatgt tgaaataccc atttgaatcc aaacatactt 5880 ttgaatacaa ttatattgat tcaattataa aaaatgacaa aagaattctc ttgcatcaca 5940 actatatact gaagaacaac actcatgtat ttgaatcgtc acagccctgc gaacaaatag 6000 ataatggcaa ttatctttgt gacagcatac tcctagaacc atcaagtaag tgtattcaaa 6060 aactagtaca tggatatcat tccgattgta cttatgaaaa agtatattcc gatggaataa 6120 ttaaacggat tcatgatgcc gttatactta ttaacaatgc cacagtcaga atttcatcga 6180 gttgcaacaa tgacacacaa tttctcacag gctcatatat aatacatttt gaaaaatgca 6240 acatctacat caacggagaa gaatttccaa acctggaaat aacaatacca aacaaagcat 6300 ttcaaccaac actggggtta atagcaaccg aaaccgacgt catcgacata ccaccatcaa 6360 aatatctaaa caatctaacc 
attgagcata gagaaatatt gaaaaaaatc aaactagaga 6420 acaagtcttt gtcatggaaa gttaatttgt ttggatcaat aagtggttca attgtattca 6480 ttgtgtttat cgtaatagcc gtattgataa atatttacag atatcaaagc ctgaggttca 6540 tattaccatc tgcatcaaga gaaacaacat aaagtaaacg agagcatcta gaaacttcga 6600 ggacaaagtc atttaagagg agagg 6625 // ID AgaP8 repbase; DNA; ANG; 7508 BP. XX AC DQ301497; XX DT 22-AUG-2006 (Rel. 13.08, Created) DT 12-SEP-2008 (Rel. 13.1, Last updated, Version 2) XX DE Anopheles gambiae str. PEST clone AgaP8 transposon P-like, DE complete sequence. XX KW P; DNA transposon; Transposable Element; AgaP8. XX NM AgaP8. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-7508 RA Quesneville H., Nouaud D. and Anxolabehere D.; RT "P elements and MITE relatives in the whole genome sequence of RT Anopheles gambiae."; RL BMC Genomics 7(1), 214-214 (2006). XX RN [2] RP 1-7508 RA Quesneville H., Nouaud D. and Anxolabehere D.; RT "Direct Submission."; RL Direct Submission to Genbank (30-NOV-2004)Dynamique du Genome et RL Evolution, Institut Jacques Monod - CNRS - Universites Paris 6 RL Paris 7, 2 place Jussieu, Paris 75252, France. XX DR EMBL/GenBank/DDBJ; DQ301497; Positions 1 7508. 
XX SQ Sequence 7508 BP; 1988 A; 1499 C; 1429 G; 2011 T; 581 other; caaggttatt agactctata cagattatga caaaagccct ttttggccga tctccaataa 60 aaaaatgaat gaaaaaattt tgacagacaa attggaaagg tttgtttaca atttgaacgg 120 tttgtttaca atcagtttgc aatcggtcag tcgcaaagcc atcgttattt ccagagtgta 180 tttcaacaac ctttatcgtt ttcaacatat tataattaat tatttattat gtcttcaaac 240 tataagtgct ctgtagcctc ttgtaaaaac aataggtgca atgtgaagaa gatgggagct 300 aaaatttact tccataaatt cccagagtgt ttagcaacga agcaaaaatg gattgtattt 360 tgtggaaagg ataatcctta gatgccttct ccaaataatg ttgtatgctc ggaacatttt 420 gtaccatctg attatcaatt aagaaacgtg caggaaatta agcaggggtc taactggtta 480 aaacctcaag gtaagattct ataaattctg atgattattc tatttacgat ttttttaact 540 ttttgttgtg tttacagcca ttccgtcagt tttatcgccg ttagaaaatg atagtgatat 600 tgcattattg cactttgaca acaataccaa aaaccaaaac acaaacgaag ggagaatcgg 660 ttgacccctg ccggaaggta agttgttacc aaattatgtg ttagtattac atactttcat 720 ctatactaat atgcaaatca atataatttc agctggttct tcgttttgtg atgaacatat 780 cattgaacaa agtgcaggca actttactat tgaaataaat gaaaaagatg tgtggttcaa 840 gtgtaaagac tatgaagaga aaatatgtga tctagaatct aaacttaaat cagtagaaga 900 caggaacgag cggttagctg atgtaaataa tcagctaagc gaaaaattga aaatatttta 960 tggaaaagaa aaagctcata ttaagcaaat acaaacactt gaaaaagcaa ataacaaaat 1020 taaggaagaa tggccacgaa acaaatttat agaaaatata aaagaggcac tgaaaggcat 1080 tttatcaagc aaccaaattg atctcatttt agaaatcaag aaaacagtaa gatggtcaat 1140 gcaagaactt tctgcagcat ttacattaaa atatttcagt caaagggctt acaaatatat 1200 aattttttat ttaaaaattc ctctgccatc cataagaaca ttacaacgtt atgcaagcag 1260 gattgacctt aaacaaggca ttctagatga tattttatca ttcattgggt cctttgctta 1320 aacattgagt actatggata gagaatgtgt attgccgttc gatgagatga aggtgtctcg 1380 tatattggaa tacgatccat cggcggatga agtggtaggt ccttttgact tttgacaaat 1440 agtaatgatg cgcgggcttt ttaaacaatg gaaacaacct atttttattg cttttggcca 1500 aaaaatcacc aaagatattt tgattgacat aataacaagg ttcagtgata agatgataaa 1560 tgttgtagcg atagttagcg ataattgcca agcaaacata aaatgttgga aggagttagg 1620 tgctagggat gacatcgaaa aaccatattt 
tttacatcct aaaacacaaa acaatgtata 1680 tgtagtccca gatactccgc atcttctaaa attgttgaag aactggttat tggatcatgg 1740 gtttgaacat aacggcaaac acatagaaac cagcaacctt ttacgtatgg tagctcaaag 1800 aatggagtcc gaaatgactc ctttcccaac cgacaatttc aaatattttc gacgtcgggt 1860 gtacgtcgtg tgaacgttca cccgacgtcg ggtagtcgtc gcagttaagt caatatacga 1920 cgaagcgaaa ttttttgaaa cgaaaaacgc cacaattata gcagattttc tggagtattt 1980 cgctcgtggt ttttgtctag gacaacggcc ataaaaaggt ggttcaaaaa gtcgataccc 2040 tctctccccc tcactcttcc tctatgggta ccgtacgatc catctcgcta ttctcacatg 2100 gtagcgggag tgatatcgct tatctgctgt cttctttatc cgttctccca gatcatattc 2160 gttttgagtg acgttgcttc cccacctccc tcattcccac ctttcccttc cagcagttat 2220 tggcgaatgc ttgaccttgc cttgtgtgct tgacttggtt tttttttttc cttcccgttt 2280 ttttttggct gcttgatgca ttgcgcgttg gctagagccg tcgttcgaat ctggatttga 2340 tcagcgtgca agttgtcgtg tggatgtatg aatagcttta ccgttgaagc tatttgccgg 2400 gtcgatcgga tgcatgctgc gtacgggttt tatattgttt gttagtgcac ttagtgtcag 2460 ttagttgcac ttgttaaaaa aatcaatcaa aagacgatca tcaatgcact agcacacaca 2520 acatatcatg tgaagccaac caatgcttaa cattgtcaaa tattcaaatg agggatgatg 2580 tctttgaaca catttttctc ttcatttttt tattgttttt attcaacaat gcacaattct 2640 tccaattctc ccccccctct cctctccttc cttcccacac aatcgggaat acaggtaagg 2700 atttaacaac catattgtac cgtctttttc ctaattaacg ggggctcctc tccggaagag 2760 aaaaaggcac gtctcccctg cctctcgctc aatgctgtaa gcgagcaggg cccgcctcgg 2820 cttaggtggc gcacgcttct ctcgctcgcg ttgcttagct tgcatcttct ctcgcttttt 2880 ttggaaacgc atcatcgttg cggatggtgc tgcgcggcca tatcgctatc tagagggtgg 2940 catagatagt tgtgttctag cccgtcgctg cactcgccat tgacgcagca tgcgtaaggg 3000 acgtttatgc tgcatgcatt gtcgatagcg ccgcggctgc ggtcggggat tggctgtggc 3060 gaccggcccc ccgggagcct taaatgccgg tggtgcagtc ggagccacgc gccaagctgc 3120 tccacgtgtc cccgcattct atggatggct tagctggttt cgggccgccc gttatcgtgc 3180 tacccgaaca caacataaca acatacccgt ccaaggttca agcccggtat ggtccagtta 3240 gtaacccgga aaaccagtta gtaacaggta aagtcaactc gcacgaatta acaacatgct 3300 cgttttcggt tcaatcctcg tatggactat ctctcgtagc 
aagggctaac tatccgaaat 3360 ttctttccaa agtgtggctc gaaggaagaa catcgaccta caaaaaatgc atgttgctcc 3420 cttcctgtat gtcctacccg ttcctgtatg ctcccttcag ctcgcttcgc acttgtccct 3480 cattgtttag aacttatcgt cgctccggaa caatgcgcgc tgctgcacgc aacccacgcc 3540 ttcatctttt catctcacct cttcgattat cacccttccc gctcctcctt ccgctcttgt 3600 ttgccccctt tttccttact cccttccccc ttctttcctt tttattcgtg ttgctgttta 3660 ataaggctag catagtttag gatgtaaatg taaactaaat gtaaaatgta aactgtattt 3720 gccgtgctta aggcaaacaa tagtaaggtg cgggcagggg cggttcaacc agtacgcaat 3780 ctaagcggcc gcgtggggcc ccgaggagag gaacccatcc ccccactcgg aagtaaaagt 3840 gcttcgcaat ccttctctgg tttagaattc ctcatacaac ttctcgctcg acacatccga 3900 atggtcaact ttcttgtgtc cggttagggc ctccactcgt gctaatacgc cactggatgc 3960 gagttgacca accgggtcgg aaccatccgg atcatttggg aggtgcgatt gctttccaaa 4020 tcactgctcc cacgagtcgg aatggtacct gccgaactac ctggcgcccc gccgactaca 4080 tatcagcggt ccttggaacc acgtagtcga accacagccc gtggccaata agaccccaca 4140 gctcccgtat aaattttatt cttcttcttt gacgtaacga acaatgctgt ggacagggct 4200 aaaacttctt caccttacta aaatcaagct ataggatagt cggttcttgc tatcggggga 4260 atggtccgga tgagaatcga tctcgtatgg cctcatagga tccggtctct tagtgatgat 4320 gacgcagtgg atcatcaatg agtgacgatt aaaaacattc tgccgtttgg tttaaagggt 4380 atacaatatg cggcatttta ttgttgctct tccgcataaa aagtttacca ccatgtgtcc 4440 caaatttaca cacatcataa tactgtgttt gtggttgtgc taatcacaaa acgctttctg 4500 tatagcgtgg cacataaacg tgcacgaggt ggtacatcat ttggtgtgcc agagccattg 4560 cttctaacaa ttttccttta actaactaac ggatcatgag gatactgatc gttcgcagcc 4620 aatgccacac acgggaaaaa caaggagaaa gagaattccc tcacgtggga gaaagagcag 4680 caacaaccaa accgtagagc cgatgggatt gtgagaatgc gacgggagat catcgtctgt 4740 gagcgagagt gaaacagaag tcagaaccat aaacttccca tccataatct gtgaccataa 4800 tttgggtgtg tgtcttcgct ccatacaaat cgtcaaaaaa acattttcgc ttggtcggaa 4860 atcgcggcga aaaccgcctt ttgcggatag ggtttgttca agcttacaat ggcacacatc 4920 tatatgaccc cccaacagcg tcaaaatgtt cgtcgagctg cagaattgct gtctcgcact 4980 accgctgtag cttttcgaac ttattaccct tacaatgaaa atgcaaaagt 
tttagctgat 5040 tttattgaaa aggttgattt atggtttagt gtgtcaaatt cttgacgact aaacgacgag 5100 cgagtgacga tcaaattggc gccttgagag atatgtttga aactgtatca actatgacca 5160 taccaggaaa gtcaaatttg caagtattcc aacgatccat tataatgcaa atcacgtctt 5220 tgcaaatggt ttttgcagat aggaaaaaaa aacagtcttt taggccgtcc acaaggacag 5280 aggaggcgtg gtaggcccaa attgaggtgg caagatggcg tggaggcgtc cgccattaag 5340 gccgggataa cggactggca gacgaaggcg cgagaccgtg agcggtttcg gacactcctg 5400 aggcaggcca agaccgcaaa gcggttgtag cgccggataa gtaagtaagt aagaaaaaaa 5460 aacatgatat acaattcatc tgcacacata aagtaatcaa tatttaagta tttagagttt 5520 tgtatttaat aaagtttatt tttttatttt tcagctaaac caagacgtat tagagaattt 5580 attttcgcaa ataaggcaaa tgggaggagn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 5640 nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 5700 nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 5760 nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 5820 nnnnnnnnnn nnnnnnnnna cttttcgggt caacccgaac gactccgggt cgcttacgga 5880 tggctccgac ccgataggtt cgggtcgacc cgcccatcac tattaacgat gcatttattt 5940 tgaatggtta ctcgcaacat ttttataatt tagcttcatg caaatagtaa aattagctat 6000 gttcaagtag actggaccaa gtgtaacaag acatcatgta ccgggtttct cttagcatgt 6060 ttgatgtgag atgtgtaacc agaccaacca atcctcttgt tgtcctacaa aaaagaataa 6120 aatattaatt ccataatcac aattgaatct gcacacacgc gcagagctcg catatcaaaa 6180 tataatatca atcaacaatg tgaagttagt ggcggcttca acgataatcg gactagttac 6240 cgcttaaggc acccggcacc ctaaatcctt tactgtgcga agcagcatat aacacttctt 6300 ggtcgcttcg tatagcatgt ttatttcgtg cctcatgagc taagcgcgat tttttagcca 6360 aatgattgaa accaacaagc atctgattga gcggtcagca cggttacatc aatacgtcta 6420 gatgacggat tgacttgtaa acatagggtt gtaatgcaca cctgataccc acctgattcg 6480 ccatcaagcg ctgaactggc cagagtcagt tatattatca gtcacctaat gctgagcgca 6540 ccgtaggatc cactctctgg tctgcaaatc ggtccatcaa ccgacctgct gatgtcattg 6600 tatgattagt ccatcgtccg atcagccggt tcactacccg cccggtccgg taaccatagt 6660 ggctttccgc acttgctcct gcactcagtc gttgctcctc agataaccat gttatgttat 
6720 agaattgacg ttaatcgttc gatcgtgggc cattcattga aatcctacat tcactttgta 6780 gaaccagcat acgcgtttga aaactttgcc cgctgcatta tttctcacag tcgccttagc 6840 agctgattgt tgtgaaagaa gaattacggc aaagctgtaa ggatcgacca tataccgggt 6900 ctctatgatt ggctatgctg agctaaaata taaaattatn nnnnnnnnnn nnnnnnnnnn 6960 nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 7020 nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 7080 nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 7140 nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 7200 nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 7260 nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn ttatattaac tgtgggatat gttttagttt 7320 attgaaaaag tacgcaaaag ttttgtcttc cgttaactta cctctgttgt tgatgctaag 7380 catcaactga ttggcagcga ttttgacaga caaattgtaa gatattcaaa aagatattca 7440 ccaagatatt caaaatggtg agctgttcaa gagctctgtc gtaacctgta tagagtctaa 7500 taaccttg 7508 // ID GYPSY43-LTR_AG repbase; DNA; ANG; 196 BP. XX AC . XX DT 16-APR-2004 (Rel. 9.03, Created) DT 16-APR-2004 (Rel. 9.03, Last updated, Version 1) XX DE GYPSY43-LTR_AG is an LTR of retrotransposon GYPSY43_AG - a DE consensus sequence. XX KW Gypsy; LTR Retrotransposon; Transposable Element; KW 4-bp TSD GYPSY43_AG; GYPSY lineage; GYPSY43-I_AG; GYPSY43-LTR_AG; KW Gypsy clade. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-196 RA Tubio M.J., Costas C.J. and Naveira F.H.; RT "GYPSY43_AG, a member of the Gypsy lineage of the Ty3/gypsy group RT of LTR retrotransposons in Anopheles gambiae."; RL Repbase Reports 4(3), 79-79 (2004). XX DR [1] (Consensus) XX CC GYPSY43-LTR is a long terminal repeat of GYPSY43_AG (its internal CC portion is deposited as GYPSY43-I_AG). 
 XX SQ Sequence 196 BP; 70 A; 39 C; 28 G; 59 T; 0 other; agtaatagat agtgtgttta ttattatatg tttacatata aaacgttcaa tcacacctca 60 ccatcatata aaaaccttat tcaactgacg tgttgcgctg tcagtttgat gataaacgca 120 ggctcctcaa taaagtcatt attattccga tcgttaaaga gaaaggacac aaacacaaca 180 cccacgactt gtaatt 196 // ID GYPSY5-I_AG repbase; DNA; ANG; 4539 BP. XX AC . XX DT 08-MAY-2003 (Rel. 8.04, Created) DT 08-MAY-2003 (Rel. 8.04, Last updated, Version 1) XX DE GYPSY5-I_AG is an internal portion of the GYPSY5_AG LTR DE retrotransposon - a consensus sequence. XX KW Gypsy; LTR Retrotransposon; Transposable Element; 4-bp TSD; KW AP protease; GYPSY5-I_AG; GYPSY5-LTR_AG; GYPSY5_AG; Gypsy clade; KW gag; integrase; reverse transcriptase. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-4539 RA Pavlicek A., Kapitonov V.V. and Jurka J.; RT "GYPSY5_AG, a family of LTR retrotransposons from African malaria RT mosquito."; RL Repbase Reports 3(4), 79-79 (2003). XX DR [1] (Consensus) XX CC GYPSY5_AG is a family of autonomous Gypsy-like LTR CC retrotransposons. CC GYPSY5-I_AG, an internal portion of GYPSY5_AG, is flanked by CC GYPSY5-LTR_AG LTRs. The GYPSY5-I_AG consensus sequence was CC reconstructed based on multiple alignment of 15 copies; they are CC less than 1% divergent from the consensus sequence. Two elements CC are CC 100% identical to the consensus and contain an intact ORF, thus the CC family CC appears to be still active. CC The consensus sequence encodes the 1469-aa GYPSY5_AGp protein CC (pos. 35-4441), composed of gag (zinc-finger, 309-357), protease CC (403-492), reverse transcriptase (pos. 668-837) and integrase CC (pos. 1181-1332) domains. 
XX FH Key Location/Qualifiers FT CDS 35..4441 FT /product="GYPSY5_AGp" FT /translation="MLTKEELLCALEVANIEVPPKATLPQLRMLYEQSVPK FT NKMEEQSTQNFIPQRVCADEDEVTNNGNHVAAAAILMKDKAAPTSSTQGAS FT FDATSHQLELMALRAKIMEMEQRQTFTDGRLVHPEELKHLIPEFSDGLGIN FT KWINTIRYNSELYGWQDRTMLLYAGSRLTGAASEWYNGFRNTLKTFDEFAD FT TIKKAFPDRCNEAVIHSQLASVYKKISESYTSYVYRVNALGMSGHVSEEAI FT ITYVIRGLSRDPLYDSLVTKDYRDIYDLIDNIKRYESHLLLRKNPERRSPS FT HINTISPRPIPPRQTTTEPLRCYNCSNHGHHSSQCTQPRRAPGSCFRCGST FT SHVIRNCPVPDRRQLTVAAVQGNDNETAHLDSGENGNFVQLEAYQEVSVAF FT KRNNVWSPGLIITSLFDSGSSKSFINEAIVPVTKLSAPQPSGFRGIGNVNL FT QTLGTVQLKLSFRNQTFIHNFYILPKSYMSLSMIVGRDLLSEFNITLAQFR FT KHYSKLMLMNLNKDKILNLKKPGFYHKLQTLGLLRSSIQAPLPEVCKDSKL FT SDPNISSKFISISDKSEHKQQYDTTFSEMCSINISDEASTINVGEHLSKEQ FT GIALRSIVSNNYINFPDKHIIPSAHKMRISLTHDTPIFTKPRRLSFDERNK FT VKVIVKDLLEKNIIRPSNSPYASALVLVRKKNGEIRKCVDYRPLNKVTIRD FT NYPLPLIETCLEHLGNKKFFTLLDLKSGFHQVAMDEDSIKFTAFVTPDGQY FT EYTRMPFGLKNAPAEFQRFINTILRKFIENEKLVVYIDDILIASQDFKEHL FT EIVSEVLHTLRNNGLELRLDKCKFAYDELDYLGYKANHSGICPSDNHVKII FT KNYPVPQNTKQVQQCLGLFSFFRRFVPHFSSIAKPLTNLLKNNVPFIFDDE FT CKKAFETLRDKLIVAPVLAIYDPKRETELHCDASSIAFGSVLLQKQDDGRY FT HPISYFSKTTSADEAKLHSYELETLAVIYALKRFHTYVHGIPIKIVTDCNS FT LVETLKNRNSSAKIARWSLFLENYNYIIQHRPGLAMSHVDALSRLEHLAAF FT DDVDIDFQIRVAQARDPLIQTLKKELETTDVEGYQLQDGIVFKRSPSNRLK FT LYVPTEMVNNLIRSIHEQIGHLGAEKCCNQIDQNYWFPNRKTRIINFINNC FT LKCIIHSAPSRVNNRNLHSIQKEPYPFDTLHIDHFGPLPTSSLKKKYLLVV FT IDAFTKFVKLYPTTSTSTKEVCNALKQYFSYYSRPKRIISDRGTCFTSAAF FT SNFLSSRGISHVLNATGSPQANGQVERVNRIIRPILSKLSNQTDHVDWVTH FT LLSTEYALNNTIHSSTRFSPSMLLFGVNQRGPSVDILTEYLEDKNKAFSDL FT ETIRAEASFNILKSQQNNEKQHAKHHRPAPVFNEGEFVVIKNVDNTPNSNK FT KLIARYKGPYVIHKRLPNDRYVIRDIDGIQMTQIPYDGVLESDKLRRWIVP FT LGGA" XX SQ Sequence 4539 BP; 1491 A; 929 C; 859 G; 1260 T; 0 other; attcagaagt gggataaacc gtttgtgccg aaaaatgctt acaaaagagg aacttctttg 60 tgctttggaa gttgctaaca tcgaagtgcc tccgaaagca actttaccac aactacgcat 120 gttgtacgag caaagtgtac cgaaaaacaa aatggaggaa caatcaaccc agaatttcat 180 tcctcaaaga gtgtgtgcag atgaagacga agtgacgaac 
aatggaaacc acgttgcagc 240 agccgccatc ttgatgaaag acaaagctgc tccgacatct tcgacgcaag gtgcttcatt 300 cgacgcaact tctcatcaat tggaattaat ggcgctacga gcaaaaatta tggaaatgga 360 acagcggcag acgtttacgg atggtcgatt ggttcatccc gaggagttga aacatttgat 420 accggaattt tctgacggcc tcggtatcaa taagtggatc aatacgattc gctacaatag 480 cgaattatat ggatggcagg atcgcacgat gcttttatat gcaggcagtc ggttgactgg 540 agcagcaagc gaatggtaca atggcttccg taacactttg aagacattcg acgaattcgc 600 tgacacaatt aaaaaggctt ttcctgatcg ctgtaacgaa gccgttattc atagccaatt 660 agcatcggtc tacaagaaaa tttctgagtc gtacacaagc tatgtatacc gagtaaatgc 720 gctgggaatg tcaggccacg tgagtgagga ggctatcata acttatgtca tcagaggact 780 ttctcgtgac cctctctatg atagccttgt gaccaaggat taccgcgata tttacgacct 840 gattgataac attaagcgat atgaatccca tcttctgttg cgcaaaaacc cagaacgccg 900 cagcccatct cacatcaaca ccatttcccc gagaccgatt ccaccaagac aaacgacgac 960 agaacctctt cgatgttata actgctcgaa tcatggacat cattcatcgc aatgcacaca 1020 acctcgccga gctccgggtt cctgtttccg atgtggtagc acatcacatg tcattcgcaa 1080 ctgtcctgtc ccagatagac gtcaactaac ggttgctgcg gtacagggca acgacaatga 1140 aacagcgcat ctagattctg gggaaaatgg aaacttcgtt caacttgagg cctatcagga 1200 ggtaagtgta gcttttaaga gaaacaatgt ttggagccca gggttaatta taacatctct 1260 ttttgattcc ggtagctcta aaagcttcat aaatgaagcc attgttccgg ttacaaaact 1320 aagcgctccc caaccaagtg gatttcgagg aataggaaat gtgaatttac aaactcttgg 1380 aacagtacag ctaaaactta gttttcgtaa tcaaacattc attcacaatt tctatatttt 1440 accaaaaagc tacatgtccc tatccatgat agtagggaga gatttgttat cagaatttaa 1500 catcacactc gctcaattcc gtaaacatta cagcaaactt atgctgatga atttaaataa 1560 agataaaatt ctgaacttaa aaaaaccggg tttttaccat aagctccaaa cgttgggtct 1620 tttacgtagt tcgatccagg ctcccttacc agaggtatgc aaagattcga aattgtccga 1680 tcccaatatt tcttcaaaat tcatttcgat atcagacaaa tctgaacata aacaacaata 1740 tgacacaact ttttctgaaa tgtgttctat aaatattagc gatgaggcta gcactataaa 1800 tgttggagaa catctttcta aagaacaagg cattgcttta agatcaatag tgtccaataa 1860 ttacattaat tttccggata aacatataat accatctgct cataaaatgc gaattagctt 
1920 aactcatgat actcctattt tcacaaagcc tagacgactt tctttcgatg aaaggaataa 1980 agtaaaagtc attgtgaagg atttattaga gaaaaatata ataagaccca gtaactctcc 2040 ttatgcttcc gcactggtac tcgttaggaa gaaaaatggc gaaattcgta agtgtgtcga 2100 ctaccgacct cttaataaag taacaatccg cgacaactac ccgcttccac tcatagagac 2160 gtgcctagaa cacttgggca ataaaaaatt ctttacgtta ctagacctaa agagtggttt 2220 tcatcaagtg gcaatggacg aggattcaat taaattcaca gcgtttgtta ctccagacgg 2280 ccagtacgag tacacgcgta tgcccttcgg attaaaaaac gcgccagctg aatttcagcg 2340 tttcattaat acaattttgc ggaaattcat cgagaatgaa aagctggttg tatacattga 2400 cgacattctc atagcttctc aagactttaa agaacatctt gaaatagtta gtgaagtttt 2460 gcatactcta cgaaataatg ggttggagct acggcttgat aaatgtaagt ttgcttacga 2520 tgaattggat tacttaggat acaaggccaa tcattcaggt atatgtccta gcgataatca 2580 tgtaaaaatt atcaaaaact atcctgttcc tcagaacaca aaacaggtac aacaatgttt 2640 aggtcttttt tcgtttttcc ggaggtttgt tccacatttc tctagtattg caaaaccact 2700 tactaatctt ttgaaaaaca atgttccatt tatttttgat gatgagtgta aaaaagcatt 2760 tgaaacactc cgagataaat tgatagtagc tccggtgctt gccatatacg accctaaacg 2820 cgaaactgaa cttcactgtg acgcaagttc gatagctttt ggttctgtgc ttttgcaaaa 2880 acaggatgat ggtagatatc accctatttc gtatttttcc aaaaccactt cggctgatga 2940 agctaaattg cacagctacg agcttgaaac attggccgtt atatatgcgc tcaaacgttt 3000 ccatacctat gtacacggca ttccaataaa gatagtgact gattgtaatt ctctggtaga 3060 aacgcttaaa aatagaaatt cgtccgcaaa gatagcacgt tggtcactgt tcctggagaa 3120 ttataattac attattcaac atcgccctgg attagcaatg agtcatgtag acgcattaag 3180 ccgactggaa catttggctg ccttcgatga tgttgatatt gacttccaaa tacgtgtagc 3240 tcaggctagg gatccgctta tccaaactct gaaaaaagag ttagagacga cggatgtcga 3300 aggttaccaa ttacaagacg gtattgtatt caaacgatca ccttctaata ggctaaaatt 3360 atacgtgcca acagaaatgg ttaataatct gatacgatca atacacgaac aaataggtca 3420 tttgggagcc gaaaaatgtt gcaaccaaat agatcagaat tattggtttc ctaacagaaa 3480 aacacggata ataaacttta taaataactg cttgaagtgc attattcact ctgctccatc 3540 tagagtaaat aatcggaacc ttcatagcat ccaaaaagaa ccttacccat ttgatacctt 3600 
gcacattgac cattttggac cgctacctac atcatcttta aagaaaaaat acttattggt 3660 agtaatagac gcatttacga aattcgttaa attgtatcca acaacttcca ctagtacaaa 3720 agaggtgtgc aatgctctaa aacaatattt ctcttactac agtcgaccta aaagaataat 3780 tagcgatcga ggtacatgct ttacttcagc cgcattttcc aatttccttt cttctcgcgg 3840 aattagccat gtgttgaatg ccacaggatc accacaagct aatggtcaag tggaaagggt 3900 caatcgcatt attcgtccca tattgagcaa attatcaaat cagaccgacc atgtagattg 3960 ggtgacccat ttgctatcta cagaatacgc tcttaataat accattcatt cgtccactcg 4020 tttttcccct tctatgctat tgtttggtgt taaccaacga ggcccttcag tagatatatt 4080 gaccgaatat ttagaagaca aaaacaaagc attttcagat ttagaaacta tacgtgcaga 4140 agcatcattc aatattttaa aatcacaaca aaataatgag aaacaacatg ctaaacacca 4200 tcgccctgct cccgtattca atgaaggaga gtttgttgta attaaaaatg tagataacac 4260 accaaattca aacaaaaaac tcattgccag atataaaggt ccatacgtta tccacaaacg 4320 tttacctaat gacagatatg ttatacgtga tatcgatgga atacaaatga cgcaaatacc 4380 atatgatgga gtgttggagt cggacaagtt gagaagatgg atagtgcctc tggggggagc 4440 ttgatcggaa tatgttcata aagctacact taagattaag ggtagggcta tacatatgca 4500 ggaaaaattg aggtcaattt ctcgtcagga tggccgagc 4539 // ID RETRO16_AG_LTR repbase; DNA; ANG; 205 BP. XX AC . XX DT 06-FEB-2003 (Rel. 8.01, Created) DT 06-FEB-2003 (Rel. 8.01, Last updated, Version 1) XX DE Anopheles gambiae long terminal repeat from RETRO16_AG DE retrotransposon - a consensus. XX KW Copia; LTR Retrotransposon; Transposable Element; AACOPIA1; KW Long terminal repeat; RETRO16_AG_I; RETRO16_AG_LTR; KW retrotransposon. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-205 RA Jurka J. and Drazkiewicz A.; RT "RETRO16_AG_LTR: LTR retrotransposon from Anopheles gambiae."; RL Repbase Reports 2(12), 2-2 (2002). XX DR [1] (Consensus) XX CC Related to AACOPIA1 from Aedes aegypti. 5 bp target site CC duplication. 
XX SQ Sequence 205 BP; 51 A; 29 C; 48 G; 77 T; 0 other; tgttaagaat aagcaacgtt ttgttgtttg ggtttgacag tactggcgta cggtagtata 60 taagcgaacg aaaaattgtg aacttgtaca gttgcgtctc aggaataaaa gaattaacat 120 cgaagcttgg tcgttattct ttttgttcaa agttcgtcgc gttcatttct cggttgatgg 180 tttgcttttc gtcgttgttt taaca 205 // ID BEL-21_AG-I repbase; DNA; ANG; 3339 BP. XX AC . XX DT 01-SEP-2010 (Rel. 15.09, Created) DT 01-SEP-2010 (Rel. 15.09, Last updated, Version 1) XX DE BEL-21_AG-I. XX KW BEL; LTR Retrotransposon; Transposable Element; BEL-21_AG-I. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-3339 RA Fernandez Medina R.D., Struchiner C.J. and Ribeiro J.M.C.; RT "Transposable elements from Anopheles gambiae."; RL Repbase Reports 10(9), 1433-1433 (2010). XX DR [1] (Consensus) XX CC The three sequences in this family have a p-distance of 0,0033 CC sd= 0,00077, presenting LTRs of 227 nts that are identical within CC each element. The consensus sequence presents an ORF of 927 aa CC that has no conserved domains for known proteins. The LTR finder CC program detected a PBS for Asparagine and a PPT signal. 
XX FH Key Location/Qualifiers FT CDS 104..2878 FT /product="BEL-21_AG-I_1p" FT /translation="WNGKINQRKMRRTLRVEQNLPASPETPQESGNVSGGP FT RYDPSSPGTSSTVTTEGSTPRSDATTTNVAVLCKEGQSKGISSPFHALNHG FT ANQGRDDIRESRRRRLQEIEHEMQMLRDAEDENGSAAIAEHPNHGGTGGTH FT YPEFAGWVKSFEESLRRIYPAGKALTQGQIPGGNDLGQRAAAEQAQLPQPS FT PAIASTYATLPSQRMHPHATQTLPHATHSHSAHTIQTHPTYTSHTHSAHTI FT LPQHEYVQPHFNTSTLSQSQIAARQPVPRDLPIFSGEPEAWSFFIATYNRT FT NTACGYTDDENIGRLQYALRGAAFEAVGHLLSFPDGLNEVMATLKARFGRP FT DLIVESMTEKIRKMAPPKIERLSTVVEFGYAVKRLVGTMTASGLRGYMYDV FT ALLKELVRKLPPVLCIDWARTRSKLSEVTLLEFGKWVGDLAEDLCGVVDVM FT SIMDNSDSTHQQPEPPTHHSRTQPQRFQPHRAPPAQLDRRPPIRTGRVHLA FT YCNATVLQDENLGDPSSSQNTPELLTVCPLCGRDCPTLVQCEQFQRTTVAA FT RRSFVGERRICRKCLGYHRGGCSVRAPCAVNGCNRQHHELLHVNDQTPASP FT HQRDHLNGSGKVVYTGVANVLTHSGPKDSSLLKYVPVTLHGPRGRIDTFAF FT LDDGSTSTFMEHGLAQELGVTGTPYPLCLQWTGDVTREEQDSVRLSVRISG FT RNMSHAIYKLSEVHTVKELALPEQSVNVAQLTARYTHLRDLQFESYASAVP FT RILIGIDNCNITRTLKTVDASCNEPVASKTRLGWVVYGPCSVASAKPSPSN FT RSFSGRPFHDLSCNSASDKAVRPLRPRDDERALAILERETNGQRYEMGPLW FT RYDSVNLPYNKKAFKRDGFLKQKMAKYSKLANDVNRTNVAEFLPTLHHRTE FT RSEPVRPTEIGDIVVVACNNS" XX SQ Sequence 3339 BP; 924 A; 861 C; 891 G; 663 T; 0 other; tttcaaaaaa ttagcattcg cgcgaaaaag ccaataaagt tgatcgaaaa ctcgtttcgc 60 agagcagaaa gcacagtaag tcgcgcgcgc aagtgatata tagtggaacg ggaaaataaa 120 ccaacgtaaa atgcgtcgta cgcttcgggt ggagcaaaac ctaccagcga gtccggaaac 180 gccgcaggaa agcggtaacg tctctggcgg gcccaggtac gatccatcct ccccgggaac 240 atcgagcact gtaacgacgg aaggaagcac cccgaggagc gatgccacca caacgaacgt 300 tgcggtactt tgcaaagagg ggcagagtaa ggggataagt tcaccgttcc acgcattaaa 360 ccacggtgcg aaccagggcc gcgatgacat tcgcgaatcg cgaagacgac gattgcagga 420 gattgagcac gagatgcaga tgctacggga tgcagaagac gaaaatggct cggcggccat 480 cgctgagcat cccaaccacg gcggcactgg ggggacacat tatccggagt ttgctgggtg 540 ggtgaagagt ttcgaggaat ccttgcgtcg aatatacccg gcgggaaaag cccttacgca 600 aggccaaatt ccagggggca acgacctcgg gcaacgcgcc gctgctgagc aagcccaact 660 ccctcaaccg agccccgcga ttgcgtcgac ctacgctaca ctaccgtcac aaaggatgca 720 
tcctcacgca acacaaactc ttcctcacgc gacacattct cactcagcgc acactataca 780 gacacacccg acgtacactt cgcatacaca ttcggctcac acgatactac cacaacacga 840 atatgtgcaa ccacatttca acaccagcac actgagtcag agccagatag ctgctcgtca 900 accggtgccc cgggaccttc cgatcttttc cggggaacca gaagcgtggt cgtttttcat 960 agccacgtat aaccgcacta acactgcatg tggttacacc gatgacgaaa acattggccg 1020 tctacaatat gcgttgagag gagcggcgtt cgaagcggta ggccaccttt tgtccttccc 1080 ggacgggttg aatgaggtga tggcgactct aaaggcgcgt ttcggcagac cagatctgat 1140 tgtggagtcc atgaccgaga agatcagaaa aatggcacca ccgaagattg aaaggctgtc 1200 aacggtcgta gagtttggat atgcggtgaa gcggttggta ggaacgatga cggcctccgg 1260 gttacgaggg tacatgtacg acgtggcgtt actcaaggag ttagtcagga aactaccccc 1320 cgtcctatgc atcgactggg cgagaacgag gagtaaactt tcggaagtga cgcttttgga 1380 gtttggaaaa tgggttggtg acttagcaga agatctctgt ggtgttgtcg acgtgatgtc 1440 catcatggac aatagcgact caactcatca acaaccagaa cccccaactc atcacagccg 1500 cacccagcct cagcggtttc aaccccatcg tgccccgcca gcgcaactcg atcgacggcc 1560 accgatccgt actgggcggg ttcatctagc ctactgcaac gcaacggtat tgcaggatga 1620 gaatcttggt gatccatctt catcgcaaaa tacaccagaa ttgctgactg tctgtccctt 1680 gtgcggcagg gattgtccca ctctggtcca gtgtgagcag ttccaaagga caacggttgc 1740 cgccagaaga tcgtttgtag gtgaacgcag gatatgccgg aaatgcctcg gataccatag 1800 aggtggatgc agcgtaagag caccgtgcgc ggtgaacgga tgcaacagac aacaccatga 1860 gctgctccac gtaaatgacc agacaccagc cagccctcac cagcgcgatc atctaaatgg 1920 ctcaggtaag gtcgtgtaca ctggtgtcgc taacgttctt acccattcag gaccgaaaga 1980 ctcttcgctt ttaaaatacg tacccgttac gctgcacggg cccagggggc gaatagatac 2040 gttcgccttt ttagatgacg gctctacttc cacgtttatg gagcacggac tagcacagga 2100 gcttggagtt acagggacac cgtacccgtt gtgtctacaa tggacgggag acgtgacaag 2160 agaagaacaa gactccgtaa ggctatcggt gcgcatatcg ggtagaaaca tgtcgcatgc 2220 gatatacaaa ctatcagaag tacacacggt caaagagctt gcccttccgg aacaatctgt 2280 aaacgtagca caactgactg cacgctatac tcatctgaga gatctccagt tcgaatccta 2340 cgcttccgcc gtgcctcgca tcctcatagg catagacaac tgcaacataa cgcgaactct 2400 gaagaccgtg 
gatgcaagct gcaacgaacc ggtcgcttcg aaaacccgcc ttggatgggt 2460 tgtctatggg ccttgctcgg ttgcgagtgc aaagccgagt ccttcaaacc ggtccttctc 2520 gggccgcccg tttcacgacc tctcttgtaa ctccgcgtcc gacaaagcag tgcggccgtt 2580 gcggcctagg gacgacgaaa gagcgttggc gatattggag cgggaaacga acggacagcg 2640 atacgaaatg ggacctttgt ggcggtacga cagcgtcaat ttgccctaca acaaaaaggc 2700 gttcaagcgt gatgggtttc taaaacagaa gatggcaaaa tattcgaagt tagcaaacga 2760 tgtgaataga acaaatgtag cagagttttt acctacactt caccacagaa ccgagcggtc 2820 cgaaccggtg cggccgacag aaataggcga tatagttgta gtggcgtgta acaattccta 2880 ggaattgttg gcctaaggga ggaatcgtgg tagtaaaccc tgataggaat ggacaagtta 2940 gtcaagccac cgtcaaaacg gcccattaaa cgtacgatag gtctacgcac ggatgtgcaa 3000 tttccttaga gtaattgggg taaaaccgcg tggttacgac gcaacagatc gaacagtgag 3060 cagacagtaa acaaaccaat gcacatacac atacacgcaa aacacctgac agataatagg 3120 aagtgtagcc acggttgata gtgtcgcagg ctaggatggc gaaaaaagag gttgcaccat 3180 ggttaagcgg gattgacgtt tttgtgtaac gggctataga gcttaagcgg gttacacgat 3240 cagcgtttac gggctataag ctcaagcggg ttacacgttc agcgtttacg ggctactaac 3300 aaattcaacc gcctcaaccg tggtacactg gggggacaa 3339 // ID MARINERN10_AG repbase; DNA; ANG; 270 BP. XX AC . XX DT 03-APR-2003 (Rel. 8.03, Created) DT 03-APR-2003 (Rel. 8.03, Last updated, Version 1) XX DE MARINERN10_AG is a nonautonomous DNA transposon - a consensus DE sequence. XX KW Mariner/Tc1; DNA transposon; Transposable Element; Nonautonomous; KW MARINERN10_AG; nonautonomous DNA transposon; KW mariner/Tc1 superfamily. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-270 RA Kapitonov V.V. and Jurka J.; RT "MARINERN10_AG: a family of nonautonomous mariner/Tc1-like DNA RT transposons from African malaria mosquito."; RL Repbase Reports 3(3), 61-61 (2003). 
XX DR [1] (Consensus) XX CC There are ~100 copies of MARINERN10_AG in the genome, CC they are ~97% identical to the consensus sequence. CC MARINERN10_AG copies are flanked by 2-bp target site CC duplications. CC This family is characterized by a target site specificity: CC The consensus sequence has 13-bp terminal inverted repeats CC (4 mismatches) and a 3' terminal palindrome (pos. 194-241). CC Putative classification: a nonautonomous Mariner/Tc1-like DNA CC transposon. The genome harbors several subfamilies of CC MARINERN10_AG. CC One subfamily is composed of MARINERN10_AG elements harboring CC MARINERN9_AG. XX SQ Sequence 270 BP; 77 A; 59 C; 50 G; 84 T; 0 other; cccgctgcgc aaagcgatcg aaaacttctc aaactcaaac tgcaaacata tttcacagcc 60 catggctgtg aaatatttcg cattcaaaat tgttgcaaat ttataaatga aatatttcga 120 aattttcagc gctgaaattt ttggattgaa atatttcttc gatcggaatc caagcgtttt 180 ccctgacatt tctgctcgct ttttcgatcg cttttggcgg aaagcgatcg aaaatccgag 240 ctacaaaact gctcgatttt gtcgtgcggg 270 // ID GYPSY48-LTR_AG repbase; DNA; ANG; 374 BP. XX AC . XX DT 16-APR-2004 (Rel. 9.03, Created) DT 16-APR-2004 (Rel. 9.03, Last updated, Version 1) XX DE GYPSY48-LTR_AG is an LTR of retrotransposon GYPSY48_AG - a DE consensus sequence. XX KW Gypsy; LTR Retrotransposon; Transposable Element; KW 4-bp TSD GYPSY48_AG; CsRn1 lineage; GYPSY48-I_AG; GYPSY48-LTR_AG; KW Gypsy clade. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-374 RA Tubio M.J., Costas C.J. and Naveira F.H.; RT "GYPSY48_AG, a member of the CsRn1 lineage of the Ty3/gypsy group RT of LTR retrotransposons in Anopheles gambiae."; RL Repbase Reports 4(3), 89-89 (2004). XX DR [1] (Consensus) XX CC GYPSY48-LTR is a long terminal repeat of GYPSY48_AG (its internal CC portion is deposited as GYPSY48-I_AG). 
XX SQ Sequence 374 BP; 115 A; 61 C; 110 G; 88 T; 0 other; tgtggtgttc gcctgttcaa attaagtgaa ttgggaagga cagctttttc tcatcacttg 60 aggtctgcga tattaagggt taattagggt ctcaagtaga tagttagaga gcttaggaga 120 taacgggaga gtcgggcagg acacgcataa tgagcgcgag aaatgcgtag cctcgcgaaa 180 gtgcgtaggt gcgtgagaaa ggacaggtgc gcgagaagga ataggcgcgc gagcgtaagg 240 cgagagagag aaggtataaa aaggagcatc cgatcggata agctctcttt cttaatactc 300 cggtatatcg cgacaatact gttctcctgc gcacgattaa gataataaag tgaatataat 360 tgtagaaact taca 374 // ID GYPSY45-I_AG repbase; DNA; ANG; 5607 BP. XX AC . XX DT 16-APR-2004 (Rel. 9.03, Created) DT 16-APR-2004 (Rel. 9.03, Last updated, Version 1) XX DE GYPSY45-I_AG is an internal portion of retrotransposon GYPSY45_AG DE - a consensus sequence. XX KW Gypsy; LTR Retrotransposon; Transposable Element; 4-bp TSD gag; KW AP protease; GYPSY lineage; GYPSY45-I_AG; GYPSY45-LTR_AG; KW Gypsy clade; RNase-H; reverse transcriptase; KW integrase GYPSY45_AG. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-5607 RA Tubio M.J., Costas C.J. and Naveira F.H.; RT "GYPSY45_AG, a member of the Gypsy lineage of the Ty3/gypsy group RT of LTR retrotransposons in Anopheles gambiae."; RL Repbase Reports 4(3), 82-82 (2004). XX DR [1] (Consensus) XX CC GYPSY45_AG is a family of Gypsy-like LTR retrotransposons CC that, according to the amino acid sequence of its Reverse CC Transcriptase, CC RNase and Integrase is CC phylogenetically grouped with representatives of the GYPSY CC lineage of other organisms. CC GYPSY39_AG, GYPSY40_AG, GYPSY41_AG, GYPSY42_AG, GYPSY43_AG, CC GYPSY44_AG, CC GYPSY46_AG and GYPSY47_AG are other CC members of this same lineage in Anopheles gambiae. CC The GYPSY45-I_AG consensus was reconstructed after multiple CC alignment of 4 copies. CC The consensus encodes the 351-aa GYPSY45_AG1p gag-like CC polyprotein (pos. 
278-1330) and the 1054-aa GYPSY45_AG2p CC pol-like polyprotein (pos. 1273-4434). CC The sequence of the LTRs flanking GYPSY45-I_AG is deposited as CC GYPSY45-LTR_AG. XX FH Key Location/Qualifiers FT CDS 1273..4434 FT /product="GYPSY45_AG2p" FT /translation="SQFGRLFISEPFFRVGSQMVKPILPFVKLKTNIGEFP FT FLIDTGANVSLIDPKLANSCKIGKPYDLCDRNIASASGKFITSSAIDINFF FT SPKIDQICTFLLHSFHSFFFGIIGTDILKCLHAVIDLETDNLILKKNSQVL FT HIPLFSYSSSKTTLNPIFRLEHLNCNEQAKLKKILDANNSVFHEPKLQLTC FT STRVECAINTNDDIPIHQKVYPYPAAYTQEVNNQIKTLLDNGIIRPSHSAW FT TAPVWIVPKKIDASGEKKFRMVVDYRKLNQKTIADRYPMPEINYVIDQLKG FT HQYFTTLDLASGFYQIRMRPCDIEKTAFSINNGKYEFLRMPFGLKNGPAIF FT QRVIDDVLRSEIGKTCYVYMDDVIVFGKSFEEHLNNLNNILQLLSNANLKV FT QIDKSEFLHDKIEFLGYIITSEGIKPNYKKIEAISRYPEPSNIKELRGFLG FT MMGYYRRFIKDFARIAKPLTNLLRGDENKASKKKIDLSIAEKACFQKLKSI FT ISSEDILIYPDYSKPFLLTTDASDIAIGAVLSQGQIGKDRPIHFASRSLSK FT TEEKYSVPEKEMLAIFWALKIFRNYLYGTKFTVLTDHQPLTFALSAKNTNA FT KLKRWKSYLEEHDFEIAYKPGNTNVVADALSRNVCSLTGTQHSAETSDDFF FT IISTETPLNAFKHQVIIKNGCDNVQRRQIFSSHNRITISLNDTNEISLLQV FT LKKYFDPSKINGLFTSEAIMGKIQEVYRKYFGNTNLLKIRYTQKLVRDIED FT EEEQWQIAKTEHVRAHRCAEENTQQILRNFYFPKIRAKSKQIAKECQTCHE FT NKYNRKPIDYPIQKTPIPSNPFEIAHIDILFLENNYFLTYIDKFSKFAQIK FT LIESRATIDILPAVKELTTKFFPPKTLVMDGEKSFCSRELELFYKAYNIET FT YVTATGRSEMNGIIERFHSTLLEIYRITKKENPTFCTPDLVELSVQKYNSS FT IHSVTKHTPIEVILPSVRSSEIINNVHKNLIQKQSQDLIFHNKNKKSVPID FT INNDTYEKTRARLKHKPRYKKKNIMNINNSTVLLEGNRKVHKNDLKIR" FT CDS 278..1330 FT /product="GYPSY45_AG1p" FT /translation="MTTETNVPNIYELGKVPDFVKDLREFDGRHTELMNWI FT ADVEEILTLCVSCKISQLLGTLVQRTIRRKIVGEAAEVLNTNNISTDWKEI FT KEVLLLHYGDKRDLFTLDFELSTMRKSNGESLTAYYGRVKETLALIIAHVQ FT TNEKYREHAPIHTQFFNEKALHSFIRGLDKPLSFILKTSAPANLSKAYQLC FT QEYYSTDLRFARMFEPRGQADFNQKPVPKPRGSFNPMPYPRASTSFVPSTA FT SHSVPNTTHTIKSNRYQTPTPMEVDGSIRYDNGSHSRRFTQQTYPRVPSTQ FT VKDFKRQANTVHDDTDRKINNTGLESDYQHQQSEYEAGHDHNLDGSLYQNH FT FLEWDPKW" XX SQ Sequence 5607 BP; 2096 A; 935 C; 954 G; 1622 T; 0 other; ggcgcccgaa taggttgtat gtgtgtgagg gaagagtata agtgcatata aagtataaaa 60 aaatcctaca cgtgaagcta
aacacttcca tcgagtggtt gaaggagcca ctaattattt 120 gcaccaagcg agaactggga cactgtggat ttcgcccgat tgattgctga ggcccctagt 180 acccgaacgg aaaaaaggat catcctctgc aagctgccca gctttccccc gaagacatca 240 tactgtgaca attttttttg gtaagtagga agtaaaaatg actacggaga caaatgtgcc 300 gaacatttat gagcttggaa aagttcccga ttttgtaaaa gatcttcgag aattcgatgg 360 tcgacataca gagctaatga actggatagc agatgtggaa gaaattttaa ctctgtgtgt 420 gagttgcaaa atatcgcagc tgctaggtac tttagttcaa agaacaataa gaagaaaaat 480 tgttggtgag gcagctgaag tattaaatac gaataatatt tccacagatt ggaaagaaat 540 taaagaagta cttctcttac attacggtga taaaagagat ttgttcacgt tagatttcga 600 attatctacc atgcggaaaa gtaacggtga aagtttaacc gcatactatg ggagagtaaa 660 agagacgtta gcacttataa ttgcacatgt acaaactaat gaaaagtatc gagaacatgc 720 accaatacat acgcaatttt tcaacgaaaa agcactgcat tcgtttatac ggggattaga 780 taagccattg tctttcatcc taaaaacatc tgctccagca aatttaagca aagcttatca 840 gttgtgtcag gaatattaca gcacagattt acgtttcgct agaatgtttg aaccacgcgg 900 ccaagccgat ttcaaccaaa aaccggttcc gaagcctcgt ggtagtttca atccaatgcc 960 ctatccaaga gctagcacaa gcttcgtgcc atccacagca tcccacagcg taccgaatac 1020 aacacacacg ataaaaagta atcgttatca gacaccaaca ccgatggaag tggatggtag 1080 tatcagatac gataacggtt cacactctag acgcttcaca caacaaacat atcccagagt 1140 accatctaca caagtgaagg attttaagcg tcaggcgaat actgtacatg acgacacgga 1200 cagaaaaata aataatacag gcttagaaag tgattaccaa caccaacaaa gtgaatacga 1260 agcgggtcat gatcacaatt tggacggctc tttatatcag aaccattttt tagagtggga 1320 tcccaaatgg taaaacctat tttaccattt gttaaattaa aaactaatat tggagaattt 1380 cctttcctaa ttgatacagg agccaatgta agccttatag atccaaaatt ggcgaattct 1440 tgtaaaatag gaaaaccata tgatttatgt gacagaaata ttgcaagtgc tagtgggaaa 1500 tttattacat catcagcgat agatattaac ttttttagtc cgaaaattga tcaaatttgc 1560 acatttcttt tgcattcttt tcattctttc ttttttggaa taattggtac agatattctt 1620 aaatgtttac atgctgtaat agatttagaa acagacaatt taatactaaa aaaaaattct 1680 caggtcttac atattccact tttttcttac tcatcttcta aaactacatt aaatccaata 1740 tttagattag aacatttaaa ttgtaacgaa caagcaaagc 
ttaaaaagat actggatgca 1800 aacaattcag tattccacga gccgaaacta caactaacat gctcaacaag agtagaatgt 1860 gctatcaata ctaacgatga tataccaata catcaaaaag tatatccata cccagcggcc 1920 tatacacaag aggtcaataa tcaaatcaaa acccttttag ataacggtat tataagaccg 1980 tctcactccg cgtggacagc tcctgtgtgg atagtcccga aaaaaataga tgcatcggga 2040 gaaaaaaaat ttcgtatggt tgttgactac cgaaagctta accagaaaac aattgccgat 2100 cgatatccga tgccagaaat aaattacgta attgatcagc ttaaggggca tcaatacttc 2160 accacattgg acctggcatc aggattttac caaattcgca tgagaccatg tgatattgaa 2220 aaaactgcat tttctattaa taatggtaag tacgagttct tacgaatgcc ttttgggctg 2280 aaaaatggcc cagcaatttt tcagagagtg atcgatgatg tacttcgtag tgaaattggc 2340 aaaacatgct atgtgtatat ggacgatgta attgtatttg gtaaaagttt tgaagaacat 2400 ttaaacaatt taaacaatat tttacaatta ttaagcaacg ctaatttaaa agtacagatc 2460 gataaatcag aatttctaca cgataaaatc gaatttttag gttatattat aacttcggaa 2520 ggcattaaac cgaattataa aaaaatagaa gcaataagca gatacccaga accttccaac 2580 ataaaagaat tgcgtgggtt tttaggaatg atgggatatt ataggagatt tatcaaagat 2640 tttgctagaa tagctaaacc gctaacaaat ttattgagag gtgacgaaaa taaggcttca 2700 aaaaagaaaa tagatttatc tattgctgaa aaagcatgtt ttcaaaaact caaaagcatt 2760 atatcgtcag aagatattct tatctatccg gattacagta aacctttctt acttacaact 2820 gacgcttctg acatagcaat aggtgccgta ttatcacaag gtcagattgg caaagataga 2880 ccaattcatt ttgcctccag atcattgtca aagacagagg aaaaatattc tgttcccgaa 2940 aaggaaatgt tggcaatatt ttgggctctc aaaatattcc gtaactatct gtatggtact 3000 aagtttacag ttttgacaga tcaccaacca ttaacatttg ccttatccgc aaaaaatact 3060 aatgccaaat tgaaacgttg gaaatcgtac ctagaggagc atgatttcga aatcgcttac 3120 aagccaggaa acactaatgt tgtggcggat gctttgagca gaaatgtatg ttctttgact 3180 ggtactcaac actcggctga aacatcggat gattttttta taatcagtac agaaacaccg 3240 ctaaatgctt tcaaacatca ggtgataatt aaaaacggtt gtgataatgt gcaacgaaga 3300 cagatatttt catcgcataa ccgaataaca atatcattga acgacactaa tgaaatttct 3360 ttattacagg ttcttaagaa gtacttcgac ccatcaaaaa taaatgggtt atttacatcg 3420 gaagcgataa tgggaaaaat tcaagaagta tataggaagt attttggtaa 
tacaaattta 3480 ttgaaaatta gatacactca aaaattagtt cgagacatag aggacgaaga agaacaatgg 3540 caaattgcta aaacagagca tgttcgagct catagatgtg cagaagaaaa tacacagcaa 3600 attcttcgaa atttttattt tcccaaaata agagcaaaat caaaacaaat agcaaaagaa 3660 tgccaaactt gccatgaaaa taagtacaat cgaaaaccaa ttgactatcc tattcaaaaa 3720 actccaatac catcaaaccc gtttgaaata gctcatattg atatactgtt tcttgaaaat 3780 aattattttt taacttacat agataaattc tccaaatttg cccaaattaa attaattgaa 3840 tcaagagcaa ctatcgacat tttaccagca gttaaggaat taacaacaaa attctttccg 3900 cccaaaacct tggtgatgga cggagaaaag tctttttgct ctagagaatt ggaattattt 3960 tataaagcat acaacataga aacatatgta acggctacgg gaagaagtga aatgaatggt 4020 atcatcgaaa gatttcattc aacgttatta gaaatctata gaataacaaa aaaagaaaat 4080 ccaacatttt gtacaccaga tcttgtcgaa ctttccgtac aaaaatataa cagttcaatc 4140 cattcagtaa ccaaacatac tcctattgag gttatcttgc cttcagtaag atcgtctgag 4200 ataattaata atgtacacaa aaatctgatc cagaaacaat ctcaagacct aatttttcac 4260 aacaaaaata aaaaatcagt cccaatagac ataaataatg acacttacga aaaaactaga 4320 gcaagactta aacataaacc aaggtacaaa aagaaaaata ttatgaacat aaacaatagt 4380 acagttctct tagaaggtaa tagaaaagtt cataaaaatg accttaaaat tagataaaaa 4440 aattaaaaga actaaaccaa aaaaaatgat ttatcaacgt tatttgtaaa tttaaattga 4500 attcttaaca taattgattg attaatcaac ttaatttaaa ttgatactca tacttaattg 4560 ctactacttt tttctctctt ttcattattc tatttcttat atgttaaatt tatgtatgtg 4620 ttgtctatgt gttttgtcca aaaaaaaaaa aaattttttg tttggcacta tttaataatg 4680 ttagtatgta tatatatctt tttatatcta catatatatt gtggttagtt aaagattagt 4740 tttgattctt ttgataggaa gcaaaggatt cataacgaga gccacaagaa cacctgaact 4800 gttcctcttt tcattctttt ttaagataaa atagataata gaatagagta ggtaaaattt 4860 aaaataattt aaatatgcca agttttactt aagacaacaa aaatgcattt gttaaacaaa 4920 gaatagatac aaacatgcta gattatttcc acttatgcta aaaaaaagag gaatgataat 4980 aataacgtag ataatcaacg gcaaaatatg gcttgtgcaa ttcttctaca tcttaaacaa 5040 atggatttgt agacagtcta catgtgtaca tgcgtgtgtg tatttatata ttaatgctta 5100 agtatattta tatatctata cacatcggaa tatgtggaca tgtgggttct acacatgtat 
5160 atttgtgtaa atagtggtat tgcatactga gaatggttat actagaaagt gaggaaaaaa 5220 tgggctgtac aattttaagt gatgaggagc tacaatctgt tcagatttcg agattcgttg 5280 tgttctccaa tttgtcgcaa aggaattcaa agatccgtta cgatcttttg atgcgaaaga 5340 ggccaacgac aaagatctgt caggaaaaag gaactaagtc ttataatcac ttgcgatcgg 5400 taaaggtcct tgtgaaaagt ccattatgga ttatacgaac atgaacaaca tgaacaacat 5460 accatccagt gacagaaatg tcaggtgttt cgaatgaaac aaaggataca aaaatatcca 5520 agcatgaatg ttgttctaat actatgaaat gttccgcaag cagtatcact aaaagacgag 5580 gacgtctttt cttgttggcc cgggagt 5607 // ID HATN1_AG repbase; DNA; ANG; 1724 BP. XX AC . XX DT 13-DEC-2002 (Rel. 7.11, Created) DT 13-DEC-2002 (Rel. 7.11, Last updated, Version 1) XX DE HATN1_AG is a hAT-like nonautonomous DNA transposon - a consensus DE sequence. XX KW hAT; DNA transposon; Transposable Element; Nonautonomous; KW HATN1_AG; nonautonomous DNA transposon; hAT superfamily. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-1724 RA Kapitonov V.V. and Jurka J.; RT "HATN1_AG: a family of nonautonomous hAT-like DNA transposons RT from African malaria mosquito."; RL Repbase Reports 2(11), 14-13 (2002). XX DR [1] (Consensus) XX CC HATN1_AG is a family of nonautonomous DNA transposons that CC belongs CC to the hAT superfamily. CC The HATN1_AG consensus sequence was reconstructed based on CC multiple alignment of ~100 copies of this transposon identified CC in the sequenced portion of the genome. Given the ~2% divergence CC of these copies from the consensus sequence, transposition of CC HATN1_AG CC occurred recently (in the last 1 Myr). CC HATN1_AG elements are flanked by 8-bp target site duplications. CC HATN1_AG has 15-bp terminal inverted repeats. 
XX SQ Sequence 1724 BP; 575 A; 350 C; 289 G; 508 T; 2 other; caggtatgtc aaactggcgg cccgcgggcc gcatgcggcc ctcgtcaatg ctgaatgcgg 60 cccgcgagaa tattttgaga attgctaaaa tcactccaaa ccaaaaaact gttattacta 120 tttgtcgaag tgtatttaat tttccagaga agaacaatca tttagatggt atatttttcg 180 aatctaacat gaatgtaccc ataacatctc ataaacttat tctacacaat cgtacagaac 240 aatcgagacc cttaagaaac attgaattga acgcaaactc agtcggtatt atgtttcgat 300 taaattctga atattagcat agtattgtgc acactcgatt gcacattcta tatggaaaaa 360 tagatttgcg actaccttyt accaacgaaa aagtaatgaa aataacttga tacacacact 420 tggaaacttc gcaacaacta atgatggatt gtcagaatcg catccacggc tcaaaaccat 480 tttcgatctt agctatagca acttcgaatc cggccaaatc aaaccaacag ttccgttcgg 540 tgccgtccga agcctataga gccgtttaaa gccattcgga accgttggag tcatcggttg 600 tygcacggag cggctaatct acagtgagaa atggatccgg tcggtcccac cgaatttgac 660 catctcaatg cccccgagtg ccgagtaaag gattcattcg gctccgaatt tctccgattt 720 gcacggctcc aaccttataa tgttacatca ttgatgttgg attggagcca cttctgtttt 780 tttgtcataa tcatgccacc atcaatttta ataaagatga ttgtagccaa gttggaattt 840 tgtctacccc acattttgca agacgaatta atgtactttt gaacctcccc aaaaatctaa 900 accaaaatga tatgattacg cttaagacta agagatttag attaaggatt aaaataatcg 960 aaatgccaaa gccctgcaat acatagctta ttaatagatt agctcacaca gtccatacct 1020 caattcgcga atcatctccc ttcaagtgca aaacaagatt tgccaatctt tatgggatca 1080 cacaccacga atcttcgtca gataaaatta tcgtaattta acaaaattat tcgtaaaata 1140 tgggttcttg taataagaaa tgccaccata ccaccatata ccaccatata ccccttcgaa 1200 cttagaaatt agcgaaaaat aatgtttaaa cacgtttttt ttacaatttt cctcttgatc 1260 tatcaaaaaa ttgcatgcaa atatttttgt tattcatcca gcggtggata gctcggatgg 1320 cttgaaaaag ttccgcaaca ggctggtaga cttcccctac ctcagtaaaa aagtgtggtt 1380 gagtgatatt ttctaacaga agtgccgtgt tagaccgcga ctaatgagaa atttcctatt 1440 ttaataataa ggatgaaaac aacgcatgtg actttaaaag tgcattaaac attttttttt 1500 aaataaaaaa gtgcacaata aaaaaaccat agcaaatttg taaataagaa actattccag 1560 atttgaagca aggagtacaa aaagtcaatt tgtttaacaa taatgatttt aaacagtatt 1620 ttatatgatt attgaggtat attttttatt 
ctcaactttg cggcccacga atcactgaaa 1680 atctgtactt gcggccctct tatgaattga gtttgacaca gctg 1724 // ID Clu-15_AG repbase; DNA; ANG; 460 BP. XX AC . XX DT 03-SEP-2010 (Rel. 15.09, Created) DT 03-SEP-2010 (Rel. 15.09, Last updated, Version -1) XX DE Putative non-autonomous DNA transposon: consensus. XX KW DNA transposon; Transposable Element; nonautonomous; Clu-15_AG. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-460 RA Fernandez Medina R.D., Struchiner C.J. and Ribeiro J.M.C.; RT "Transposable elements from Anopheles gambiae."; RL Repbase Reports 10(9), 1439-1439 (2010). XX DR [1] (Consensus) XX CC TA TSD. >97% identical to consensus. XX SQ Sequence 460 BP; 141 A; 88 C; 85 G; 145 T; 1 other; tacagggttt tcgaggatta tatagacatg ttcagcgagt tgtagattcg ttcaggcata 60 taatagactc tttcagcgag atatagaatt gttcggaagt aggcgtagac atgttcagcg 120 attctgtaga gctgtcattt gttttgttta cacactgtgc cgcagggttc aatttttcat 180 tcttaaaagt cctaaaagta gctaataatc attctccaag ctgtttaata cataaattag 240 ttgaaaaaag catatttttt cgaaaaccgt ttgtttacct tctgatgaga atgatccaac 300 ttgcagtcca aggggtacga aagtgacagc tctacagtat aaccgaacat gtctacacct 360 acwtccgaac aattctatat ctcgctgaaa gagtctatta tatgcctgaa cgaatctaca 420 actcgctgaa catgtctata taatcctcga aaaccctgta 460 // ID RETRO43_AG_LTR repbase; DNA; ANG; 242 BP. XX AC . XX DT 06-FEB-2003 (Rel. 8.01, Created) DT 06-FEB-2003 (Rel. 8.01, Last updated, Version 1) XX DE Anopheles gambiae long terminal repeat from RETRO43_AG DE retrotransposon - a consensus. XX KW Gypsy; LTR Retrotransposon; Transposable Element; KW Long terminal repeat; MDG1; RETRO43_AG_I; RETRO43_AG_LTR; KW STALKER2; retrotransposon. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-242 RA Jurka J. 
and Drazkiewicz A.; RT "RETRO43_AG_LTR: LTR retrotransposon from Anopheles gambiae."; RL Repbase Reports 2(12), 10-10 (2002). XX DR [1] (Consensus) XX CC Related to MDG1 and STALKER2 from Drosophila melanogaster. 4 bp CC target site duplication. XX SQ Sequence 242 BP; 69 A; 53 C; 56 G; 64 T; 0 other; tgttatggga tctgcatgtt aaggttagcg tagtgtaacg catgctttag tttatagcaa 60 acccgaagct agagagtgcc cgcgagaggc actcagttag gctttataag tagtggcaat 120 gtggtaacga agctctctct tttgctcgac accgaagcga tcggttcgac gcacacaccc 180 atcctgttat cttaagtggt gataataaaa agtacttgta acccctgtaa acccacaaaa 240 ca 242 // ID CR1-10_AG repbase; DNA; ANG; 3448 BP. XX AC NT_078268.2; XX DT 28-FEB-2009 (Rel. 14.02, Created) DT 01-MAR-2009 (Rel. 14.02, Last updated, Version 2) XX DE CR1-like non-LTR retrotransposon. XX KW CR1; Non-LTR Retrotransposon; Transposable Element; CR1-10_AG. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-3448 RA Jurka J.; RT "CR1-like non-LTR retrotransposons from African malaria RT mosquito."; RL Repbase Reports 9(2), 636-636 (2009). XX DR EMBL/GenBank/DDBJ; NT_078268.2; Positions 1489876 1493323. XX CC It is likely to be 5' truncated. 
XX FH Key Location/Qualifiers FT CDS 1..3195 FT /product="CR1-10_AG_1p" FT /translation="SPRIPRIFGLQCSPGFSSTTAAAIPSQAINRCDPRTR FT ITSVYPYRFSAANFPPVDEPAATTAQSNSTSTPITPLHHAAERFETLPSVA FT SPITVSAPLPLHASFPHPIDDLRFYYQNTRGLRTKLDDLRLASADTEYHVL FT ILTETWLSENIPSSLAFDNNFVIYRCDRNSRNSSRSRGGGTLIAVSRKLAA FT CEISPASDSLEQCWIRLKLPSCSLFIGSIYLPPTLSADASTLNTLDASIDE FT IISRMKKNDRLLLFGDFNIPSITWHASPDPAPPFLPLLVPASSAHSSSLFI FT DIVRYHSLSQLSSVSNFRQVQLDLIFTFPKNQSNATSTCTYVSQAPVPILQ FT IDPHHPPLELSLKITGSLTPSNRSQHSSSRLTTAPSLDFRKTDFNLLNTSL FT SNIDWSQLYSCKDVNLAVSTFTSLFLSTLPQCCPKRNPPRSPPWSDATLKS FT LKRDKSRYQRAYWSTRSVNALRTYKYAAAAYRLYNRTRYNGYILILQTRLR FT WYPRAFWKFMNARRKGSSLPSTMHLESVYASSPEDICSLFRNHFSNIFSPP FT SQIPNTASQCLSYTPQDVIDVSCFGFDEHSVSEALSKLKPSFSAGPDGIPP FT SLLKKTAEVIAPLLSFIFNLSLRLCCFPSSWKSGWLCPIFKKGNSADVSNY FT RGIVLLPACSKVFEIAVHSVISFQVKSWISVNQHGFMPSRSTVTNLMQFAS FT HVIPSLDSGLQVDTIYTDFKAAFDTVPHHLLLLKLKKLGFSNMLIDWLYSF FT LNSRSIRVKTDTTFSSSFCPSSGVPQGSVLSPLLFVLFVSDVHAILPSEGH FT LLYADDLKIYLTISTHSDCLRLQEFLNSFSLWCSYNQLSLCPEKCSVISFC FT HNRNPIHYEYSLSNSVLSRTSLIRDLGVLWDDKLSLNEHLESVVASASRTL FT GLICRLTRDFRDPLCLKSLYCSLVRSTLEYASVVVGPVRSGWILRFEKVQK FT RFTRFAFRRMAGPNAAMPSYESRCNLLGLESLEVRRSQARVLFIGRMLNGL FT IDAPTLLEKIGLYAPSRNLRARSALNMVSRQTTFGSNEPLLLCVRTFNTST FT DPPHPRL*" XX SQ Sequence 3448 BP; 812 A; 981 C; 599 G; 1056 T; 0 other; tcacctcgca ttccgcgaat ttttggacta caatgctcgc cgggattttc atcaaccacc 60 gctgccgcca tcccgtccca ggcaatcaac agatgcgacc caagaacacg aatcacttcc 120 gtctatccgt accgattcag cgccgccaac ttcccccctg ttgacgaacc agcagcgaca 180 actgcgcagt caaacagcac ctctacgcca ataacccctc tccatcatgc tgctgagcgc 240 ttcgaaactt taccttcggt cgcttcacct attacggtta gtgccccgct acccttacac 300 gcatcattcc cgcatcctat cgatgatcta cggttctact accaaaatac ccgtggctta 360 aggaccaagc tagacgatct cagactcgct tcagctgata cagaatacca tgtactcatt 420 ctcaccgaaa catggctctc tgagaacatt ccctcatctc tagcatttga taacaacttc 480 gttatatacc gatgcgatcg caattctcgc aacagctcac gctctcgcgg cgggggcact 540 ctcatcgctg tgtcccgaaa gctcgcagcc tgtgagattt ctccagcttc cgattcgctc 
600 gagcagtgct ggatacgttt aaaactccca tcctgctctc tcttcatcgg atctatctat 660 ctgccgccta ctctcagcgc cgatgccagc acgctcaaca cgttagatgc atcgatcgac 720 gaaatcatct cgcgcatgaa gaaaaatgat cgtttactgc tgttcggtga tttcaacatt 780 ccatctatta cctggcacgc gtcaccggac ccagctcccc ccttcttgcc gctcttagtt 840 cctgcctcat cagctcattc ttcaagttta ttcatcgata tcgtgcgtta ccacagcctt 900 agtcagctta gtagtgtcag taatttcaga caggttcagc ttgatctaat ttttactttc 960 cctaaaaatc agtctaatgc aacgtctaca tgtacatacg tctcgcaagc tccagttccc 1020 atcctacaga tagaccctca ccaccccccg ctggaactgt ctcttaaaat cacaggttca 1080 cttaccccca gtaatcgttc tcagcatagc tcttctcgtt taactaccgc cccctcgctc 1140 gattttagaa aaactgattt taacctatta aatacttctc tttctaatat agattggtca 1200 caactttact cttgtaaaga tgtaaatttg gccgtttcta ctttcacatc cctcttttta 1260 tccacactcc cgcaatgttg tcccaagcgc aatccccctc gttcccctcc ctggtctgat 1320 gctaccttaa agtccctgaa gcgtgacaaa tcccgctacc aacgtgcata ctggagtacc 1380 cgttctgtta atgctcttag aacttacaaa tatgcagctg ccgcatatcg actttataac 1440 cgcactcgct acaatggtta tattcttatc cttcagacgc gcctccgctg gtacccccgt 1500 gccttctgga aatttatgaa tgctcgacgc aaaggcagct ctctcccgag cactatgcat 1560 ttagagtccg tatacgcctc atcgccggag gatatttgct cgctttttcg caaccatttc 1620 tcgaatattt ttagccctcc ctcgcaaata cccaataccg catctcaatg tctttcgtat 1680 accccccaag atgtgattga tgtgtcctgc tttggcttcg atgaacactc tgtctcagaa 1740 gctctttcaa aactaaaacc ttctttctct gcaggccctg acggaattcc tccctcactt 1800 ttgaaaaaaa cggctgaggt gatagcccct cttttgtcat tcattttcaa tctatccctg 1860 cgcctctgct gcttcccttc ttcatggaaa tcgggctggc tttgtcccat cttcaagaaa 1920 gggaattccg ctgacgtgtc aaactaccgt ggcattgttc tgctaccggc ttgtagcaaa 1980 gttttcgaaa ttgcagttca ctccgtaatt tctttccagg taaaaagctg gatcagcgtg 2040 aaccaacatg ggttcatgcc aagccgctcg acggtcacga atttaatgca attcgcgtct 2100 cacgtcatac cctcgcttga ttctggtttg caagtagaca cgatctacac ggatttcaag 2160 gccgcctttg ataccgttcc ccaccacctt ctgctgttaa aactaaaaaa acttggtttt 2220 tctaatatgt taattgactg gctttattct tttttaaaca gcaggtccat tcgtgtaaaa 2280 accgacacta 
ccttttcttc gtccttttgt ccctcctctg gagtgccaca aggcagcgtc 2340 ttgagcccac tcttattcgt actttttgtc agtgatgttc atgcaattct cccgtccgaa 2400 ggtcatttac tgtatgctga tgatttaaaa atatatttaa ctatcagtac gcacagcgac 2460 tgccttcggc tccaggagtt tctgaactca ttctctttat ggtgctccta caaccagcta 2520 agtctttgcc ctgaaaaatg ctctgtaata tccttctgtc ataaccgcaa ccctatccat 2580 tacgaatatt cgctctccaa ttctgtgctg tctcgaacct cccttatacg tgatcttggc 2640 gtcctttggg atgataaatt atcactaaac gaacatttag agtcggttgt cgcaagtgct 2700 tctagaacgc tgggtcttat atgtagactt accagagatt ttcgtgatcc actttgcctc 2760 aagtcacttt attgttcttt agttagatca acattggaat acgcttcagt ggtcgttggt 2820 ccagtccgtt ctggatggat cttaagattt gaaaaggtcc aaaagcgttt cactaggttc 2880 gcttttcgac gaatggctgg ccccaatgcc gctatgccca gctacgaatc tcgctgcaat 2940 ttacttggtc tggaatccct ggaagtaaga cgctcccaag cgcgcgtcct tttcatcgga 3000 cgcatgctca atgggcttat tgatgctccc actctccttg agaagattgg cttgtacgca 3060 ccctccagaa atttaagggc acgtagcgca ctaaacatgg tttctcgaca aaccacattt 3120 ggttccaacg agcctcttct cctgtgtgtt cgcactttta atactagcac agatcctcct 3180 cacccccgtt tataatcctt tataatcctt ttccctgata ctcttcccta tcctcttttt 3240 tctcatcctc ttttttccga ctatcatatc taatattatt cactcttctt ttcttacgtt 3300 tgtattttat cctaatatct gtctttcatg tgctcaaaag ttcgctttag tattaggaat 3360 ttagataggt taagttagga attagttaga tttaggaatt gtgaaaccat caggctgaca 3420 atttgaataa acttgaaatg aaatgaaa 3448 // ID GYPSY27-LTR_AG repbase; DNA; ANG; 205 BP. XX AC . XX DT 05-FEB-2004 (Rel. 9.01, Created) DT 05-FEB-2004 (Rel. 9.01, Last updated, Version 1) XX DE GYPSY27-LTR_AG is an LTR of retrotransposon GYPSY27_AG - a DE consensus sequence. XX KW Gypsy; LTR Retrotransposon; Transposable Element; 5-bp TSD; KW GYPSY27-I_AG; GYPSY27-LTR_AG; GYPSY27_AG; Gypsy clade; KW mag lineage. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-205 RA Tubio M.J., Costas C.J. 
and Naveira F.H.; RT "GYPSY27_AG, a member of the mag lineage of the Ty3/gypsy group RT of LTR retrotransposons in Anopheles gambiae."; RL Repbase Reports 4(1), 20-20 (2004). XX DR [1] (Consensus) XX CC GYPSY27-LTR_AG is a long terminal repeat of GYPSY27_AG (its CC internal CC portion is deposited as GYPSY27-I_AG). XX SQ Sequence 205 BP; 54 A; 44 C; 50 G; 57 T; 0 other; tgttggagac gctaacccgt cggcataaac gcgttgcgtt gtttgacgtt tgctgcttgg 60 gtcggaaccg aagccaggaa agtgcgcgct ctttgtatgc gatctccacg gtgtacggac 120 gtgtgtgtga accagaataa tttgtaaccc tttagaacta tctgaattac cgaaatacaa 180 gaacttttac ttcgtaaata caaca 205 // ID BEL17-I_AG repbase; DNA; ANG; 5596 BP. XX AC . XX DT 03-APR-2003 (Rel. 8.03, Created) DT 18-JUL-2005 (Rel. 10.08, Last updated, Version 2) XX DE BEL17-I_AG is an internal portion of the BEL17_AG LTR DE retrotransposon - a consensus sequence. XX KW BEL; LTR Retrotransposon; Transposable Element; 5-bp TSD; KW BEL17-I_AG; BEL17-LTR_AG; BEL17_AG; Bel clade; PHD Zn-finger; KW integrase; peptidase; reverse transcriptase. XX NM BEL17-I_AG. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-5596 RA Kapitonov V.V., Pavlicek A., Drazkiewicz A. and Jurka J.; RT "BEL17_AG, a family of Bel/Pao-like LTR retrotransposons from RT African malaria mosquito."; RL Repbase Reports 3(3), 41-41 (2003). XX DR [1] (Consensus) XX CC BEL17_AG is a young family of Bel/Pao-like LTR retrotransposons. CC BEL17-I_AG, an internal portion of BEL17_AG, is flanked by CC BEL17-LTR_AG CC LTRs. The BEL17-I_AG consensus sequence was reconstructed based CC on CC multiple alignment of 11 copies; they are ~2% divergent from CC the consensus sequence. CC The consensus sequence encodes a 1842-aa BEL17_AGp Bel-like CC protein CC (pos. 34-5559). CC BEL17_AGp is composed of the PHD Zn-finger (pos. 9-56), peptidase CC A16 CC (pos.
223-353), reverse transcriptase (pos. 839-1006) and CC integrase (1540-1700) domains. XX FH Key Location/Qualifiers FT CDS 34..5559 FT /product="BEL17_AGp" FT /translation="MGPKRNDHCQACGDQRDDPDFVACDKCNLWWHFSCAG FT LTEPAEAVEQRKWLCVACQAKERSGLIQRTPVKHTEQMEAANANLNLEEVT FT QNVAEKHNLKLVEVIQQAAETPMPSSSGTSYNAVSKPNLTPAKVSDSVANR FT LAIMKRRQEAERKRMELELQLKFVKEEEDLLSEELGVMAISGASTVTPSRP FT DGESMRGSQQDERGPAPRLESFHRNVPTELPEFSGDPAEWPVFIAHYDYTT FT EKCGFSNWENMIRLQKALKGPALEVVRSRLVLPEVVPQVIATLRSRYGRPE FT HLISALIGKMRRMPAPCREKPDTVVAFGEAVRSMVDHMQAAGLRAHLTNPL FT LLQEIVERLPTSEQYSWARHIRGVTEPDLIVFGEFMTEWMDDAETLTRLDS FT PSLKAVDRKKPNTKGYVHAHVENQGATTSGTRAIQVGQSCFVCNKRGHLVS FT KCFAFGAMAVKDRWRKARALSLCFSCLERHNWRTCQNRAVCSIGGCTRRHH FT ALLHGAEESREIGSEQNNIESREGGIDGGVIAESNHHQYVSSSSKALFRIV FT PVTVYGPAATVTTFAFLDEGSSMTLVDDDLAEQLGVEGKVEPLCIRWTGNT FT TRVEAGSRRVNLKVGPVGSTKRFAIHSVRTVPGLNLPRQSFVQDEGRWQHL FT ERLPIRQYRDAEPKLLIGLDNLRLAVPLRTKEGGVGDPIAVKTRLGWCIYG FT KPANLECERLLHICECNDQGNIHETIREYFDMQLIGAAHGVEQDPDERRAK FT QILDTTTARIGKRFESGLLWRKDDIELPPSIDMARRRFNCLERRMERDGHL FT KEQVHRQIRDLLSKQYVHKATLRELEEADQRRVWYLPIGVVTNPNKPGKVR FT LIWDAAAKAHGTSLNDMLLKGPDELSSLLGVLFRFRLYAVAACADVKEMFL FT QIMIRKEDKHAQRFLWRYEPTDELETYIVDVVMFGSACSPATAQYVKNRNA FT REHMEQFPRAVEGIIESTYVDDFLDSFETEEEACQVSHDVREIFRNGGFEL FT RNWTSNSMELMRCLGEANGDIKCLSSMGDEAERVLGMRWNPASDELGFCTR FT ACTTVSDLLIAERIPTKREVLRCVMSLYDPLGLLAMFVIHGKILIQDLWRT FT GTQWDEEINDMQLRHWRRWIDLLPAIADLRIPRSYFAAASKKMYENGEWHL FT FVDASQHAYACVLYLRIFDDAGEPQCTLIGGKAKVAPLKPLTIPKLELQAC FT VLGARFLRYTQEHHPINVRRRVLWSDSTVALSWIRSDPRNYKPFVAHRVVE FT ILESTSVDEWRWVPTDHNPADEATKWKGKPNFDFGGNWFQGPEFLLHGEDD FT WPSQRHNSDNPSEEIRQVNLHVEDSNTGLLPIRYERFSRLERLQRMIGWIV FT RYVGNLRRKYRGEPILGGALRQEELYEADKILWRQTQLEYYPEEVRILSLD FT DNDGKPGGRTVSKQSHIYHLLPFVDDEGVLRMRGRIGAAADVPYSAKYPVI FT LPRGSRLAELIVERYHRLYRHANNETVTNELRQQFQIPKLRALVTKTVKNC FT VFCKIRRSLPQVPPMAPLPKERLTPFVRPFSYVGLDYFGPVLVKRGRSNEK FT RWIALFTCLTVRAIHLEVVHSLSTESCVLAVRRFVARRGAPVEIFSDNGTN FT FLGASRQLRREIEERNETLAAIFTNAHTRWTFNPPGAPHMGGVWERMVRSV FT 
KAAISTVMEAKHAPDDETFETVILDAEAMINSRPLTYVPLDPENQEAITPN FT HFLLGSSSGVKQQPVLPTNYRDSLKGNWKLAQHMLDGIWRRWIKEYLPVIS FT RQSKWFENVREIRKGDLVLVVDGTIRNQWKKGIVERIMAGPDGHIRQAWVR FT TNTGAVRRPVAKLALLDIAT" XX SQ Sequence 5596 BP; 1465 A; 1184 C; 1717 G; 1230 T; 0 other; ttaaaaagcc tttaaaacgg ttccaaaggt gaaatgggcc caaaaaggaa cgatcactgc 60 caggcttgcg gtgatcagcg ggatgatccg gattttgttg catgtgacaa gtgcaattta 120 tggtggcatt tttcgtgcgc cggattgacg gaacccgcgg aagctgtgga acagcggaaa 180 tggttgtgcg tggcttgcca ggctaaggaa cggtccggct tgattcagag gacgccggtc 240 aagcacacag aacaaatgga agcggcaaat gctaacctca atctggagga ggttacgcaa 300 aacgtggctg agaaacataa cctcaaactg gttgaggtta tccaacaagc ggcagaaaca 360 cccatgcctt cttcaagtgg gacaagctac aacgcggtga gcaaacctaa cctcactcct 420 gctaaggtat ccgacagtgt ggctaatcgg ctagccatta tgaagcggcg gcaggaggcg 480 gagaggaaac ggatggagct cgagttgcag ctgaagtttg tgaaggagga ggaggacttg 540 ttgtccgagg agttgggtgt gatggcgatt tcgggagcat cgactgtgac accgtcccgg 600 ccggatggag aatcgatgcg aggctcgcag caggatgagc gtggtcccgc cccgcgactg 660 gaaagttttc atcgaaacgt gccaaccgaa ttacccgagt tttcagggga cccggcggaa 720 tggcccgtat ttattgcgca ctacgattac acaacggaga aatgtggttt ttccaactgg 780 gaaaacatga tacggctgca gaaggcgctc aaaggacctg cgctagaagt tgtgcgaagc 840 cgtttagtgc taccggaggt ggtgccacaa gtgattgcga cgctgcgatc gcgctatggt 900 cggccggaac atctcatttc agcgctgatt gggaaaatgc gtcggatgcc tgcaccttgc 960 agggagaagc ccgatactgt tgtggcgttc ggcgaggcag tgcggagcat ggtagatcac 1020 atgcaggctg ctggtctacg ggcacacttg accaacccgt tgctgctgca agagatcgtg 1080 gaaaggttgc cgacgagcga gcagtacagc tgggcacggc acatacgagg cgtgacggaa 1140 ccggatctta tcgtgtttgg cgaattcatg acggaatgga tggacgatgc tgaaacgtta 1200 accaggctgg attcaccctc attgaaagca gtagacagga agaaacctaa caccaagggt 1260 tacgtgcatg cgcacgtgga aaaccaggga gcgaccacat cgggaacaag agccatccag 1320 gtaggacaat cttgttttgt gtgtaacaaa cggggacacc ttgtgagcaa atgcttcgcg 1380 tttggagcaa tggcggtgaa ggaccgctgg cggaaggccc gtgcactttc tttatgtttt 1440 agctgtctgg agcggcacaa ctggcggacg tgccaaaaca gagcggtttg 
cagcattggt 1500 gggtgcacgc gccgacatca tgcgttgcta cacggcgctg aggagtctcg cgagatcggc 1560 agcgagcaga ataatataga aagccgtgaa ggaggaatcg atggtggtgt cattgcggag 1620 agtaaccatc accaatacgt gtcatcatca tcgaaggcgc tatttcgaat agtgcctgtc 1680 accgtgtatg gaccggctgc tacggtgacg acgtttgcgt ttttggatga aggctcgtcc 1740 atgacgctgg tggatgacga tttggctgaa caattaggtg ttgagggcaa ggtggagcca 1800 ctttgcatcc gttggacagg caacactaca agggtcgagg ctggatccag acgagtaaac 1860 ctgaaggtgg gacctgttgg ttctacaaag cggtttgcca tccactcggt acggactgtg 1920 ccagggctaa acctacctcg acaatccttc gtgcaggacg aagggagatg gcaacatttg 1980 gagcggctac cgattcggca atatcgggat gcggaaccca agctgcttat tgggttggac 2040 aatttgcggt tggcggtccc tctcaggact aaggagggag gcgttggtga tccaatcgca 2100 gtaaagacac gccttgggtg gtgtatttac ggaaaaccgg cgaatctgga gtgtgaacgg 2160 ttgttgcata tttgcgagtg caacgaccaa ggtaacattc acgagacgat tcgggaatat 2220 ttcgacatgc agttgattgg tgctgcacac ggcgtcgaac aggatccaga tgagcggcgt 2280 gcgaagcaga ttctggacac taccacggca cgaatcggga agcgattcga atcgggatta 2340 ctgtggagga aagatgacat cgagctacct cctagcatcg acatggcgcg tcgtaggttc 2400 aattgtttgg agaggaggat ggagcgagat ggacatctta aggaacaagt acatcgccag 2460 attcgagatt tgttgagcaa gcagtacgtt cacaaggcta cgttgcgtga gctggaggag 2520 gctgaccagc gacgcgtatg gtacctacca ataggggtgg ttaccaaccc aaataaacct 2580 ggaaaggttc ggctgatttg ggacgcggca gctaaggcgc atggaacatc gttgaacgac 2640 atgctgctga agggaccaga cgagctgagc tcgttacttg gcgtactctt ccgatttcgt 2700 ctgtacgcag tggcggcgtg tgcagacgtg aaggagatgt ttttgcagat tatgatacga 2760 aaggaagaca aacacgcgca gcgtttcttg tggcgttacg aaccaacgga cgagttggaa 2820 acctacatcg tggatgttgt aatgttcggg tctgcgtgtt cacccgcaac ggcgcagtat 2880 gtcaagaacc gaaacgctcg agagcacatg gaacaattcc cacgagcggt ggagggaatt 2940 attgaaagca cttacgtaga tgactttctg gacagcttcg agacggaaga agaagcatgt 3000 caggtatccc atgacgtaag ggagattttt aggaacggcg gatttgagtt gaggaattgg 3060 acctcaaaca gtatggaatt gatgagatgc cttggcgaag caaatggcga catcaaatgt 3120 ttatcatcca tgggagatga ggcggaacga gtattgggaa tgcgatggaa ccctgcatct 
3180 gacgagcttg gattttgcac tagggcgtgc acgacggtgt ctgacctctt gatagcagag 3240 aggattccaa cgaaaagaga ggtattgcga tgtgtgatgt ctctttatga tcctcttggg 3300 ctgcttgcga tgttcgtgat ccacggcaag atcctgatac aagatctttg gcgaactggc 3360 acgcagtggg acgaagagat taacgacatg cagttgagac attggcgtag atggattgat 3420 ctgcttccgg caatagcaga cctacgcatt ccgcgcagtt actttgcggc agcatcgaag 3480 aagatgtacg agaatggcga atggcatttg tttgtcgatg caagtcagca cgcttatgca 3540 tgcgtcttat atctgaggat atttgatgat gctggagaac ctcagtgtac actcatcggt 3600 gggaaagcta aggtggcgcc actgaagcca cttactattc cgaagcttga gcttcaggct 3660 tgcgtgttgg gagcgagatt tttacgctac acgcaggagc atcatccgat taatgtgaga 3720 cgacgagtgc tttggtcgga cagcacagtt gcgttgtcgt ggataaggtc ggatccaaga 3780 aactacaaac ccttcgtggc tcatagggtg gtcgaaatac tggagagcac atcggttgac 3840 gagtggagat gggtgcctac tgaccacaac ccggcagacg aggccacgaa gtggaaggga 3900 aagccgaatt tcgactttgg tggcaactgg tttcaagggc cagagttctt actccatggg 3960 gaggatgatt ggccatcaca gaggcacaac agcgacaacc cgtcagaaga aatacgacag 4020 gtaaatcttc acgtggaaga ctctaacacc ggactattac ctatccggta tgaacgtttt 4080 agccgattgg agaggctaca aaggatgatt ggttggattg tcaggtatgt gggcaatctg 4140 agacgaaagt atcgtggcga acctatttta ggaggtgctc tacgacaaga agaactctat 4200 gaagcggata agattctgtg gaggcagaca caactcgagt actatccgga agaagttcgc 4260 attctgagtt tggacgataa cgatggaaaa cctggaggaa gaacggtatc aaagcaaagc 4320 cacatttacc atctattgcc gtttgtggac gatgaaggcg tattgcgaat gcgaggaagg 4380 ataggtgcag cggctgatgt tccttattct gctaagtatc ccgttatact gccgaggggt 4440 tcccgactgg ctgagttgat agtagagcgg tatcatcgat tgtatcgtca tgcgaacaac 4500 gaaacagtga cgaatgagtt acggcaacaa tttcagatcc cgaagctgag agcgttggta 4560 acgaagacgg tgaagaactg tgtcttctgt aagatcaggc ggtcactacc acaagtgccg 4620 ccaatggcac cgttaccaaa ggagaggctc acgccctttg ttaggccatt cagctacgtc 4680 gggctggatt actttggacc agtgttggtc aagagaggaa gatcgaacga gaagcgttgg 4740 atcgctttat tcacgtgttt gactgtgcgc gcgattcact tggaggtggt gcacagtcta 4800 tcgacggaat cgtgcgtgtt ggcggttaga cgatttgtgg ctagaagagg tgcaccggta 4860 
gaaatcttca gcgacaacgg gaccaatttc ttgggtgcta gcaggcagct gcggagggag 4920 atcgaggagc gcaatgaaac tctggcggcg attttcacga atgcgcacac ccgttggacc 4980 ttcaacccac ctggcgctcc acacatgggc ggcgtgtggg agcgtatggt acgctctgta 5040 aaagcggcga ttagcacggt gatggaggca aagcacgcac ccgacgacga gacgtttgag 5100 acagtgatct tagacgcaga ggcgatgatc aactctagac cgttgactta tgttcccttg 5160 gacccggaga accaagaggc aatcacaccg aatcatttcc tgttggggag ttcttcaggt 5220 gtaaagcagc agccagtgtt acctacgaac tatagggata gcttgaaggg aaattggaag 5280 ttagcgcagc atatgctcga cggaatatgg aggcggtgga ttaaggaata tttaccggtg 5340 atctcgcggc agagtaagtg gtttgaaaat gtgcgggaaa ttaggaaagg agatttggta 5400 ctggtagtgg acggaacaat caggaaccag tggaagaaag gaatagttga gcgaattatg 5460 gcaggacctg atggtcatat aaggcaagcg tgggtacgca ccaatacagg agcagttagg 5520 aggccagtag ctaagctagc actgctcgac atagcaactt agggtgacca aatatggttg 5580 gtcacgggcg ggggaa 5596 // ID GYPSY66-LTR_AG repbase; DNA; ANG; 197 BP. XX AC . XX DT 09-JUL-2004 (Rel. 9.06, Created) DT 09-JUL-2004 (Rel. 9.06, Last updated, Version 1) XX DE GYPSY66-LTR_AG is an LTR of retrotransposon GYPSY66_AG - a DE consensus sequence. XX KW Gypsy; LTR Retrotransposon; Transposable Element; KW 5-bp TSD GYPSY66_AG; GYPSY66-I_AG; GYPSY66-LTR_AG; Gypsy clade; KW mag lineage. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-197 RA Tubio M.J., Costas C.J. and Naveira F.H.; RT "GYPSY66_AG, a member of the mag lineage of the Ty3/gypsy group RT of LTR retrotransposons in Anopheles gambiae."; RL Repbase Reports 4(6), 172-172 (2004). XX DR [1] (Consensus) XX CC GYPSY66-LTR is a long terminal repeat of GYPSY66_AG (its CC internal portion is deposited as GYPSY66-I_AG). 
XX SQ Sequence 197 BP; 57 A; 46 C; 37 G; 57 T; 0 other; tgttatgtat gtgaacgcag gataagtgac agacgtcaac agctgtcatc caagcagacc 60 acgccgcttg cgtccgcgct ttttggactt cttcttttct ctccgattgc cattgttatc 120 gcgtataaaa cgaattgtgt aaacccgaag tattttttct aataaaccag tgaactcaca 180 acagtaaata cagaaca 197 // ID P4_AG repbase; DNA; ANG; 7669 BP. XX AC . XX DT 12-MAR-2003 (Rel. 8.02, Created) DT 21-JUL-2005 (Rel. 10.08, Last updated, Version 2) XX DE P4_AG, a P-like DNA transposon - a consensus sequence. XX KW hAT; DNA transposon; Transposable Element; HATN2_AG; KW P superfamily; P4_AG; composite transposon. XX NM P4_AG. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-7669 RA Kapitonov V.V. and Jurka J.; RT "P4_AG: a family of P-like DNA transposons from African malaria RT mosquito."; RL Repbase Reports 3(2), 26-26 (2003). XX DR [1] (Consensus) XX CC The A. gambiae genome harbors many divergent families of P-like CC DNA transposons. One of those families is P4_AG. CC P4_AG elements are flanked by 8-bp target site duplications. CC Terminal inverted repeats are 27 bp long (4 mismatches). CC Subterminal inverted repeats are 28 bp long (4 mismatches); their CC positions are 64-91 and 7606-7580. CC The P4_AG consensus sequence was reconstructed from CC 6 copies that are only ~2% divergent from each other. CC Presumably, P4_AG copies have multiplied in the genome CC during the last 2 million years. P4_AG elements carry a copy of the CC hAT-like HATN2_AG transposon that was inserted into an ancestral CC form of P4_AG. CC P4_AG encodes an 877-aa P-like DNA transposase called P4_AGp. CC Putative exons are based on FGENESH.
XX FH Key Location/Qualifiers FT CDS join(312..569,2572..4146,4211..5008) FT /product="P4_AGp" FT /note="P-like DNA transposase" FT /translation="MSTCAASFCQHSRYIVKKMGLDVIFHKFPTDPTLLRK FT WVEFCQREEAWVPSISSILCSAHFNKTDYQLINSPSKANRKILKKLKPSAF FT PSVIKSQAREPSNNVVQQCSTNNVVQQIETDEGRIDTSVEHHDDDLPSDNI FT THAKCQNCVQNETEIELLNQTLKKTQDKCNNLLEVNTFLSKQLEIVSKELT FT QSQKEIELLKTNHNKFKDVAISPNEFTTRMKNVLKDTLTSNQIDLITEERK FT RVRWTKAELSKFFTLRYLGKRAYQYLRDDLNFPAASISTLQRYGRTLNLKQ FT GILDDVINLLKNITVDLPECHRECVLSFDEMKVNRILEYDPASDEVLGPHN FT YLQVVMARGLFKNWKQPVFIGFDQQMTKEILFELIKRLYAIKINVVAIVSD FT NCQSNIGCWKDLGAHDYCHPFFSHPITKCNIYVFPDAPHLLKLIRNWLIDT FT GFEYNNKLIKADKLFELVAYRNAAELTPVHKLTQNHLVMTPQERQNVRRAA FT EVLSRTTAIALQRYFPDDCDAQELASFIEKVDMWFSVANSYSPCAKLHYKK FT SFNANENQLAALSDMFELMSNITALGKKSMQVFQKSLLMHITSLKMLYEDM FT RKKHSIVYISTYKLNQDVLENFFSQLRQIGGVHDHPSPLHCMYRIRMMILG FT KSPTTLKNHTELKNDDVENSHEHHEEFLSATVFSVADIPQSVPDISVMEKT FT NQICQAIEECSQESDLISTVSSTCNVQSAQESDGLQYVMGYIANKYNTKYP FT ELDLGVQTFKLTTDHCYSQPPTFVQHLSAGGLFEPSPTFLLLGNRMEKIFL FT KMHPDGTFSKTKKIVAKIAKNIQNQISELPVEIIRTFAKQRMIVRMRFLNL FT KSSTENLMKSKRKHVNQHGKGAKK" XX SQ Sequence 7669 BP; 2520 A; 1306 C; 1374 G; 2469 T; 0 other; caaggttagt tgaggataga ggttgaagca tccagtgtaa attgacaagc agctgtggga 60 aaattttgac agttcggtaa gaggttgttt atgttctcag acctacggca tgtggatgtg 120 gactcgtgta gaaacaaaca ttgtaaatct gtgtcgcgca atagagttgt gaaaaatcac 180 taaatttgta caaagcatat cgtttatatt gatctaagta catcacacgg ctttctaaaa 240 aggtaaattt ttgattatta tgtgttttac tgcaacaagc ctaatttatg tttgctattt 300 cagtgtgcat catgtcaacc tgcgctgcat ccttttgcca acacagccgc tacatagtga 360 aaaaaatggg tctggatgta atttttcata aatttccaac agacccaacg ctgttacgca 420 aatgggtaga attttgtcaa cgagaggaag cttgggtgcc atcgataagc agcattttat 480 gttcagcaca tttcaacaaa actgactacc agctaatcaa ttcgccttcg aaggctaata 540 gaaaaatttt aaaaaagctt aagccgtctg gtaagttttc atgtaatgtc ccacactaat 600 ctatactaat ataagcatac atttgataaa gttactaaat ttgaagggtt catacatgta 660 gctgctgttt cttcgacgtt tcgcccaacg tcgatgtttt actccgaatc gagttcatgt 720 
tttccttcgt agaccataag gttactccga cttttaagta ttgaacgagt tattaggaac 780 gcaatgataa gcaaccagaa agtaaataat caagttcatt attaaaacca ggactaatta 840 tcgtgaacga ttcgcaaaga ctaatatgat ctgcatacat catctacatc agtggtctcc 900 aacctgtggt ccgttgcgtt aacagacgtg gtccacgaaa gaattatttt tgaaaaagaa 960 aagcttaata ctatggatta tagtacacaa tcattataaa attttcaaca agcatatcta 1020 aaatgagatt taaacgtatt tttgatcaaa aacattaagt aaagttaaaa aaaagtatgc 1080 actcattaga tgtgagccgc ggcaaatgta ttttcaagcc aagtggtccg cgatcctaaa 1140 aaggttggag agcactgatc tacatcattg cacacatctt gctttctcaa acttcctcct 1200 caaattgtgc cattaaccct taccgtttta aattttgttt gcttcgttgt tcccagttgc 1260 ttcattagaa tgtgcgcaac atatattctt acaggtcatc caccacatcc gaagttttgt 1320 tttactttat gtgattttag tactattttc agttccaaca tccgttcaac gcaggtttta 1380 caggaaagag cttcttttga aaatcctgcg attgatttgt tcgtaaaatc ctaccgtgta 1440 tgcttccgcc gtgaatcgct ccggtgcgat acttttcgat gttcatatag atctgcacat 1500 acattcattt atttttggta atcgatggtt tgtccgccag ttcaccatcg aatgaattgt 1560 cttatgcagt aatacaacat acatgacggt atcgttgctg ttcgattcca ttgtagcaga 1620 aggcttagca gaaggtccat ttgaaacgtt gttttcgttt ttgttcgatt tttctgcaag 1680 agaaaatgtg gaagttgatt caagtggcag tcccactact cctcgcgaag ccgttggctt 1740 tgggtttgat tcaattttct tgtcttcttt tggtacttga ttcagttttc ctgtgtgtat 1800 cagattattt attgactcat tatgaagcac actttgttac gcaccattgt gtagtatatt 1860 tgcaaataaa aaaatcatat ttaaactaca taaattcttg cctttcctaa tgccatcgaa 1920 gtaactttga ttttggatac tgtcgccatg tagtagctgt gaactcgtgc tcgccagagt 1980 ggccttcaaa catagttaaa gaaggtaatc gttgcgaaag aaggttacaa gttacaagtt 2040 ataaacagtc gctttttgcg tgattagttg ctatgcattc tcgattcacc tacttttcta 2100 taagtgtgta tcaattgctt ttgtatgaat ggtgtggcct gctttattcc ttaactggta 2160 ctatacaaat tattatacaa ataataataa taataataat aataataata ataataataa 2220 taataataag ttattataag tcaaatataa ggtaaaaatt ataataattt aagactcttt 2280 tcttcaatga tgcactaaac aaaagcttag attgaaatat tactattcta agctcgaaag 2340 tgataaataa cggatgggaa ttaacgtaga taagtgaaac ggtaatattt tgctgaatca 2400 tctatttgtc 
tgcaaagact attatacata actttaagcg cttaattgta acaataacag 2460 ttattcacat ataataaaat gtaacaaata tttcttaaat aaaaaataca taaatatgag 2520 gcaaatctat ttcaattcgt taattatttc attcttacct ttttctattt agctttcccg 2580 tcagtgataa aatcacaagc tcgtgagcct tctaacaatg tagtacaaca atgtagtact 2640 aacaatgtag tacaacaaat tgaaacggac gagggtcgca tcgatacatc tgtagaacat 2700 catgatgacg atctaccatc agataatatc acacatgcga aatgccaaaa ttgtgtacaa 2760 aatgaaacag aaattgaact tttaaatcaa actcttaaaa aaacacaaga taaatgtaac 2820 aacttattag aggttaatac atttttatca aaacagcttg aaatcgtcag caaagaactt 2880 acacaatccc aaaaagaaat tgaactttta aagactaatc ataataaatt taaagatgtt 2940 gctatatcac caaacgaatt cacaaccaga atgaaaaatg ttttaaaaga tacgcttaca 3000 tcaaatcaaa tagatctgat tactgaggaa cgtaaaagag ttagatggac taaagcagaa 3060 ttaagtaaat tttttacact tcgttattta ggaaaaagag catatcagta tttaagagat 3120 gatttaaatt tccctgcagc atcaatttca acactacaac gatacggaag aacattgaac 3180 ctcaagcaag gaattttaga tgatgtaatt aatttgctaa aaaacattac cgttgatctg 3240 ccggagtgtc atcgggaatg tgttttgtca ttcgatgaaa tgaaagtgaa tagaatttta 3300 gagtatgatc cggcctctga tgaagttctc ggtcctcaca attatttaca agtagtaatg 3360 gcaagaggat tgttcaaaaa ttggaagcag ccagttttta ttggttttga ccaacaaatg 3420 accaaagaaa ttttatttga attaatcaaa cgtctttatg ctataaagat aaacgttgtt 3480 gcaatagtta gtgacaattg ccaatctaat attggatgct ggaaagattt aggtgctcat 3540 gactactgtc atcctttttt cagtcatcca ataacgaagt gcaatatata tgtttttcct 3600 gatgctcctc atctgttaaa actaataaga aattggttga tagatactgg ttttgaatat 3660 aacaataagt tgataaaggc agataaattg tttgagctgg tagcttatag aaatgcagct 3720 gaattaactc ccgttcataa actaacacaa aatcatttgg ttatgactcc tcaagaacgt 3780 caaaatgttc gaagagcggc cgaagtttta tccagaacca ctgctattgc attacaaagg 3840 tactttcctg atgattgtga tgcacaagag ttagcctcat ttatagagaa agtcgatatg 3900 tggtttagtg tagctaactc atattctcca tgtgctaaac ttcattacaa aaaatctttt 3960 aatgcaaatg aaaaccaatt agcagcatta agtgacatgt ttgaactgat gtcaaacatt 4020 acagcattgg ggaaaaaatc tatgcaagtt tttcaaaagt ctcttttgat gcacataaca 4080 tcgctaaaaa tgttgtacga 
agatatgaga aaaaaacaca gcatcgtata tatctctact 4140 tataaggtta gtattgaaca atagtgaaag atttcagtca actaatgtta tgttttattt 4200 ttgtttacag cttaatcagg atgttttgga gaacttcttc tctcaacttc gccaaatcgg 4260 aggcgtacac gaccatccat ctccattaca ttgcatgtat agaatccgaa tgatgattct 4320 tggtaagtca ccgactacat tgaaaaatca tacagagcta aaaaatgatg atgtagagaa 4380 tagtcatgag catcacgagg agtttctatc tgcaacagtt ttttctgtag cggatatccc 4440 tcaatctgtt ccagatattt cggtcatgga aaaaacaaat caaatctgcc aagcaattga 4500 agagtgtagt caagagtcgg acttaataag tacagtcagt agtacctgca atgtgcaatc 4560 agctcaagaa agtgatggac tgcaatatgt aatgggatac atagctaata aatataatac 4620 taagtatcca gaattagatt taggcgtcca aacttttaaa ttaacaactg accattgtta 4680 tagtcaacct cctacctttg tgcaacattt gtccgcagga ggattgtttg aaccatcgcc 4740 tacattttta ttattaggta atcgaatgga aaaaattttc ttaaaaatgc atccggatgg 4800 tacctttagt aagactaaaa aaattgttgc caaaattgcg aaaaatattc aaaaccagat 4860 aagcgaacta ccagtcgaaa taatacggac ctttgccaaa caaagaatga ttgttagaat 4920 gcgcttcctt aatttaaaaa gtagtaccga aaatctcatg aaaagcaagc gaaaacatgt 4980 aaatcaacat ggaaaaggcg caaaaaaatg agaaaaatat tgaactagtc tatacattat 5040 ctattctcat gtaaatcaat taatgagaca tgtaattaat ttgttttatt tttgttgtgt 5100 taatgccgat agtatttatt taattattta ttggactatt tactaagata tttgctagtt 5160 agtttatgta aattatcttt ttatctatta ttttatttat ttattcattt atttatttat 5220 ttatttattt atttatttat ttgtttgtta gtataatcaa attgttggta gtcgtataca 5280 tagattttcc ttgtgtttta tccattaaac agacttcatt tttttattct gaaggctacc 5340 atatgtgatg cttatttatt aatctgtgtt atgtagtaat ctgcatccat ttataatatt 5400 ttctttggtt atttgtaatt ggctaccggt ttttgattca atgtttcagt cctaatactt 5460 gtaaatgttt gaatgagtga tttatcgtgc attttttggg acagtctggt ggtacagtca 5520 agaacatacc cgtcatgtgt ttaagcccgt atctatcgta ctccccaaaa gattagaaga 5580 tgtacaagaa gattcataac gccgaataga tctagtttgc tttcatttca tcgtctgcaa 5640 tgccccagaa ttaattacat cacactacaa tggtcttcat ggggttctca attaggtata 5700 tttaaataga tcatgatgaa tcgttgtagt aagaaaatac tgatatgctc ttttatattt 5760 aataagaacg gacctgccgt attccacgta 
ttattttttt ttattctgta ttattgtaat 5820 gaactattga ttttttgctt taataacccg taatattttg aaatgtatta tagcaaatat 5880 gtaaaatagt ccacattacc tcattttcat gtataaataa ccattttgta acaacaatat 5940 gatctataat cgtaaatcac ttgttaaggt ttgtcagtat attttaggat ttcacactga 6000 tgtaaaacac tgaattaaaa catgcaaaaa cgctagcacg taaggtattc cgctgtatta 6060 gaaattattt agaataatac ggtaataaaa catgagaact ttttaaattg ttatttaatg 6120 catgaaagtg cattgtgtta ccaatacact gaaatgtaca aataaagcaa aaacaatcaa 6180 gaaactttac ccgttcacca actgtatttt ctcactcact catccgaaaa attaccattt 6240 tgttaaagag taagtaagta agtaatggaa gatccaaaat taaggtgtta cgatcagtgg 6300 ctcggattaa gggtgtcggg ggccctaggc ggtaagacta gttgaggccc cctgtcaatt 6360 gtaaatggtg ttctaggggg tcagtcgata atcagtcaca ggctctaaaa tttgctggca 6420 gtggaggggg gggggggggg ggagggtgaa aatgatttgc cagccttggg gccacaacgt 6480 ccatccttcc gggggccatt gccgctatta ctctgtccat acggcatact atagacttca 6540 gaactacggg tcccctaaat cggcggggcc ccaggcgacc gcctagtccg tctaccggta 6600 gatccgccac tggttacgat gagtatctca tagaaatgca ttattttgca tacttcttgt 6660 ttcctattct caagatgagc taaaacgcac cagtttttgc caatatatct aagcagtttg 6720 ttccttcctg ggcaaatctt cctaacgtgg ccacgtatgg aatcataaat tagaaaataa 6780 ttaaaaccgg tataaactta cagctacgcg cagttgttaa ccattatacg ctgtcagtaa 6840 aattttcgag gttaatgata agcatatggc acaattcacc gcttgactgt aagtggttcc 6900 tgacactctt gcggaaggta aaccattatc ctgattacaa actgcttcca atttttaggt 6960 ataactaatg ccatacaaat caacaatgct aagctttctg aggtgtacgg tgaatcggat 7020 atcctaacgg tagtcagggc ctatgatacg atggctgggg catgtaatga aaatgccgta 7080 gtaatgccct ccaagaaagt ggtcggtcct gtgtttggta ctaggtggat ccggaatcag 7140 gctaagcagg gtctgtcggt gatcggatgc ctgcattgat ggacgctgaa ggcatcgagc 7200 ctcctgataa aataatagta gtcccggcca ttcaacggcg ctgggctcaa tcgtaagcag 7260 gccaatgaga aagaaaaagt gagataagag caggatttga ggattggagc ttttggaagt 7320 tattggagtt agtttccgtg acctgtgatt acatggagct agatatgtac acgaagctgc 7380 atatgaagtt atcagcacga gtaacaagga actggaaaga ttgcgataaa taaaagttta 7440 gtttgtaaca cctaaagcaa taaaccacat agtcagtttt 
acatatacgc acgatttgta 7500 ttatttcaca tgttttttct gtatttgaac taaaaaatgc gagctaaaag aacagcaccc 7560 atctgctgca attatttgta aacactttct tactgaactg tcaaaacagg attcaaaatg 7620 tcagaccaga aacaagctat aattcgacct ctatcctcaa tagaccttg 7669 // ID GYPSY58-LTR_AG repbase; DNA; ANG; 289 BP. XX AC . XX DT 09-JUL-2004 (Rel. 9.06, Created) DT 09-JUL-2004 (Rel. 9.06, Last updated, Version 1) XX DE GYPSY58-LTR_AG is an LTR of retrotransposon GYPSY58_AG - a DE consensus sequence. XX KW Gypsy; LTR Retrotransposon; Transposable Element; KW 5-bp TSD GYPSY58_AG; GYPSY58-I_AG; GYPSY58-LTR_AG; Gypsy clade; KW mag lineage. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-289 RA Tubio M.J., Costas C.J. and Naveira F.H.; RT "GYPSY58_AG, a member of the mag lineage of the Ty3/gypsy group RT of LTR retrotransposons in Anopheles gambiae."; RL Repbase Reports 4(6), 156-156 (2004). XX DR [1] (Consensus) XX CC GYPSY58-LTR is a long terminal repeat of GYPSY58_AG (its CC internal portion is deposited as GYPSY58-I_AG). XX SQ Sequence 289 BP; 83 A; 69 C; 74 G; 63 T; 0 other; tgttgtgagc taacctggcc cggactcatt cacggcggta actcacggat gacagcaaag 60 cgtcatccgt caatgtatgc gtgtgatcga tccgggggtc atcggttgcg cggtcacgga 120 cgttgaccgc caaacaccag cgcgcactga agaactagcg gtcagcaagg aagcgtgctc 180 gaagccaaca ccaaattgta accaagtgaa gttgaatata tacgttagta aatttattag 240 taacgcgtga gttttattcg accacctctg cgatcgaaag aacataaaa 289 // ID GYPSY24-I_AG repbase; DNA; ANG; 4459 BP. XX AC . XX DT 05-FEB-2004 (Rel. 9.01, Created) DT 05-FEB-2004 (Rel. 9.01, Last updated, Version 1) XX DE GYPSY24-I_AG is an internal portion of retrotransposon GYPSY24_AG DE - a consensus sequence. XX KW Gypsy; LTR Retrotransposon; Transposable Element; 5-bp TSD; KW AP protease; GYPSY24-I_AG; GYPSY24-LTR_AG; GYPSY24_AG; KW Gypsy clade; RNase-H; gag; integrase; mag lineage; KW reverse transcriptase. 
XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-4459 RA Tubio M.J., Costas C.J. and Naveira F.H.; RT "GYPSY24_AG, a member of the mag lineage of the Ty3/gypsy group RT of LTR retrotransposons in Anopheles gambiae."; RL Repbase Reports 4(1), 13-13 (2004). XX DR [1] (Consensus) XX CC GYPSY24_AG is a family of Gypsy-like LTR retrotransposons CC that, according to the amino acid sequence of its reverse CC transcriptase, is CC phylogenetically grouped with representatives of the mag CC lineage of other organisms. CC GYPSY18_AG, GYPSY19_AG, GYPSY20_AG, GYPSY21_AG, GYPSY22_AG, CC GYPSY23_AG, CC GYPSY25_AG, GYPSY26_AG, GYPSY27_AG and GYPSY28_AG are other CC members of this same lineage in Anopheles gambiae. CC The GYPSY24-I_AG consensus was reconstructed after multiple CC alignment of 7 copies. CC The consensus encodes the 1412-aa GYPSY24_AGp gag-pol like CC protein CC (pos. 199-4434). CC The sequence of the LTRs flanking GYPSY24-I is deposited as CC GYPSY24-LTR_AG. 
XX FH Key Location/Qualifiers FT CDS 199..4434 FT /product="GYPSY24_AGp" FT /translation="MANDNRELLEALSGMLVQALKASIGPAVEQVSAELRN FT TGENIPAPLPKAPSFAMPEYRANEGTSVADYFNRFEWALQLSKIPEIQYAD FT YARVHMGAELNTSLKFLVAPKKPQEVPYSEMRKILVAHWDQKKNKFVESIK FT FRTIVQQRDESIAQYVLRLKQGSANCEYDNFLDRMLIEQMLHGLTERDICD FT EIVAKNPSTFQDALDVALALEATRNIARDINTSQPASEATNKLGYEKPNVK FT KPYTRRNTTNKQHANSPENTAFNNTHGNQPVACNGCGGPHLRSECRFRSAK FT CNNCHKKGHIAKVCKSGKSNHHISQQDISSPSGSIDQVQRLNRIHNIPSSE FT KKMIDVKIDGKSLKMELDTGAPCAIVSEATLKSIKPHFTLQTSDRQFSSYT FT GHRISCIGRMNVNVTIGATTRKEQLYVVSGAHDSLLGREWISHFADQIDLN FT RMFSSRTSIHTVSNSTLSPNCETQLTRLLDSYADVFSESPGKLTGPPAKVH FT LKENATPVFARARDVPLALRERYAKEIDSKINSGFYEKVEYSEWASPTHVV FT VKKNGKLRITGNYKPTVNPLMIIDEHPIPRIESIFNRMKGATLFCHLDVTD FT AYTHLPIDEQFRHVLTLNTTTHGLIRPTRAVYGAANIPAIWQRRMEEVLLG FT LTNVVSFYDDIIVFAKDFEELLQALTSILSRIKESGLKLNRSKCVFATPSL FT ECLGHRIDREGLHKSTKHIEAIRDAPRPSSPEQLQLFLGKATYYSAFIPDL FT STRAKVLREILSADRFEWTAEAEEAYRDIKNILISPQVLTQYDPTLPLILA FT TDASKTGLGAVLSHRLSNGVERPIAYASCTMSATEQRYPVIDKEALAIVWA FT VKKFFNYLYARKFTLVTDHKPLTQILHPEKSLPTLCISRMANYADYLAHFN FT FDVVYRSTNENKNADYCSRIPSPSTQSSVNSLSLRRGGNEDQDDFEDFVLN FT QIQQLPIKADQIARETRKDEHLGKILKDLEMGRNLSQIGYKAPEAKYTMVA FT NCLLFEHRVVIPDIFRPAILQDLHAAHIGVVRMKSLARSYVYWPGIDKDIE FT QLAKSCHECAQTVSAPPKFNQHHWEYPSNPWERVHVDYAGPVAGAMLLIIV FT DAYSKWVEVKVTHSTTTEATIKILDELFASYGAPLTVVTDNGTQFTAAEFT FT TFLQRSGVKFHKRSAPYHPATNGQAERYVQTVKRALKAMHSSSTTLQANLN FT EFLLQYRKVPHSETGEAPAKLFLGRNIRSRLDLVRPQSVQTRTAEKQRVAF FT EPSYRTFLPGQLVYCLSGSTRMDKWIRGTVVSRLGDLHYSINCNGNQMKRH FT VDQMRPTLDDNRTEQPRSVPVPTQTPEVHHHRRHYYGSTDSPQTSSVPVSS FT RTVSVSSDSSTSSDSSYDTPTGSPIRASDAPPFVRRSTRLRNPPLRYSP" XX SQ Sequence 4459 BP; 1355 A; 1124 C; 963 G; 1017 T; 0 other; ttttggtgtc agaagtggga tagtccagga tacgtgtcgg atacatcgga gccagcggaa 60 aagcgatttg tcgcccgcta aacaaacgat acatcacaca cacatcccaa gcagaaaagt 120 gaggttagca tcaccacgtg cgaacaacat tacgacaagg aggtttcttt cgctgtacgc 180 tgaatacaac agaaaacgat ggcaaatgat aaccgtgagt tgctggaggc cctttccgga 240 atgcttgtgc aggcacttaa ggcatccatc ggaccagccg 
ttgaacaagt aagtgcagaa 300 ctacgcaaca caggtgaaaa tatcccagcg ccgcttccga aagctccgtc atttgctatg 360 cccgaatacc gtgccaacga gggaacatcg gtcgctgatt attttaaccg ctttgagtgg 420 gcgcttcagc taagtaaaat cccggaaata cagtacgcgg attatgctcg tgtgcatatg 480 ggagccgagc taaacacgtc gctaaaattt ttagtcgcac caaaaaaacc acaagaagtg 540 ccatattcgg aaatgcggaa aattttagta gctcattggg accagaaaaa gaataaattc 600 gtagaaagta ttaaatttcg aaccatcgtg caacaacgag acgaatcgat tgcacagtac 660 gttctccggt taaagcaagg ttcagcaaat tgcgaatacg acaatttttt agaccgaatg 720 ctcattgagc aaatgttaca tggattgaca gagcgcgaca tctgtgacga gatagttgca 780 aagaatccat ccacatttca agacgctctc gatgtagccc tcgcgttaga agcaactcgc 840 aatattgctc gagacattaa cacgtcgcaa ccagcttctg aagctactaa caagctaggc 900 tacgaaaagc caaatgtaaa aaaaccatac acgcgtcgaa acacgacaaa caagcagcat 960 gcaaactcgc cagagaacac cgcattcaac aatacacacg gtaaccagcc agtagcttgt 1020 aatggctgtg gaggtccaca cctcagaagc gagtgtcgtt tccgtagcgc caaatgtaac 1080 aattgccata agaaaggtca tattgctaag gtctgcaaat cgggtaagtc caaccatcac 1140 atttcacaac aagatatctc ttcgccctcc ggtagtattg atcaagtgca acggcttaac 1200 cgtattcata acataccgtc gagtgagaaa aaaatgatcg atgttaagat cgatggtaaa 1260 tcgctgaaga tggagcttga taccggtgca ccttgcgcaa tcgtatcaga agcaaccctc 1320 aaatcaatta aaccacattt caccttgcag acaagcgaca gacaattttc tagttatact 1380 gggcatcgca tcagctgtat tggtaggatg aatgtcaatg taactattgg agccacaacg 1440 cgaaaggagc aactctatgt agtgtccgga gcacacgatt cactcctggg acgcgaatgg 1500 atctctcact ttgcagatca gatcgattta aatcgcatgt tctcctcgcg tacatccatc 1560 catacagtgt caaatagcac attatctcca aattgcgaaa cgcagctaac aagattatta 1620 gacagctatg ctgatgtttt cagtgagtct ccgggtaaac tgacaggtcc cccggcaaaa 1680 gtacacttga aagaaaatgc aacaccagtg tttgctagag cacgcgacgt tcccctcgcg 1740 ctgcgagaaa ggtatgccaa agaaatcgac agtaaaataa attccggttt ttacgaaaag 1800 gtcgaatatt cggagtgggc atctcctaca cacgtggtcg ttaagaaaaa cggtaagctg 1860 aggataacag gtaattacaa acctactgta aaccctttaa tgataataga cgaacatcct 1920 attcctcgaa ttgagagtat tttcaaccga atgaaaggtg ctactctatt ttgccatttg 
1980 gatgttaccg acgcatatac gcatcttccc atagacgaac agtttcgtca tgtcttaacc 2040 cttaacacca caactcatgg gctcatacga ccaaccagag cagtatacgg tgccgccaac 2100 atacccgcaa tctggcaacg tcgaatggaa gaagttctct taggccttac aaatgtcgtt 2160 agcttctatg acgacattat cgttttcgcg aaagattttg aagagctttt acaagcctta 2220 acaagtatcc taagcagaat caaggaaagt ggtctgaaac ttaaccgatc taaatgtgtc 2280 tttgccacac catcactcga gtgcttaggt caccgaattg atcgcgaagg tcttcacaag 2340 tcgacgaaac acattgaagc gatccgagac gcaccaagac cgtcttctcc cgaacaatta 2400 cagctatttt tgggtaaagc cacatactat tcagcgttca taccagattt gtcaacaaga 2460 gcaaaggtat tgcgtgagat attatcagca gatcgttttg agtggacggc tgaagccgaa 2520 gaagcctacc gcgatatcaa aaacatttta atttcaccac aagtccttac tcagtatgac 2580 ccaacactac cattgatatt agctactgac gcaagcaaga cgggtctcgg agcagtgctc 2640 tcccatcgac tcagtaacgg ggtagaaaga cccatagctt atgcaagctg tacaatgtcg 2700 gcgacggaac aacgctatcc ggttatcgac aaagaagctc tcgctatcgt ttgggcagtc 2760 aagaagtttt tcaactattt atatgcacgg aagttcacgc tcgtcacgga ccacaaaccg 2820 ttgacgcaaa tcctgcatcc agagaagtca ctgcctacac tttgtataag tcgcatggca 2880 aactacgctg actacttagc gcactttaat ttcgatgtag tgtaccgatc gactaacgaa 2940 aataagaatg ccgattattg ttcacgcatt ccaagtccct cgacacaatc cagtgtcaac 3000 agcctttctc ttcgtagagg aggaaatgag gatcaagacg attttgaaga ttttgtgctt 3060 aaccaaatcc agcagctgcc cattaaagcc gatcaaatcg cacgcgaaac gcgaaaagat 3120 gagcacttgg gtaaaatttt gaaagacctc gaaatgggac gaaacctatc acaaatcggc 3180 tataaagcac cagaagccaa atacaccatg gttgccaatt gtttgctgtt tgaacaccgt 3240 gtcgtgattc ccgacatctt tcgtcctgca attctgcaag atttgcacgc agcacatatt 3300 ggtgtggtga gaatgaagtc tttggcccgt tcatatgtct actggccggg catagacaaa 3360 gacatcgagc agctagccaa atcatgccac gaatgcgctc aaacggtctc agcacctcct 3420 aagttcaatc aacaccattg ggagtatcca tctaaccctt gggagcgtgt gcatgttgac 3480 tatgcgggac ccgttgctgg cgcgatgcta ctgatcatcg tggatgcgta cagcaagtgg 3540 gttgaggtga aagtgactca ctcaaccact accgaggcaa ccataaaaat cctcgacgag 3600 ctatttgcat cctatggagc ccccctaact gttgtaacag acaacggaac acaattcact 3660 
gcagcagagt tcaccacatt tcttcaacga agcggtgtca agttccacaa acgctccgct 3720 ccatatcatc cggcaaccaa tgggcaagca gagagatacg ttcagacagt taagcgagct 3780 ttgaaggcta tgcattcgtc cagcactaca cttcaagcta acctgaacga gttcctgctc 3840 cagtaccgca aagtcccgca cagtgaaacc ggtgaagcac cagcaaagct tttcctaggg 3900 cgaaacatcc gttcacgtct cgacctggtt cgaccacaat ccgtccagac aagaacagca 3960 gagaagcaac gagtcgcttt tgaaccatcg taccgaacat tcttgcccgg acaactcgtc 4020 tactgtctct cgggaagtac gagaatggat aagtggatcc gaggtacagt ggtatcccga 4080 ctaggcgatc tacactactc catcaactgc aatggtaacc agatgaaacg ccacgtggat 4140 cagatgcgac caaccctaga cgacaacagg acagaacagc cgcgaagtgt acctgtacca 4200 actcaaactc cggaggtaca ccatcaccgc aggcactact acgggtcaac cgactctcca 4260 caaacatcga gtgtccccgt ttcatctcgg acagtctccg tgtcatcaga ctcatctact 4320 tcgtccgatt catcgtatga cacaccgaca ggaagcccga tccgagccag tgacgccccg 4380 cccttcgtcc gccgttctac aagactgcga aacccgccgt tgcgatactc gccgtagttc 4440 atttctaaga agggaggag 4459 // ID GYPSY1-I_AG repbase; DNA; ANG; 3885 BP. XX AC . XX DT 08-MAY-2003 (Rel. 8.04, Created) DT 08-MAY-2003 (Rel. 8.04, Last updated, Version 1) XX DE GYPSY1-I_AG is an internal portion of the GYPSY1_AG LTR DE retrotransposon - a consensus sequence. XX KW Gypsy; LTR Retrotransposon; Transposable Element; 4-bp TSD; KW AP protease; GYPSY1-I_AG; GYPSY1-LTR_AG; GYPSY1_AG; Gypsy clade; KW gag; integrase; reverse transcriptase. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-3885 RA Kapitonov V.V., Pavlicek A. and Jurka J.; RT "GYPSY1_AG, a family of LTR retrotransposons from African malaria RT mosquito."; RL Repbase Reports 3(4), 73-73 (2003). XX DR [1] (Consensus) XX CC GYPSY1_AG is a family of Gypsy-like LTR retrotransposons. CC GYPSY1-I_AG, an internal portion of GYPSY1_AG, is flanked by CC GYPSY1-LTR_AG LTRs. 
The GYPSY1-I_AG consensus sequence was CC reconstructed based on multiple alignment of 12 copies; they are CC less than 1% divergent from the consensus sequence. CC Some copies of GYPSY1_AG are 100% identical to each other. CC They can be active retroelements. CC The consensus sequence encodes the Gypsy1_AG1p 239-bp gag-like CC protein (pos. 67-782) and the 1029-aa Gypsy1_AG2p, composed CC of the AP protease (pos. 40-130), reverse transcriptase (pos. CC 227-394), CC and integrase (pos. 744-900) domains. XX FH Key Location/Qualifiers FT CDS 67..819 FT /product="GYPSY1_AG1p" FT /translation="MSTENVSNEETSPATAAVSVKLPEFWKNDPSLWFSQA FT EIQFLLAGVHKDETKFYHIVAKLEQSVLCHIADYVKQPPATGKYEAVKQRL FT ISRFELTEQAKMDQLLGSYDFGDLRPTHLLTKMQELAAGLNVNDSLLKRLF FT LQKLPANIRAILSIHDGSLSKLAEMADKMIEMAPQTSVIHASVQKETTENL FT AEEVAAMKVELRQMKARQPERGRLRSTSQNRSNENICWYHRKYGNRATRCR FT SPCQYHQSKN" FT CDS 783..3869 FT /product="GYPSY1_AG2p" FT /translation="MSKPLPVSSVKKLDFRPSEIGEVGGLRISRRLQIFDK FT SSGIRFLIDTGSDVSIIPASKIEKTREPSPFLLHAANGTKIRTYGSKFVSV FT DLGLRRKFSWNFLQADVTSAIIGADFLAHFGLLVDLGNRKLIDGGTKLHTV FT CGLSKSSVYGVTTIAKDHPFRDLLVEFREITAPPTMRTEVRHNVTHHIQTT FT GPPVASKPRRMPPDKLQAAKKEFETMMELGICRPSKSSWASPLHCVPKKNG FT QWRFVGDYRSLNRITVPDRYPVPHIHDLLNNFLGKNCFTTLDLVRAYHFVP FT VEESDVPKTAVITPFGLFEFTKMQFGLCNASQTFQRFMHHVFGDLDFVVVF FT VDDICIASSNEEEHLSHVRTVFERLKSNGLVLNLDKCKFVQKEVNFLGYHI FT NASGIKPQANRVQAVVDYSRPITVKDLRRFLALLNGYKRFIRNAVSLQQPL FT QALIIGNRKNDTRKLQWTIAADEAFVKCKESLANAALLSYPDSSKRMGLMI FT DASDTAAGATLQQNVAGAWQPLGFFSQKFSPSQKKYSVFGRELTAMKLAVQ FT YFRHLVEGREFTIYTDHRPLTYALNSNSNHLPHEERYLQYISSFTKDIRHI FT SGKDNSAADALSRVNTISAPSTVDFELLSKAQHDDPELQKLLADRTTSMNL FT QLRSSVSTNQLLYCDVSDNVHVRPYVPEKLRLEVLRNIHCLSHPGVRATRK FT MVARRFVWPSMNRDVARFVRSCIDCQRSKIHRHTSAALNEFELPKSRFRHV FT HIDLVGPLPTSNGKRYLLTMIDRFSRWPEAVPLPDILAETVAKAFCECWIS FT RFGVPETITTDQGRQFESELFTELTRLLGALRIRTTAYHPEANGLIERFHR FT TLKTSLTCVDSKRWCDKLPLVLLGLRTAIREDIDCSVAEMTYGQPLRIPGD FT FLEPSKTEICRSEFAKLLCRTMQQIGPIRNSHHDKRSVFVPKDLQSCKSVF FT 
VRIDSVKRPLTHPYEGPFQIIERHEKYMDLNMNGEKRRISIDRIKPAYICE FT KDSNEDNERTKVTPSGHRVRFLA" XX SQ Sequence 3885 BP; 1101 A; 850 C; 865 G; 1069 T; 0 other; ttggtgaccc cgacgtgatc tccggattgt tattagttct cctaatttga aaatttgtga 60 acgaacatgt ctaccgaaaa cgtttctaac gaagaaactt cccctgctac tgcagccgtt 120 tcggttaaac ttccggaatt ctggaagaac gatccatcgt tgtggttttc gcaagctgaa 180 attcagtttt tgttggccgg cgttcataag gatgaaacaa aattttatca tattgtcgcc 240 aaactcgaac aatccgtgct ttgtcatatc gccgattatg taaaacagcc tcctgcgaca 300 ggaaaatatg aagccgttaa gcagcgcctc atatctcgat tcgagctcac ggaacaagcc 360 aaaatggatc agctacttgg atcgtacgat tttggagacc ttcgtcctac gcatctttta 420 acgaagatgc aggaacttgc tgccggattg aatgtgaatg attcgttgtt aaaaagacta 480 ttcctgcaaa aacttccagc taatatacgt gcaatactta gcatccacga tggaagtctt 540 tcgaagctag cggagatggc agataaaatg atagaaatgg ctcctcaaac atcagttatc 600 catgcttctg tgcaaaaaga aacgacggaa aatttagcag aagaagttgc tgccatgaaa 660 gtagagctac gccaaatgaa agcacggcaa cctgagcgcg gtcgattgcg ttccacttct 720 caaaatcgtt ccaatgaaaa catctgttgg tatcatcgga agtatggaaa tcgagctacg 780 cgatgtcgaa gcccttgcca gtatcatcag tcaaaaaact agatttccgc ccatccgaaa 840 tcggcgaggt gggcggatta agaatcagtc gccgtctgca aatcttcgac aaatcttctg 900 gtattcggtt cttaatcgac acgggatcgg atgtatcgat aatacctgca tccaagatag 960 agaagactcg agaaccatcg ccgtttttac tccatgcagc aaacggaacg aaaatacgaa 1020 cgtatgggag caagtttgtt tcagtggatc tcggactacg ccggaagttt tcgtggaatt 1080 ttttgcaagc cgacgttact tctgcaatta ttggtgccga tttcctcgcg cattttggtc 1140 ttcttgtaga tcttggaaac agaaaactta ttgatggtgg tacgaaatta cacactgttt 1200 gcggattatc gaaatcttcg gtttacggtg taacaactat agcaaaagat catccttttc 1260 gagacttgct ggtcgaattt cgagaaatca ccgccccgcc gacaatgcgc actgaggtac 1320 gacataatgt aacccatcat attcaaacca ctggacctcc agttgcttcc aaacctcgta 1380 gaatgccgcc agacaaactt caagccgcta aaaaagaatt tgagaccatg atggagctcg 1440 ggatttgtcg cccttccaaa agcagttggg caagtccact tcattgcgta ccgaaaaaga 1500 atggtcaatg gcgctttgtc ggagattaca ggagtttaaa ccggataact gtgccagatc 1560 gttatcctgt gccacatatt 
cacgatttgc taaacaattt cttaggtaag aattgtttta 1620 ccactctaga tttggtacga gcatatcatt ttgttccggt cgaggaaagc gacgttccaa 1680 agactgctgt aatcactccg ttcgggctct ttgaatttac caagatgcaa tttggattat 1740 gcaacgcgag ccagacattt cagcgcttta tgcatcatgt cttcggtgat ttggattttg 1800 tggtggtgtt tgtcgatgat atatgcattg catcgtctaa tgaggaggaa cacctatcgc 1860 acgtgcgaac tgtctttgag cgtctcaaat caaacggctt ggtactgaac ttggacaaat 1920 gcaagtttgt ccaaaaagaa gtcaactttt tggggtatca catcaatgca tctggtatca 1980 aacctcaagc taatcgcgtt caagccgttg ttgattacag tcgtccgatt acggtgaagg 2040 atcttcgacg atttttggca ttgctgaatg gttacaaacg cttcatccgg aatgctgtct 2100 cattgcaaca accattgcaa gcactcatta ttgggaatcg aaaaaacgat acaagaaaac 2160 ttcagtggac gatcgcagca gacgaagctt tcgtgaaatg caaggaaagt ttagcaaacg 2220 cagctttatt gtcctatcct gactcgtcaa aacgaatggg actgatgatt gatgcttcgg 2280 atacagcggc aggagctact ctacaacaaa atgtcgctgg cgcatggcaa ccgctggggt 2340 tcttctcgca gaaattttcc ccttcacaga aaaagtattc tgtcttcggt cgcgagttaa 2400 cagccatgaa attagcggtg cagtattttc gacatctagt ggaagggaga gaattcacta 2460 tttatactga tcatcgtccg ctcacgtatg ctctgaattc taattcaaat catcttcctc 2520 atgaagaacg atatttgcag tacatttcga gttttacaaa agatattcgg cacattagtg 2580 gcaaagacaa ttctgctgca gacgcattat ccagagtcaa caccatctca gctccttcaa 2640 cagtggattt tgaattatta tcgaaagcgc agcatgatga tccagaacta cagaagctac 2700 ttgctgatcg aactacatca atgaacttgc agttaagatc atctgtttca actaatcaat 2760 tgttgtattg tgatgtgtcg gataatgtac atgtcagacc gtatgtacca gagaagttac 2820 ggttagaagt tcttcgcaac atccattgcc tttctcatcc cggtgttcga gcaacgagaa 2880 aaatggttgc gcgaaggttt gtttggccct caatgaatcg agacgtcgcc cgcttcgtca 2940 gatcttgtat tgattgtcaa cgatcgaaaa tccatcggca tacatctgcg gcgctcaacg 3000 aattcgagct tccaaaaagt cgtttccgcc atgttcatat cgatttggtt ggaccacttc 3060 cgacgtcgaa cggaaagcgg tatttattaa cgatgatcga ccggtttagt cgttggccgg 3120 aagcagttcc tttgccagat atactagctg aaacagtcgc taaagcattt tgcgaatgtt 3180 ggatttctcg gtttggtgtt ccagaaacaa tcacgaccga tcaaggacga caatttgaat 3240 ccgaattgtt cacggaattg acgcggcttc 
ttggggctct ccgtattcgt actacagcgt 3300 atcatcccga agctaacggt cttatcgagc gttttcatcg aacgttaaaa acttcgctca 3360 cttgtgtcga ttcgaaacgt tggtgcgata aactgccgtt ggtcctgctc ggtttgcgaa 3420 ctgctatcag ggaagatatc gattgttctg ttgccgagat gacatacgga cagccactgc 3480 gaattcctgg cgattttttg gaaccctcga agacggaaat atgtcgctca gagtttgcca 3540 aactgctttg ccggaccatg caacaaattg gaccaatcag aaactcgcat catgacaaac 3600 gatcggtatt tgttccgaag gacttgcaaa gttgcaaaag cgtttttgtt cgaatcgatt 3660 cagtcaagcg gcctcttaca cacccatacg agggaccttt tcaaataatc gaaaggcatg 3720 aaaagtatat ggacttaaat atgaacggtg agaaacgaag gatttcgatt gatcgtatta 3780 aaccagcata tatttgtgaa aaggattcga atgaagataa cgaaagaaca aaagttacgc 3840 catccggtca ccgtgttcgg ttcttggcgt aactgagggg gactc 3885 // ID Ag-Jock-13 repbase; DNA; ANG; 4443 BP. XX AC . XX DT 29-OCT-2010 (Rel. 15.1, Created) DT 29-OCT-2010 (Rel. 15.1, Last updated, Version 2) XX DE A Jockey clade non-LTR retrotransposon family from Anopheles DE gambiae. XX KW Jockey; Non-LTR Retrotransposon; Transposable Element; KW Ag-Jock-13. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-4443 RA Biedler J. and Tu Z.; RT "Non-LTR Retrotransposons in the African Malaria Mosquito, RT Anopheles gambiae: Unprecedented Diversity and Evidence of Recent RT Activity."; RL Molecular Biology and Evolution 20(11), 1811-1825 (2003). XX RN [2] RP 1-4443 RA Kojima K.K. and Jurka J.; RT "Jockey clade non-LTR retrotransposons from Anopheles gambiae."; RL Direct Submission to Repbase Update (24-SEP-2010). XX DR [2] (Consensus) XX CC [2] Consensus update. This consensus is generated from 5 CC sequences with >99% identity.
XX FH Key Location/Qualifiers FT CDS 289..1521 FT /product="Ag-Jock-13_1p" FT /translation="MEVENRSHDSVRVETTNTAVTPQTAASTSCPNTVIRS FT ARIPPIVVNAPYHQLRAELAGIPGIVYQFAGAKVKLIISLVETRDRVLTLL FT KANRKEFFTHELRSEKPFKAVIRGLPDLPEEDIITALREQSIEPLVVHKIS FT KKHYEGHSRQACLYLVHFTKGTITLAALKCIRTVDSIRVSWEAHRSGKGRI FT VQCHRCQAFGHGTRNCSMKKRCENCSEEHDSETCPTKAPEATKCANCNGSH FT RSTDPDCPSRHNYNLSRQKTSTITHRQPKPSGQPPPALTNANFPPLKHLGL FT PTPANAPNAYHNTAATTTAANTVPDVTTNPWLPRDRAVTESTTRATIMRTQ FT PMNGSAMRAPSSHNFTAPADTLPGDEEGDFSIEEWVEILRIMTQRFRLCRN FT RYEKFAVIAELAIRYGC" FT CDS 1514..4162 FT /product="Ag-Jock-13_2p" FT /note="apurinic-like endonuclease and reverse FT transcriptase." FT /translation="MDAKLRIVTWNARSIAAKKIPLMEFLLRQKVDVALVS FT ETHLRPDINFSLKGYHFLRLDRQGTTTRGGGVAIIVRSGINFNQISHLNTT FT VIEALGIEVQLSIGLIKIIVAYCPMQCRRNDGKAAAFKNDLNIITRSHQRL FT IVGGDLNARHQAWNNLRRNTNGELLFRHSETGQFTVDFPDSPTYISAGGTF FT STLDLFLTNVKISKPETLDELTSDHFPVVTEVDCSVSAGSIRRRKDYQNVN FT WQRFGRLVDNQIQSTEILSVPEVNIAIANLEIAVRSAEAACVKESMIRGEF FT SDMDSHTLALIKERNRLRRIFQHTGDITAKRLASTIAKQISARVEIIRNEN FT FGRSIQRMDTRAPAFWKVSRILKQRPKPVPPLTSLGQIQVTPSEKSNALAS FT QFANAHSDGVNRASRNEAKVATTLIRLEDSVFTVPLQEKVTITDVRIAIGR FT MKNMKAPGFDKIFNILIKHLQVKALCLITKVFNICFELGYFPSTWKCAKVV FT PILKPGKDPTLPTSYRPISLLPSLGKLFERIILDRLQNWVSELNLIRPEQF FT GFRQEHSTVHQLLRVKGCIEQNKTDSKSTAVALLDVEKAFDSVWHGGLLHK FT LVDFGLPVYLVKIISSYLKHRTFRVALHSALSDPNPVPAGVPQGSLLAPLL FT YILYTTDIPPLPCDGMLFLFADDTAIAVKGRNMIELKSRLQRCLDAFLRFA FT ADWKIKINPSKTQAIVFPHRFKKTLTPPLSPGLLVNGTTVPWSPSVKYLGL FT TIDYKMIFRGHVESILERGHLLLKCLYPLISRRSRLSQLNKLAVYKQIILP FT VATYAAPVWSTCAETHLCRLQIMLNKLLRMITDTSRFTRNADLYTIAGVLP FT FKEEIQNQSEKLYNRCSVSTFPLINSLVQA" XX SQ Sequence 4443 BP; 1358 A; 1069 C; 903 G; 1113 T; 0 other; ccttcagagg tgagtacgtc gttatcagca gctgatgccc cagataaacg cctgaagcac 60 tctcagctgg aagaatcatc tgaaaccgaa gagaacgaca tcggcgacga tgatttcatc 120 gaggttcgct ctcgcagtag tggtcgatct agatcgaaca aaaaaacaac tctcgcaccg 180 attgcagcac ccataccaag gggatcacag gaatctaaac ctgcttgtgc agacaagcat 240 tcaaatgcct cgctctccaa agagagtatc 
tattcccttg agggagctat ggaagtagaa 300 aatcgctcgc acgatagtgt tcgtgtcgaa acaaccaata cagcggtaac ccctcaaacc 360 gctgcctcaa cctcatgccc caatacggtg atccgttctg cacgcatacc accgatcgtg 420 gtcaacgcac catatcatca gctacgcgct gaacttgccg gtatacccgg tattgtctat 480 caatttgctg gcgcaaaagt gaaactaata atcagccttg ttgaaactcg tgaccgagtt 540 ttaacgttac ttaaggcaaa tcgaaaggag ttctttaccc atgaactacg ctcagagaaa 600 cccttcaagg cagtaatccg tggtctcccg gatcttcccg aagaggatat tatcaccgct 660 ttacgcgagc aatccattga acccttggtg gttcacaaaa tatccaaaaa acactacgag 720 gggcatagca ggcaggcatg cctttacctt gttcatttca caaagggtac catcacactg 780 gccgctctca agtgcatccg cacggttgac tccattcgtg tctcatggga ggcacatcgc 840 agtggtaagg gccgtattgt tcaatgccat cgttgccaag ccttcggtca tggaacaaga 900 aattgttcta tgaagaaacg atgtgaaaat tgctctgagg aacacgactc tgaaacgtgc 960 cctaccaaag ctccagaggc tactaaatgt gcaaactgca acggtagcca ccgtagtact 1020 gatcccgact gtccaagtcg acataactac aatctaagtc gtcaaaaaac ttcgaccatc 1080 acccatagac aaccaaaacc ctctgggcaa cctccaccgg cattaactaa tgccaatttt 1140 ccgcctctta agcatttggg tctgcctaca cctgccaatg cacccaacgc ctatcataat 1200 acagctgcta ctacaacagc agcaaatact gtccctgacg ttactacaaa cccgtggcta 1260 cccagagatc gggctgttac agaatccacc acacgtgcaa ccatcatgcg tacacaacca 1320 atgaatggct ctgctatgcg tgcaccatca tcgcacaact ttactgctcc agctgatacg 1380 ttaccaggag acgaagaagg cgacttcagt atagaggaat gggttgagat actacgaatt 1440 atgacacaac gcttccgtct ttgccgaaat cgttatgaaa aattcgccgt tattgcggaa 1500 ttagccattc gttatggatg ctaaactccg cattgttacg tggaatgcgc gatcaatcgc 1560 tgctaagaaa atacccctga tggaattcct tcttcgacaa aaagtggacg ttgcacttgt 1620 tagtgaaaca caccttagac cagatattaa tttctcctta aaaggatacc acttcctacg 1680 actggaccgg caaggcacta ctaccagagg cggaggggtt gctatcatcg ttcgtagtgg 1740 tataaatttc aaccaaatat cgcacctgaa caccacggtc attgaagctc tgggtataga 1800 agtgcaacta tcaatcgggc taataaaaat catcgtagca tactgtccga tgcagtgcag 1860 gcgcaacgat ggcaaagccg ccgcgtttaa aaatgatctc aacatcatca cccgttctca 1920 ccaaaggctc atcgttggtg gggatctcaa tgcacgccat caagcatgga 
acaacctccg 1980 acgtaataca aacggtgaat tactctttcg ccattcagag acaggacagt tcacagtcga 2040 ttttccggat tctccaacct acatctcagc ggggggcact tttagcacgc tggatttatt 2100 tttgactaat gtaaaaatca gtaaacctga aaccctggat gagcttactt ccgatcactt 2160 tccggttgtg acagaggtag actgttccgt ctccgcgggt tccatccgtc gtcgtaaaga 2220 ctaccaaaat gtaaactggc agcgcttcgg tcgcctggta gacaatcaaa ttcaatcgac 2280 ggaaattctc tccgtgccag aagtaaacat agcgatcgca aatcttgaaa tagctgtcag 2340 gtccgcggaa gcggcttgtg tcaaagaatc aatgatcagg ggtgagtttt cggacatgga 2400 ttcccatact ttagcactga taaaagaaag aaatcgactc cggcgaattt ttcaacacac 2460 aggtgatatc actgcaaagc gacttgcgtc gactatagct aaacaaatat ccgctcgggt 2520 tgaaattata cgaaatgaaa attttggtcg ttccatccaa agaatggata ccagagctcc 2580 agctttctgg aaagtgtcaa ggatactcaa acagaggcca aagcctgtcc cccctttgac 2640 ctctcttggg caaatccaag ttactcctag cgaaaaaagt aatgctttag ccagccagtt 2700 tgccaatgct cactcggatg gtgtaaacag ggcgagtcgc aacgaggcta aagtggctac 2760 caccttaata cgattggaag attcagtttt tactgtacct ctacaagaga aagtgaccat 2820 cactgacgta cggattgcta tcggtcgcat gaaaaatatg aaggctcccg ggtttgacaa 2880 gatatttaac atccttatca aacatttaca ggtgaaggca ctgtgcctga tcacaaaggt 2940 atttaacatt tgtttcgaac taggttactt ccctagcaca tggaagtgtg ccaaggttgt 3000 gcctatactc aaacccggta aagaccccac gcttccgaca agttatcgtc caatcagctt 3060 gctcccctcg ttgggtaagc tttttgagag gataattctt gaccgcctac agaactgggt 3120 atctgagctt aatctcattc gaccggaaca atttggtttt cgacaagaac actcaacagt 3180 tcatcaactt ttacgagtca agggttgcat cgaacaaaac aaaacagatt caaaatccac 3240 agctgtagca ctgttagatg ttgaaaaagc attcgacagc gtttggcatg gtgggctttt 3300 acacaagctg gttgattttg gtcttccagt ttatttagtt aaaattatta gcagctactt 3360 aaaacacaga acattcagag tggccttgca ctctgcctta tctgatccaa atcctgtgcc 3420 agcaggtgta ccacaaggga gccttcttgc tccgttactt tacatcttgt acactacgga 3480 tatccctccc ctcccctgtg atggcatgct gtttctgttc gcagatgaca ctgccattgc 3540 agtcaaaggt agaaacatga ttgaactaaa gagcaggttg caacgatgtc ttgatgcctt 3600 cctgagattt gctgcagatt ggaagatcaa aatcaatcca tccaaaaccc aggcaattgt 
3660 atttccccat cgttttaaga aaacgctgac tccacctcta agcccggggc ttttagttaa 3720 tgggacgact gtaccatggt cgccgtcggt aaagtatctg gggcttacga tagactacaa 3780 gatgatcttc aggggacatg tcgaatccat cctagaaaga gggcatcttc ttttgaaatg 3840 cctttatcca ctgatcagtc gtaggtctcg tttatcgcaa ttaaataaat tggcagttta 3900 taaacagata atcctaccag ttgcaacata tgcagctcca gtatggagta catgtgcgga 3960 aactcacctc tgtaggctac aaatcatgct taacaaattg ttacgtatga taaccgacac 4020 aagccgcttc acaaggaatg cagatttata caccattgct ggtgttcttc ccttcaaaga 4080 agaaatccaa aatcagtcgg aaaaactgta caatcgctgt tcggtctcga cttttccttt 4140 aataaatagc ctcgtccagg catgataaat ctaatgtagg gtaagctaga atagattaca 4200 tattaagata ttattgttaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa 4260 aaaaaaaaaa aaaaaaaaaa catctacgga ttcaaacgcc ctaccaatgt actgagtaga 4320 acatattaat gcttagaagg caattgcaaa tcattttgta aaactgtttg ttggtcttaa 4380 tggaaatcaa aaatttaagt ggtacacccc cgaaaaacat acaaataaat ttatatatat 4440 ata 4443 // ID BEL15-LTR_AG repbase; DNA; ANG; 275 BP. XX AC . XX DT 03-APR-2003 (Rel. 8.03, Created) DT 03-APR-2003 (Rel. 8.03, Last updated, Version 1) XX DE BEL15-LTR_AG is a long terminal repeat of the BEL15_AG LTR DE retrotransposon - a consensus sequence. XX KW BEL; LTR Retrotransposon; Transposable Element; 5-bp TSD; KW BEL15-I_AG; BEL15-LTR_AG; BEL15_AG; Bel clade; KW reverse transcriptase. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-275 RA Kapitonov V.V., Pavlicek A. and Jurka J.; RT "BEL15_AG, a nonautonomous family of Bel/Pao-like LTR RT retrotransposons from African malaria mosquito."; RL Repbase Reports 3(3), 38-38 (2003). XX DR [1] (Consensus) XX CC BEL15-LTR_AG flank an internal portion of BEL15_AG (deposited as CC BEL15-I_AG). 
XX SQ Sequence 275 BP; 73 A; 53 C; 56 G; 93 T; 0 other; tgtttacgtt ttttgttatt gttgcgtttt gacagctagg gctacgttca tccagcatat 60 tgtgtgcgta ttttcactgg gtgaagcgtt agcgttcgtt tgaattgtct atgtattagt 120 gttaggtccg ttaggctaag aacaaaattg tcaaccattg tcttaaaagg attcgtaata 180 tactgacgcg tacgccaaac ggtcaccttt aaaccctaag catttttcac tgcactttaa 240 cataaaatat cacaacgtag ggccgatcac gaaca 275 // ID Mariner-N15_AG repbase; DNA; ANG; 249 BP. XX AC . XX DT 28-FEB-2009 (Rel. 14.02, Created) DT 28-FEB-2009 (Rel. 14.02, Last updated, Version 1) XX DE Putative nonautonomous Mariner DNA transposon - a consensus DE sequence. XX KW Mariner/Tc1; DNA transposon; Transposable Element; Nonautonomous; KW Mariner-N15_AG. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-249 RA Jurka J.; RT "Putative mariner/Tc1-like DNA transposons from African malaria RT mosquito."; RL Repbase Reports 9(2), 640-640 (2009). XX DR [1] (Consensus) XX CC TA TSD. XX SQ Sequence 249 BP; 77 A; 50 C; 48 G; 74 T; 0 other; tacagggttt cccacgattt attggtcagt tcccatgatt ttttggtgcg ttcccacgat 60 tttttggtcg tatcccatag atttttggtt cgttcccata atttattggt attttccgat 120 tggatatcaa tacaattgga ccaaaaaatt ctgggaaacg accaaaaaat cgtgggaacg 180 caccaaaaaa atatgggaac ccaccaaaaa ttgatgggaa ccaaccaata aatcgtggga 240 aaccctgta 249 // ID TRANSIB1_AG repbase; DNA; ANG; 4303 BP. XX AC . XX DT 13-DEC-2002 (Rel. 7.11, Created) DT 13-DEC-2002 (Rel. 7.11, Last updated, Version 1) XX DE TRANSIB1_AG is a TRANSIB-like DNA transposon - a consensus DE sequence. XX KW Transib; DNA transposon; Transposable Element; KW TRANSIB superfamily; TRANSIB1_AG; transposase. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-4303 RA Kapitonov V.V. 
and Jurka J.; RT "TRANSIB1_AG: a family of TRANSIB-like DNA transposons from RT African malaria mosquito."; RL Repbase Reports 2(11), 26-25 (2002). XX DR [1] (Consensus) XX CC TRANSIB1_AG is a family of DNA transposons that belongs CC to the TRANSIB superfamily originally identified in Drosophila CC (see CC description of TRANSIB1-TRANSIB4 in drorep.ref). CC Members of this superfamily encode the TRANSIB-like DNA CC transposase CC and generate 5-bp target site duplications. CC The TRANSIB1_AG consensus sequence was reconstructed based on CC multiple alignment of several copies of this transposon CC identified CC in the sequenced portion of the genome. Given the ~2% divergence CC of these copies from the consensus sequence, transposition of CC TRANSIB1_AG CC occurred recently (in the last 1 Myr). CC TRANSIB1_AG has 9-bp terminal inverted repeats. The consensus CC sequence encodes a 708-aa TRANSIB1_AGp transposase (positions CC 1539-3665). TRANSIB1_AGp is 29% identical to the TRANSIB2p CC transposase CC encoded by TRANSIB2 from D. melanogaster. CC The A. gambiae genome harbors several families of TRANSIB-like CC DNA CC transposons. 
XX FH Key Location/Qualifiers FT CDS 1539..3662 FT /product="TRANSIB1_AGp" FT /translation="MIEILLFNDFFNCVQHAHLIFFPLIQNVTQITNRELV FT LQFNFQNSAKLNVENAFRFVCETLESSDIDEDAVKQKLNLLERRAKHLWKK FT TNSDRNKFDRANFAWLNTQFLCDFKREKITIIQSKKLFIELTNRQKRRRIA FT QMSSSVKCESVDEALMFARKIATQQKNTYVVNALTNMINCSTPAMKSGEEA FT LSFLIEGNFSKKQYCIIRKECPSKFRSYMEILKAKKTCSPSAESLHVTESS FT VKVDLQSLVHMTTKSIVRMEKDKIFAYMDSRKLNHLKLILLMAWGMDGSSG FT HSQYHQNYSNETADDSNVLITSLSPINLTVEGDSSNEIMWSNLTPQSVRFC FT RPISMIFAKENRDLVINKKSEIEAEIEALRPYLFETENREKSVIVEFRFVL FT SMIDGKILSYITQIPTSCCPICGSKPTEMQNVKYIDNGFQPSPGSLIHGIS FT PLHCWIRVLECILHISYRLDFTKWKVTAHYREAYNNRKRIVLRDLFTKFGV FT KVDQVRAGSSGTSTTGNVCRRIFSNPKLASEALGVDEDLIRRFRNILIAIN FT SERPFNAETISLYCKDTYYKFINLYSWYYIPATVHKILAHAGELILNTPVP FT LGSIGEECGEARHKIFRKDREYHARKCTRESNLVDVFMRAIHSSDPFISSL FT SLQNRVQRKMSDNYPEEVKLFFEFEQLNSNENVPPLEESFEDIMDTLDKMN FT EPLVQDMG" XX SQ Sequence 4303 BP; 1502 A; 707 C; 737 G; 1357 T; 0 other; cacagtggtc taaaggcgga ctttatggaa aagtattttt tctatttatt tctatccctg 60 ttatttttaa cttaaactag ctatagttgt ttagtataaa gttgaaattt caaattttcg 120 aatttcattg atttgttcaa aagttatggc gctttgaatc taaaagtctt ggctaattgg 180 ccgtttttaa gccgtatcca ctgacatgat gtatttgctt caagcttcgt ttgtgttcta 240 cctaaaaatt taaataatac atcgttggaa aggtattttt ttcgtcttta agaatatgtt 300 aatagactta agctactgtg acgaataaaa aagttactac aaactatata tagaaaaagc 360 tgaaaaatat acaaataagc caatattata gagcgccatt ttatttcctg taacttcctg 420 ttgttgctac ccacattgaa ttaaagttta acactttcta taacttttgt gaaaaaattt 480 cggtcccctt ttgtcccctc taagcgctgc agcaaattaa aagttgattg attgaaatat 540 taaatgcaac aaacagagct aaaaatacga cattgttttc cgattaacta cactgtggtt 600 attcttgagg tacagttgtc attttttaca acctaacaac ataccctgca tgggttcaaa 660 ccccgtacag ttcgtgctcc cttcctcctg cttcactata tgcaaacctg gtcaatttgt 720 tatggacaca caaatcagtc aaagacattt tagccgcaac gttgcttaca cagtattgat 780 attgatagta ttgtattgat aaatttagtt cccctttttt cttactgttt ttaaataacc 840 aacaaactaa aataaaagtt ataacattta tctatttttt tcttctgttt gagcaaaagc 900 aaacactatc tgttgcttgc 
tataagtatt gtattgcaaa taatcatacg aagggttcgt 960 taatctctaa atctatcctg ggtttactag tccgaacaat aactgtcccc atgtttatta 1020 caatattttc ctcaaataaa caactcagca ccaggataca tgtgattggc tgaggtggct 1080 attcaaaagg atcattgtta taaatatgtg gacagttgtt cattgcattg gaaaccctgc 1140 aatacgaaca gccgaataaa tgaaacgtag gctaacggta ctatttgagt accttcaccg 1200 acaaataaag tctacttcaa ttcaatatac gaaatgcata ctgatacaga tatacaagta 1260 gtagtaaaag taaggatcga ctgatcgcta gagcaataca ataaaaatcg ttaagcttat 1320 cgtattctca tgagagagta cgaatactaa tgcacactga tagcacaatg cgttagatat 1380 ttttgattct gatgcgtatc gtagtagcaa gaccgacgtt aatttgatca caagtgaaaa 1440 atattccaga ccgtataaat tgcgcagcta gtgaatttgg gcacttttat cttgtcctaa 1500 caatttaata acattatgca ggtagataat acacaccaat gattgaaata cttttattta 1560 acgatttctt caattgcgtt caacacgcac atctaatatt ttttccatta atacagaacg 1620 taacccaaat taccaaccgg gagcttgttt tgcagtttaa cttccaaaat tcagctaaac 1680 tgaacgtaga gaatgctttt cgtttcgttt gcgaaacatt ggaatcgtca gatattgatg 1740 aagatgcagt aaagcaaaaa cttaacttgc tggaaagacg agccaagcat ctgtggaaaa 1800 aaacgaacag cgataggaac aaatttgata gggctaattt tgcgtggttg aacacacaat 1860 ttctctgtga ctttaagaga gaaaaaatca ccatcataca gtcaaaaaaa ctctttattg 1920 aacttacaaa caggcaaaaa cgaagaagaa tagcgcaaat gtcttcatca gttaaatgtg 1980 aatccgtaga tgaggcattg atgtttgcaa gaaaaatcgc aacacagcaa aaaaatacgt 2040 atgttgtaaa tgcgttaaca aacatgatta attgttctac tcctgcaatg aaatcaggag 2100 aagaagcctt atcatttttg atagaaggaa atttctccaa aaaacaatat tgtatcatta 2160 ggaaagagtg tccatcaaag ttccgtagct acatggaaat acttaaagct aaaaaaacat 2220 gttccccgtc ggcagagtca ctacatgtaa ctgaaagctc tgtgaaagtt gatttacaat 2280 cacttgtaca tatgactaca aaaagcattg tcagaatgga aaaggacaaa atatttgcct 2340 atatggatag ccgaaaactc aaccacttaa aattaatttt actgatggca tggggtatgg 2400 acgggtcatc aggccattca caatatcatc agaactattc taatgaaact gccgacgata 2460 gtaatgtgtt gataacatca ttaagtccca tcaacttaac tgtcgaaggt gattcttcaa 2520 atgaaatcat gtggtcaaat ttaactcctc aaagtgtccg attttgtcga ccgatttcta 2580 tgatctttgc aaaagaaaat cgtgacttag 
tgattaataa aaaatctgaa attgaagcag 2640 aaattgaagc acttcggcca tatttattcg aaactgaaaa cagggaaaaa tcagtaattg 2700 tagagtttag atttgtcttg agtatgattg acggaaaaat cttatcatac atcactcaaa 2760 ttccaacgtc ttgctgcccg atatgtggat caaagcccac agaaatgcag aatgtaaaat 2820 acatcgataa cggattccaa ccttctccgg ggtctttgat tcatggaata tcacccctgc 2880 actgttggat aagagtttta gagtgtattt tgcatatatc atacagatta gattttacaa 2940 aatggaaagt gactgcgcat tacagagagg catacaataa tagaaaacgc attgttttgc 3000 gtgatttatt tactaaattt ggagtgaaag tagaccaagt tcgagcaggt tcatcaggca 3060 caagcactac aggaaatgtt tgcaggcgaa tattttccaa ccctaaactt gcaagtgaag 3120 cgttaggtgt tgatgaggac ttgataaggc ggtttagaaa tatattaata gctatcaata 3180 gtgaacggcc ttttaacgca gaaacgatta gtttatattg taaggatact tattataagt 3240 ttattaattt gtatagctgg tactacatac cagcaactgt acataaaatt ttagcacatg 3300 ctggagaact aattctgaac actcctgttc cgttaggaag tataggggaa gagtgcggag 3360 aagcacgcca taaaatattc cggaaagata gggaatacca tgcacgcaaa tgcactagag 3420 agagcaatct ggtcgatgta tttatgcgtg caatacacag cagtgatcct tttattagtt 3480 cactttcttt gcagaatcgt gtgcaaagaa aaatgagcga caactatcca gaggaagtca 3540 aacttttctt tgagtttgag caattaaatt caaatgaaaa tgtacctcct cttgaggaat 3600 cctttgagga tattatggac actttagaca agatgaatga gccactcgta caggatatgg 3660 gttaaaagtt atattttata agttggtaga caataaatca tattaggaat cgactttgtt 3720 tcaatgtaga acaaggtata taaaataaat tatactttta ataatttaaa ttaaagacgt 3780 tacattttat aggaagtata cgatgaaatt gactaccctt tttatatttt tttccacaat 3840 tttttaattt aaatcaagtg tttcattcaa tatgtttcat atcaaaatag tatgttcaat 3900 attttttaat caattgattt catttgaaaa tcgacgattg aaatgaaaag aataaattaa 3960 ttatttattt tataaatatt tttttaattt gaaatttgtc attgtaaata aacagttgtg 4020 ggcaacagga aacagtaaca tgaaacatta aaataataat aattaaaaat gtttacctgt 4080 acccaaattg aaactacgag atcagtttaa gattaatttt tttactaggg ttatactgaa 4140 gcttaaaaaa gattgtctag tggcgccatc tggatgtcca atataattca aaattctttg 4200 taattgtctt agatgtgaaa ctaaagtcaa tcgatactga aataggccaa acactgacag 4260 taaatagcaa tgtcctgcga ttgctctgaa gcaaaccact 
gtg 4303 // ID WaldoAg2 repbase; DNA; ANG; 4895 BP. XX AC AB090815; XX DT 09-JUL-2004 (Rel. 9.06, Created) DT 24-SEP-2010 (Rel. 15.1, Last updated, Version 2) XX DE Anopheles gambiae retrotransposon WaldoAg2 DNA, complete DE sequence. XX KW R1; Non-LTR Retrotransposon; Transposable Element; KW reverse transcriptase; gag-like domain; WaldoAg2. XX NM WaldoAg2. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RA Kojima K.K. and Fujiwara H.; RT "Evolution of target specificity in R1 clade non-LTR RT retrotransposons."; RL Mol. Biol. Evol 20(3), 351-361 (2003). XX DR Genbank; AB090815; Positions 1 4895. XX FH Key Location/Qualifiers FT CDS 160..1635 FT /product="WaldoAg2_1p" FT /translation="MSGNESLPPRPLGSALKDIGAFFGRSSKTPRSPPSDL FT GECSASPTVEVVASTSVDSAVVEAVTGEENVMAAEDSVTSQPVESFSSSKE FT PALVVGGSKLQEALKVAGELHAYTKDRNNVHHPIKKMAVSILSALACVERE FT LMTTRLRAERTEKSLKEALEGCSQTETPVNGKRGRNLRSTEEADDAKRAKN FT DAPSGSSLVASAGCGPEMNETKGSWSTVVRKNRRKPKESVIPDNTGEKQVH FT PASTREALPPRRPKTEAVLVAPGENITHVEILRKLKADPELQAFGKKVVRI FT RGTKNGGLLFELGKSDDDCGVDYAKVVQNSIGNNGTVKTLGQMETVEIRYF FT DAETQTSDVEKDLRDLFTELDGVTFETKMTKSFNGMQTASVKLPTKLATLV FT AARGKIRIGWSICPVKIQIPKRRCFKCWETGHFSRDCKGPDRTDCDAVKQG FT ILPKHASNLPDVSFVHRVQTCIIPAAFSVQRASRNRHGDSSVEPEPL" FT CDS 1508..4522 FT /product="WaldoAg2_2p" FT /note="endonuclease and reverse transcriptase." 
FT /translation="MRQTSQMCPLSTGYKHVSYQRRFLSSEQAEIDMEIVQ FT LNLNHCEEVQDMLGQLLIEEKGDVAMLSEPYRCPSGVNNWVSDSTGTAAIW FT ASGRFPIQQIISRHGEGYVIAVINKITFCSCYAPPRWDLEKFEEMLKRISD FT EVYDVNPIIISGDFNAWATEWGSKSTNARGNAVLEHFSRLNLVLVNVGFCP FT TFVRNSRTSIIDLTFCSPALASSMNWRVSNAYTLSDHRVIRYTAGSKCHRV FT AQGSGFPAWKTQCFNEELFIEALRFGDFSNTSSALKLASAIANACDTSLPR FT RKGGPYPRRRAYWWTTEIAQCRSHCIEARRKMNRAKSSEQREDLRRLYILA FT RSNLKRKIKASKRRCFLALCDEVENNPFGAYRTLMGKMVGQDLPRERNPTA FT LKTIIEQLFPNHEPQTPRDISRNPDVEPVSISADEIQKAADHLKLGKAPGP FT DGIPIEAIKAAIKAYLEAFLSVFQNCFDTGFFPIPWKRQKLVLLPKPGKPP FT DDASALRPLALIDNFAKILEILILNRLVVYTEGEHGLSDRQFGFRKGRSTG FT EAIAAVLKKRRSALLKKRTGNRYCAIVTIDVKNAFNSANWEAIHAALSKMM FT IPPYLCRLLRSYLDHRVFLYDTALGIKKMSLTAGVPQGSILGPTLWNVMYN FT GVLTLGLPPGPEVIGFADDIALTVLGESIEEIELLTSDSVSRIESWMQQMR FT LEIAHKKTEFLIISSHKTVQSGSIRVGDERIESIRHLKYLGVIIDDRLSFR FT KHVEYACNNVFKAAISLIQIMPNIGGPKSSWRRLLADVAFSRLRYNAAIWA FT HVLVLKENRQLANRVHRLLAMRVVRAYKTISHVAVCVIASMVPICLILAED FT SECCSFSGVSNAGLSRSSAKQLSMRKWQSEWDCSTKGRTTHALIPNIAAWT FT SRKHGEVNFYMTQFLSDHGCFRSYLHKYRHASSPDCPACVSIVESTEHVLF FT HCPRFAEERHEITVKCGTTINGTNLTELMLKNAGTWEVIANGMRSILLKL" XX SQ Sequence 4895 BP; 1364 A; 1112 C; 1279 G; 1140 T; 0 other; ctggtattat cagtgttgcg tgagttacag tgcaaaggcc cttcattcgt ttgtgttgtg 60 tgagtgtctg caaattaagt gattatacac taggctcgct aggtgagttc ctcgtaaatt 120 gattttcttc atgtgggatt cggttgagca aaaccaccaa tgagcgggaa cgaatcgcta 180 cctccccgtc ctctggggtc tgccttgaag gatataggcg cgttttttgg ccgtagtagc 240 aagacgccta gatctcctcc gtcggatctc ggagagtgtt cagcttctcc tacggttgaa 300 gtggttgcca gtacgtcggt cgattctgct gttgttgaag cggtaaccgg cgaggaaaat 360 gtgatggcag cagaggattc cgtgacctca caaccagtgg aatcgttttc ttcatcgaaa 420 gagccagcgc ttgttgtcgg tggaagcaag ctgcaggaag ctttaaaggt ggctggagaa 480 cttcatgcct atacgaaaga tcggaataac gttcaccatc cgatcaagaa gatggcagtg 540 agcatcttat cggcgttagc ctgtgttgaa cgtgagctca tgacgacacg gctgcgagcg 600 gaaaggactg aaaagtccct aaaagaggcg ctagaaggct gctcccaaac cgagacgcca 660 gtgaatggta aacggggcag aaatttgagg tcgactgagg aagcggacga tgctaagaga 
720 gcaaaaaatg atgccccctc tggcagctcg ctggtcgcga gtgccggttg tggccccgaa 780 atgaacgaga caaaggggtc gtggagcacg gtcgtgcgga aaaatcgccg gaagccgaag 840 gaaagtgtca ttccggataa caccggcgaa aagcaagttc accctgcatc tactcgggag 900 gcgcttccgc cgaggcgtcc aaaaaccgaa gccgtactag tggcgcccgg cgaaaatatt 960 acccatgtgg aaatcctccg caagctaaaa gcagatcctg agcttcaagc tttcggtaag 1020 aaagtagtgc gaattcgagg aacaaaaaat ggaggcttgc tattcgaact gggcaaaagc 1080 gatgatgatt gcggagtcga ctacgccaag gtggttcaga attccattgg caacaatggg 1140 acagtaaaga ccttaggcca aatggaaacg gtggaaattc gatacttcga cgcagagacg 1200 cagaccagcg acgttgaaaa agatcttcgg gatttgttca ccgagctgga cggggtaact 1260 ttcgagacca agatgaccaa atccttcaac ggaatgcaga ccgcttcagt gaagctcccg 1320 acgaaactag caacgctagt ggcagcacgt ggcaagatta ggattggatg gtcaatctgc 1380 ccggtaaaaa tacaaatacc gaaaaggagg tgtttcaaat gctgggagac gggccatttc 1440 tcccgcgatt gcaaaggccc agataggacc gactgcgatg cggtgaaaca gggcattttg 1500 ccaaaacatg cgtcaaacct cccagatgtg tcctttgtcc accgggtaca aacatgtatc 1560 ataccagcgg cgttttctgt ccagcgagca agcagaaatc gacatggaga tagttcagtt 1620 gaacctgaac cactgtgaag aggtccaaga catgcttggg caattgctta tagaggagaa 1680 gggagatgtg gcaatgctgt ccgagccgta tcgctgtcct agcggcgtaa acaattgggt 1740 ttctgactct acagggaccg ctgctatttg ggcttctggt agatttccga tccagcagat 1800 catttcaagg cacggcgagg ggtatgtgat tgcagtcatc aacaaaataa ccttctgcag 1860 ctgctacgct ccgccaagat gggatctaga gaaattcgaa gaaatgctta agaggatttc 1920 agatgaggtg tacgatgtta atcctatcat catttcagga gattttaacg cttgggccac 1980 ggaatggggc agtaaaagca caaacgccag aggaaacgcc gtgttggagc acttttctag 2040 actgaactta gtactggtaa atgttggctt ctgtcccaca tttgtaagga atagcagaac 2100 ttccattata gaccttacgt tctgtagtcc agcattggct tcttccatga actggagggt 2160 aagcaacgcc tacaccctca gcgaccaccg agtgatacgc tacacggcag gaagcaagtg 2220 ccacagggtt gcccagggct ccggttttcc agcctggaaa acacaatgct tcaacgaaga 2280 actgtttatt gaggcattga ggtttggtga tttctcaaat acctcgtcag cattgaagct 2340 agcgtcagcg atcgccaacg cctgtgacac ctctttgcca cgaaggaaag ggggacctta 2400 cccacgacgg 
agagcttact ggtggaccac cgaaatagcc cagtgccgaa gccattgcat 2460 cgaagcacgc agaaagatga atcgagccaa atcctcggaa caaagggagg atctgagacg 2520 tttgtacatc ctggcacgat caaatctaaa acggaagatc aaggcaagta aaaggagatg 2580 ctttttagcc ctatgcgatg aggttgaaaa caacccgttt ggtgcttacc gaacgctaat 2640 gggtaagatg gtcggccaag acctacctag ggaaaggaac ccaaccgcgc tcaagactat 2700 cattgaacag ttgtttccta accatgagcc acaaactcca cgtgatatat cccgcaatcc 2760 tgatgttgaa cctgtgtcaa tatccgctga tgaaatacag aaagcagcag accatctcaa 2820 actggggaag gcacctggtc cggacggtat tcccattgaa gcgatcaaag cagctatcaa 2880 agcgtacctg gaagcttttt tatcagtgtt ccagaactgc ttcgatactg gcttttttcc 2940 gataccctgg aagcgacaaa aactcgttct tctaccaaag cctggaaagc cccccgacga 3000 cgcatcagca ctaaggcctt tagcattgat agataatttc gcaaagatac tagaaatcct 3060 aatactcaac cgattagttg tctatacgga aggcgaacac ggactgtcgg acagacagtt 3120 tggctttaga aagggacggt ctactggtga agcgatcgca gctgttctaa agaaacgacg 3180 aagcgccttg ctaaaaaaga ggacaggaaa cagatactgc gcgatcgtta cgattgacgt 3240 aaagaatgct ttcaacagtg ccaactggga ggccatacat gcagctcttt ccaagatgat 3300 gattccaccg tacctctgta ggctgctgag aagctattta gatcaccgtg tgtttttata 3360 cgacacagct ctgggaatca aaaaaatgag cctcaccgct ggcgttcctc aaggttcgat 3420 tttgggtcca actctctgga atgttatgta caacggagtt ttgaccctag ggcttcctcc 3480 aggacctgaa gtgataggtt ttgctgatga tattgctctt actgtccttg gtgagtcaat 3540 agaagagatt gaactcctca catccgactc cgtaagcaga atcgagtcgt ggatgcaaca 3600 aatgaggcta gaaatcgccc ataaaaagac ggaattcctc atcataagta gtcacaagac 3660 agtgcagtca ggtagcatcc gggtcggtga tgaacgtatt gagtcgatac gtcacctaaa 3720 atatcttggt gtgatcatcg atgaccgctt aagtttccgg aagcatgtcg agtacgcctg 3780 taacaacgtc tttaaggcag caatctcgtt gatacaaata atgccgaaca ttggagggcc 3840 caagagtagc tggaggcgac ttctagctga cgtggccttc tctcggttgc gttataacgc 3900 tgcgatctgg gcacatgttt tggtgctaaa ggaaaaccga cagttggcga acagagtgca 3960 ccgattgcta gctatgagag ttgttcgtgc ctacaaaaca atatcgcacg tggcagtttg 4020 tgttatcgca agcatggttc ccatctgcct catattagca gaggattctg agtgttgcag 4080 tttctccgga gtttcaaacg 
cgggattatc cagatcctca gcaaagcagc tctctatgag 4140 aaagtggcag tctgaatggg attgttcaac aaaggggcgt acgacacatg cactgatccc 4200 caacatcgct gcatggacga gcagaaaaca cggcgaagtg aacttctaca tgactcagtt 4260 cctctccgac catgggtgtt tccggagtta tcttcacaag taccgtcacg caagctcgcc 4320 agactgccca gcgtgtgtga gcatcgtgga gtcaacagaa cacgtgcttt ttcattgtcc 4380 tcgttttgct gaggaacgtc atgaaatcac cgtgaagtgc ggaacaacaa tcaacggaac 4440 aaacctgacc gagttgatgt taaagaatgc gggaacatgg gaagtcatag caaacggcat 4500 gcgatcaata ctgttgaagc tgtaagcctt atggaaagcg gatcaacgac ttgggcgttc 4560 gtaaatagtg ttgtgcgtat tgtgaaagtg tttagtgtat ctatgcacga attgaagtga 4620 gcctgtaact caagtgaatg cgtactttct gtgcttttgc gtggtattat gagtgtctcc 4680 aggtttatca ttctgtctcg cgaatgtggt ccaagttgca atagcgagat gtaatgctaa 4740 tgcatagccc tgccccaaga agcataccga aaggtgaacc catggggaag ggtagatggc 4800 ccaaggaggg ggtttactgg gtaaaaatca tatgtcaaca cccgtgcgac aacgggagtc 4860 tttcgaagat tccccctcct tgtaaaacaa aaaaa 4895 // ID Ag-Loner-1 repbase; DNA; ANG; 6343 BP. XX AC AAAB01008849; XX DT 21-JUL-2009 (Rel. 14.07, Created) DT 29-OCT-2010 (Rel. 15.11, Last updated, Version 2) XX DE Loner non-LTR retrotransposon, a fossilized genomic copy. XX KW I; Non-LTR Retrotransposon; Transposable Element; Loner; KW Ag-Loner-1. XX NM Loner. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-6343 RA Biedler J. and Tu Z.; RT "Non-LTR retrotransposons in the African malaria mosquito, RT Anopheles gambiae: unprecedented diversity and evidence of recent RT activity."; RL Mol. Biol. Evol 20(11), 1811-1825 (2003). XX DR Genbank; AAAB01008849; Positions 2639407 2633065. XX CC ORF2 is corrupted by a frame shift. XX FH Key Location/Qualifiers FT CDS 401..1777 FT /product="Ag-Loner-1_1p" FT /note="ORF1." 
FT /translation="MEAEPMDESDEGFITVKRKSSDPGQMAKKKVLHEPLA FT STSRDDSAKANLKSRAKTYHASFNGAHNVFFMPRTKPLDVRSITASIYKKY FT PGVLDVIRLHPKKLRVTAKDRVQANAIVADPDYTEDYRVYIPGGLVEVIGV FT VDDFEYPLEDILQYGQGAFLYQSSPRVKVLEVKKLYASKMVDGKKVFRETS FT SLRITFEGTVLPNTLYFEGLRVPIRRPFVTKVATCSKCSQIGHSEPYCTNS FT VRCGKCKGPHSTAECKADTQKCFHCKQQWHEVSTCSVYRKTQAEHKKAVLF FT NSKRSFADVLKKTSEVNPFDTLSLLEGNEPLPNLAPLTGVKRSITPRVSIK FT GIRKNKTVPPKKAPRRHVDATQRQSQGPMPPRDVRTRPPSRNPPNPSNLAF FT PSEPRSFSRQSLPSLTEILQSLLDSLSLPEPLSGLIPLAFPFLRGMVRQWL FT SKWPCLSMMVRLDD" XX SQ Sequence 6343 BP; 1717 A; 1499 C; 1456 G; 1671 T; 0 other; cattctgtcg ccaaccacga agcgaatcgg acgttatcga tacgcgctcc aggtggatga 60 ttacggccaa agtcaagcca ctgaccccga ccatacacac cacgaagatt tgcgtacgct 120 tggttgaacg tgaagcgatc tgtgcttgct gtgcgtaaaa gttgtacagt gttaagtgtg 180 tgtagtgtgt ttaattgtct gtgcgttacc cgttagagag agttgagcgc atcgacgata 240 caagtgcgta cggctcgcat ctcttgtgcg ctcgcatctc gtgcgctcgc atcttatcgt 300 gtgcgctctc ctctcatttg tcccgtgctt ggtaagtgtt ccgctcttat cctgttatcc 360 ttgttcccca tacagttaat tcgttggcta gggctctccc atggaggcgg agcccatgga 420 tgaatccgat gaaggcttca ttacagtgaa gcgcaaaagt tctgacccag ggcagatggc 480 caaaaaaaaa gtgttacatg aacctttggc ttcaacatcg cgagatgatt ctgctaaagc 540 gaatttaaaa tcaagggcca aaacatatca tgcttctttt aacggggcac ataatgtttt 600 ttttatgcct cgtaccaagc cgcttgacgt tagatcgatc acagcatcga tctacaaaaa 660 atacccgggt gttttggacg tcatccgact gcatccaaag aagttgcgtg tcaccgcaaa 720 ggaccgggtg caggccaatg caattgtggc cgatccggac tatacggagg attaccgggt 780 ttatattccc ggtggattgg tggaagtcat cggtgtggta gatgactttg aatacccctt 840 agaggatatc cttcaatacg gacaaggggc attcctttac caatcctcgc ctcgtgtcaa 900 ggtcctcgaa gtgaaaaaac tgtacgcttc gaaaatggtt gacggaaaga aggtgttccg 960 tgagacttcc tcactccgga tcactttcga gggtacagta ctgccgaaca cgctgtactt 1020 cgagggtctg cgggtaccca tccgtcgacc ctttgttaca aaggtggcta cctgttctaa 1080 atgcagccag attggacatt ccgagcctta ctgtaccaac tctgttaggt gcggtaagtg 1140 caaaggaccg cattctacgg cagaatgcaa ggctgacacc caaaagtgct ttcattgcaa 1200 acagcaatgg 
cacgaagtgt ctacttgctc agtgtaccgc aaaacgcagg ccgagcataa 1260 aaaagccgtc ctttttaatt caaagaggtc attcgccgat gttttgaaga aaacctcaga 1320 agtgaatccg ttcgatacac tctcacttct tgagggtaat gaaccgttgc ccaatctggc 1380 ccctctaaca ggagtgaaaa gatcgattac accgagggtg tcaatcaaag ggatccggaa 1440 aaacaaaact gttcccccta aaaaggcacc tcgacgtcac gttgatgcga cccaacgcca 1500 atcccagggt cccatgcccc cacgtgatgt tcgtactcgt ccaccttccc gtaaccctcc 1560 caatcctagc aatctggctt tccccagtga acccagatcc ttctccaggc aatccctccc 1620 aagcctgaca gaaatcttac aatccctcct agactccctt tccctcccgg aacccctatc 1680 cgggctaatc cctttggcat ttcctttcct aagaggaatg gtcaggcaat ggttatcaaa 1740 atggccatgt ctcagtatga tggtccgctt ggatgattaa tgcctccaat gaccactata 1800 ctacaatgga actgtagaag ttttttggga aaaattgact cttttaaagt attgataggg 1860 caacacaatt gtaacgcatt tgccctaagc gaaacttggc tctcacccga caaaaatatt 1920 accttcccgg gatataatat tattcgtcaa gatcgacatg atccagctag tgataggcgt 1980 ggtgggggag tgttaattgg tattcgaagt agtcacagct tctacagaat acccctcccc 2040 acacccgagg gaatcgaata cgtcgctata cagacaaaac taggggatct tgacgtttct 2100 attgcttcaa tttatatccc accgggagcc aatctggacc caaagaagat caacaaggat 2160 cttgaaaccc tagtaacggt actaccaaaa ccgtttttca tcctgggcga ctttaacgcc 2220 catggatcag attggggttg tacgcatgac gataatcgtg caccaatcat tagggatatc 2280 tgcgacacat acagtttgac gattttaaac tctggcgaag caactagggt gccctcacct 2340 atggcacgac ccagcgcaat agacctatct ctttgttcat cgtctttagg gctggattct 2400 atgtggaagg taatccaaga cccgcttggc agcgaccatc tgccaataaa gatctcgatt 2460 atcaagagga gtcgcacagt cgatcaagtc cccgttaatt gtgacttaac gaggaacatc 2520 gactggacga aatacggcaa ccaaatgacc gctttgctaa gccgcgtaga acccagtttc 2580 tccgtaaatg aggagtacac aaatctcgtt atggcgataa acgagtgcgc cctcggtgcc 2640 caaacgaggc cgccccccca ggcaaggatt tttaaaagac cacccactcc ttggtgggac 2700 gcggattgta aagcggcatt ctcagcgaag cggaaggctt ttgccaggta tagagacacc 2760 ggctccatgg atctatacat ccattataga ggcctggagc gtaggtgcaa aaacttgctt 2820 aaagcaaaaa aaaaggtcat actggccgag gtatgtcaaa aacctcaagc cttccacttc 2880 gctaacggag cttcagagca 
tggcgaaaag catgcgcaac agcaaagcaa caaacgagag 2940 cgaaagggtt tctggggcat ggctagagcc atttgcacaa aaagtctgcc ctgatttcgc 3000 tcaagcacca ccatttgaac agagtgctca tgggagtgat ccgcaaatgg attcaccgtt 3060 cacaatggtt gagctatcgc ttgctctgta ctccagcaat aattcgtctc caggactgga 3120 tcagattcgg aacaaattgc tccataatct gccagatctg gcgaggaaac ggctgttgag 3180 attattcaac attatgttgg agcttaacac cgtcccgttg gagtggagag aggtgaaagt 3240 agtcaccttg ttgaaacctg gcaaaccggc atcagactat aattcttatc gaccgatagc 3300 aatgctatct tgcttgcgaa aactatttga gaaaatgatt ctttttagac tggacaattg 3360 gcttgaatcc aaaggcctct tgtcaagtac ccaatttggc tttcgcaaag gcaagggtac 3420 caacgattgc ttggcgctgc ttgtgtccga aatcgagatg gctcattctc gtaaagaaat 3480 gatggcatct gtatttcttg acatcaaggg ggcttttgac tcagtatcag tcaatgttct 3540 gtgtcaaaag ttatcatctg cgggcttaac cccaagactg aataacgtct tattcaacct 3600 cctttcggaa aaggcaatga atttcgacaa tggccacatg aaaattcgga gagtcagtta 3660 ctatggacta ccacaagggt cctgtttgag ccccttgttg tataattttt atgtgaacga 3720 tattgatgca tgccttgcac ctggctgtaa cctaaggcaa ttggcggacg acggtgttgt 3780 atcagtcgcc agcaacaaca tcgccgacct tcaaagtcct ttgcaaacca cacttaacaa 3840 tttagaagtg tgggccacaa acctaggtat cgagttttct ccagagaaaa cggaaatgct 3900 gatattctct ttcttttaca acactgagag taatagacgc ttgaatttgg ttgacccaaa 3960 agtcgatatc tttttatatg gtaaaaagat atccattgcc agatcttttc gatacctagg 4020 ggtttggttt gatagcaaaa atgtatggag gacacatatt gactatctgg tacagaaatg 4080 tacaagaaga atcaattttc tcagaacgat taccggactc tggtggggtg cacatcccaa 4140 agacgtcctt aacctatata agacaacgat actgtccgtc ttggagtatg gttgcatatg 4200 cttccactgg gcagccaagt cgcatctaat ccgacttgag aggattcagt atcgttgtct 4260 tagaatcgca ctaggtagca tgaagtcgac tcacaacatg tcactcgaag tgatgtccgg 4320 agtgatgccg ctaaagcttc gttttgagct actatcgctt cgccttttcg tccgctctac 4380 agtatcaaat cccttgataa tcgagaactt tgaaacactg caagagatag gttctaaatg 4440 caaaatcatg aaggtctatc gtgattttgt ttctctacag gtgcacccta gtagttttca 4500 aaacattagt agtgccagct taccagagtc ctacagttcc attttaagtg tagatacctc 4560 gttgagggag gcaattaata ctatccctga 
taatcttcgc tggatgacaa ttcccaacat 4620 ttttgtggaa agacatggcc aaccaaatgt aaaccatttt tacactgacg gatcctcatc 4680 agagcagggc attggttttg gtgtatataa cacgtgtacc gaagcatatt ttaaattacg 4740 ccaaccatgc tcagtttatg tagcggagct cgccgcaatc ttttatgcct tactgttgat 4800 aagtgcatgc cctccggatc agtatgtaat tttttcagac agcctaagtg ctcttgaagc 4860 attaaaatcc gtgaaggcta ttaagagccc agactatttt gtaaaagaaa ttctaaaagt 4920 cctaagctcc ttgtttgaaa aatcgtttcg aatatccctg gtatggttgc ctgctcattg 4980 cggcatctta ggaaacgaga aagcagacca tttggccaag aagggtgctt cggaagggtc 5040 tttttacgat agacctatcc tccctcacga gttcttacga gccccacaag ctttctgcat 5100 tgcacgctgg cagggcttgt gggacacgga tgaacttggg aggttcctgt actcgatctc 5160 tcctagagtt tctttgaaac cttggttccg cgacatctct ggagagcgtg cattcattcg 5220 aatgatgtct agacttaggt ctaatcattt cgcattgggc gcacatctcc agcgtatagg 5280 acaggtcgac acaaaagcat gtggctgcgg ccttggattt cacgacatag accatcttct 5340 atggtcttgc gtggaatacg aggctgcacg acccactttg ttggatgcag tcgaaaaact 5400 gggaagatct cctggtgttc ccatccggga tatactagct gggtcagact ggagccttct 5460 aaggatcatc tttgagttct gcagggctaa cggattaact gtgtaatatt cctataacag 5520 agctattgtg gctggctgtg tattggttgg tcgttgtcgt tgtcatgcac tctcacgccg 5580 atggcatgga tcgcggggtg ctatggaggg acttgctccc tcgtagcatc ttcgcggttt 5640 ttggcagcgg ggtgcggtca tgtatcatcg tcgtgcggct tgctcttgca tggctggctg 5700 aatgagttat gtatcccact cccactccat ctgttcgtct ttgtgctaca tgttgtaaat 5760 cgctagtttg cttgtgatga atgactgtac gtatgaatga atgaatgaat gagcgtggtt 5820 tgtatggatc ttgggtcagc tacagcttga gacagaaggt ctttgcatgg acgaagtcct 5880 tgtaaagata aactcttcaa aaagagttcc gcgagagttc tcaaagcacc tcaactctgc 5940 taccagaggc aagaatgcag aatgatgcaa actaaccatc ccaaagagcc ctggaatatg 6000 gattggatta agacccctgc aatgcccacg gacaagagcg gactacttca acggacaaga 6060 acgaaccaga agaattcgag caagccttgg gaaatttctt ttccacagga taatgacgaa 6120 gcatatgatc tcgacctctg ccccctctta ttggaccact cgacaaggac aaggatcgga 6180 ttttgacagc aacgtccgag cacacccggg atatggtgcc ctcctaaagc tgcgagaagg 6240 aaatgtctcg caacagtcgt gcattaaaga ttaaactgaa 
tttatatgta ctctaatttt 6300 aagcaatacg gcaaagccgt ccctactgaa taaaaaaaaa aaa 6343 // ID hAT-2N_AG repbase; DNA; ANG; 1104 BP. XX AC . XX DT 04-SEP-2010 (Rel. 15.09, Created) DT 04-SEP-2010 (Rel. 15.09, Last updated, Version 2) XX DE This is a family of nonautonomous hAT DNA transposons - a DE consensus sequence. XX KW hAT; DNA transposon; Transposable Element; Nonautonomous; KW nonautonomous DNA transposon; hAT superfamily; TIR; 8-bp TSDs; KW hAT-2_AG; hAT-2N_AG. XX NM L1A_Mim; LTR6_MD; LTR86_MD. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-1104 RA Kapitonov V.V. and Jurka J.; RT "hAT-2N_AG, a family of nonautonomous DNA transposons from RT African malaria mosquito."; RL Repbase Reports 5(2), 45-45 (2005). XX DR [1] (Consensus) XX SQ Sequence 1104 BP; 323 A; 279 C; 236 G; 265 T; 1 other; taggcatggg acaattgtgc gagacctatg ccgcacactc agttcattat agtgtgcggt 60 tgatcgccgc acgcgaaaaa gtgaacgggt aggtcttccg cacgcataca gaactaactg 120 aaatgtgctc tgaacaagat gcaagataac tgtcggtgtc atagactgaa cgaaaaaaaa 180 acattcgttt tgggtcggca gaatttgtca cacacaacga cacaagaaga tcgatcgcac 240 tttgcgcgag aaagaatgca catattgcta gcgtggtcag attgatcaat cgcacacagc 300 cgcacaaaaa gaactcctgc cgcacactga cccctgagtg aatactgagt gagaaagaac 360 actcatagcc gatgggtcat tcgcacacag ccgcacgcat gttcggttgg gtcagtcgca 420 cacataccgc acgcgtgttc agttgggtca gtcgcacact accgcacaaa aagaactctt 480 gtcgcacact gccgcacgcg cgttctgtta gttcggtcgc tcactgctgc acgcgtattc 540 agtggggtca gtcgcacact accgcacaaa aagaactctt gtcgcacact gccgcacgcg 600 cgttcagtta gttcagtcgc acactgccgc actcgtattc agtggggtca gtcgcacact 660 accgcacaaa aagaactctt gtcgcacact gccgcacgcg cgttcagtta gttcagtcgc 720 acaaaaagaa ctcgttaagc gcacagcaac ataaaagaga aagcttgtag attttactcg 780 tttagatcag ctttcgttac aaatcaactt aagtatatac agtctgttct cgagctacgt 840 tttctaagtt cttgtgtgcc ctagttacct agttatgcat gaaccgcgta tctcggactg 
900 actgtattta taattaaaaa tgcagcatta aaagcaataa ctttctgttc ttgcattaaa 960 atctcgataa aatkaaacat gatattttta aaaatggtac ttagctccaa ttgagcgact 1020 gacctaccgc acgcgttcag ttcatcatac tgaacgaact acatcgcacg cgttcactga 1080 aaagaatcaa attgcccaag ccta 1104 // ID BEL19-LTR_AG repbase; DNA; ANG; 302 BP. XX AC . XX DT 16-JUN-2003 (Rel. 8.05, Created) DT 16-JUN-2003 (Rel. 8.05, Last updated, Version 1) XX DE BEL19-LTR_AG is a long terminal repeat of the BEL19_AG LTR DE retrotransposon - a consensus sequence. XX KW BEL; LTR Retrotransposon; Transposable Element; 5-bp TSD; KW BEL19-I_AG; BEL19-LTR_AG; BEL19_AG; BEL superfamily. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-302 RA Kapitonov V.V. and Jurka J.; RT "BEL19_AG, a family of Bel/Pao-like LTR retrotransposons from RT African malaria mosquito."; RL Repbase Reports 3(5), 86-86 (2003). XX DR [1] (Consensus) XX CC BEL19-LTR_AG is a long terminal repeat of BEL19_AG (its internal CC portion is BEL19-I_AG). XX SQ Sequence 302 BP; 111 A; 65 C; 72 G; 54 T; 0 other; tgtcgccgac agcggtaatt aggctgcgcc aaaagggtat gcaatctgca tcacaaccaa 60 caacaacaac aaatgtcaat agcgtagagc gtaggacgta ggaccaaggg ccaaaggctt 120 accaaccaac aacaacaaca aatgtcaata gcgtagggcg tagggccaaa gggccaaggg 180 cttagcgatc gctagagcag catgtgccag caggatgcat acctttcttg gaaaatttgt 240 aaaaacaagg ttttaaatcc ccaggggtaa aaataaatga attatactaa ctggagaaaa 300 ca 302 // ID GYPSY72-LTR_AG repbase; DNA; ANG; 227 BP. XX AC . XX DT 09-JUL-2004 (Rel. 9.06, Created) DT 09-JUL-2004 (Rel. 9.06, Last updated, Version 1) XX DE GYPSY72-LTR_AG is an LTR of retrotransposon GYPSY72_AG - a DE consensus sequence. XX KW Gypsy; LTR Retrotransposon; Transposable Element; KW 4-bp TSD GYPSY72_AG; GYPSY72-I_AG; GYPSY72-LTR_AG; Gypsy clade; KW MDG3 lineage. 
XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-227 RA Tubio M.J., Costas C.J. and Naveira F.H.; RT "GYPSY72_AG, a member of the MDG3 lineage of the Ty3/gypsy group RT of LTR retrotransposons in Anopheles gambiae."; RL Repbase Reports 4(6), 184-184 (2004). XX DR [1] (Consensus) XX CC GYPSY72-LTR is a long terminal repeat of GYPSY72_AG (its CC internal portion is deposited as GYPSY72-I_AG). XX SQ Sequence 227 BP; 75 A; 31 C; 59 G; 62 T; 0 other; tgttaggttt aatttttgct aggacgtgcc gctgtacaaa cgtaccgtag taaaatgtac 60 cacaggcaaa gtagaaagaa acagagggag acaagagtga gacagcaagc gaccaacaaa 120 atgggtataa aaggagatgg atttggaata aagaattaga attgttttgg ctgccagagg 180 aacgagtgtc tgtgttgagc gtttttttta cttttcgatt ctttaca 227 // ID MARINERN4_AG repbase; DNA; ANG; 1279 BP. XX AC . XX DT 12-MAR-2003 (Rel. 8.02, Created) DT 12-MAR-2003 (Rel. 8.02, Last updated, Version 1) XX DE MARINERN4_AG is a nonautonomous DNA transposon - a consensus DE sequence. XX KW Mariner/Tc1; DNA transposon; Transposable Element; Nonautonomous; KW MARINERN4_AG; nonautonomous DNA transposon; KW mariner/Tc1 superfamily. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-1279 RA Kapitonov V.V. and Jurka J.; RT "MARINERN4_AG: a family of nonautonomous mariner/Tc1-like DNA RT transposons from African malaria mosquito."; RL Repbase Reports 3(2), 21-21 (2003). XX DR [1] (Consensus) XX CC There are ~50 copies of MARINERN4_AG in the genome. CC They are ~97% identical to the consensus sequence. CC MARINERN4_AG copies are flanked by 2-bp target site duplications. CC This element has imperfect 16-bp terminal inverted repeats. CC Putative classification: a nonautonomous Mariner/Tc1-like CC DNA transposon. 
This family is composed of several subfamilies. XX SQ Sequence 1279 BP; 321 A; 277 C; 323 G; 358 T; 0 other; cccaacagtc aaaagccgaa gattttatat tgacctttca ctttttctat ctgtctctcg 60 ctctaatttg agcgattttc ccctgatgag tgtccccgag cctcctcgtg tggttgttgt 120 tgttggaagt tgaaaccgaa accccttcgt gcgctgccaa aatcagcgaa gaacgaaatc 180 gagtggctca agcgaaaaaa cgaaaaaaaa tgacgaacga gtgtgctgtg acaataacac 240 acacacgcac aaggcgacac ccgaaatctg gtgcgatacc ggctttcagt ggagcaagta 300 cacacatgca cacatttggc cacaccagcg ctctgttcta gtgcaggtag ttttctgcag 360 tccacggtga tactacacac agccacgcac ttggcgatac acgctgtgtg gtgcggtgcg 420 gcggacgttc agaattcagt ggtgcaaata cacacatgca tacacttggc gacacacgct 480 ctgctgtgcg gtgagatttg tgttctgtgt ccagtggtgc aaatatacac acacacgcac 540 ttggctacac acgcagtacg gcgcggttgg cttggcagaa gcgcatacac acattcacgc 600 agtaggcttt actttcagtc cagtgaggtt ggtgcttgtg tggtgtttca gctcttggtg 660 gtcagcggat aaaccccgta aaaaatggag atcattctaa tttaaggtaa gtaaaagtga 720 ttaaaattat caaccaacat tccccccttc taattaatat catttttctt ccatttcatg 780 tgtgtttcac ggcgaacgga gacaacaaaa cctgtggcag tgtcgtggtg ttggagatgg 840 tggcccgtac agtgcagcga gaagaactag gcgcgacagc gatgtggagc agaaagtgct 900 gtgctgattt taatttttct ctgtgcgagt gtaaaaagtg ttttttaaat tatgtgcgtt 960 aatttagaat taagtgcagt gattattgtg catgtgttgt gtgtgttttt ttttgtattc 1020 tatgttgtgt tcgatattca ataaatcgaa tcatgttgat ataatggtcc gcactgtatc 1080 atttcggaaa gaaatcctgg caaaaaaggg ggagggggct attttaacga agggtcgtgc 1140 cgaaccccgt tcgggttttg cttcggcttc gggtttgctt cggcttcggg acccgacgga 1200 tcggttgaat aatttcgcca actcagcttc aaagccagct tcaatctcag cggatctctg 1260 agtgctggtg accggtggg 1279 // ID RETRO92_AG_LTR repbase; DNA; ANG; 225 BP. XX AC . XX DT 06-FEB-2003 (Rel. 8.01, Created) DT 06-FEB-2003 (Rel. 8.01, Last updated, Version 1) XX DE Anopheles gambiae long terminal repeat from RETRO92_AG DE retrotransposon - a consensus. XX KW Copia; LTR Retrotransposon; Transposable Element; AACOPIA1; KW Long terminal repeat; RETRO92_AG_I; RETRO92_AG_LTR; KW retrotransposon. 
XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-225 RA Jurka J. and Drazkiewicz A.; RT "RETRO92_AG_LTR: LTR retrotransposon from Anopheles gambiae."; RL Repbase Reports 2(12), 17-17 (2002). XX DR [1] (Consensus) XX CC Related to AACOPIA1 from Aedes aegypti. 5 bp target site CC duplication. XX SQ Sequence 225 BP; 60 A; 47 C; 38 G; 80 T; 0 other; tgttggagtt tgcaccgacg cctaatctga cagcagtgtg aacaccttac agtttatact 60 aacctttgcg ttcacacctt tctttatttt gttaccttgt tttgtaccat gtaaatatct 120 tgatcagaca ataaaatgaa ctttgtagtt gagcttagac cgtcagccag aacggaacca 180 ctgcgtttgt tggaatttat tccttttatt aagcttaact caaca 225 // ID GYPSY37-I_AG repbase; DNA; ANG; 4382 BP. XX AC . XX DT 16-APR-2004 (Rel. 9.03, Created) DT 16-APR-2004 (Rel. 9.03, Last updated, Version 1) XX DE GYPSY37-I_AG is an internal portion of retrotransposon GYPSY37_AG DE - a consensus sequence. XX KW Gypsy; LTR Retrotransposon; Transposable Element; 4-bp TSD gag; KW AP protease; GYPSY37-I_AG; GYPSY37-LTR_AG; Gypsy clade; KW MDG3 lineage; RNase-H; reverse transcriptase; KW integrase GYPSY37_AG. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-4382 RA Tubio M.J., Costas C.J. and Naveira F.H.; RT "GYPSY37_AG, a member of the MDG3 lineage of the Ty3/gypsy group RT of LTR retrotransposons in Anopheles gambiae."; RL Repbase Reports 4(3), 66-66 (2004). XX DR [1] (Consensus) XX CC GYPSY37_AG is a family of Gypsy-like LTR retrotransposons CC that, according to the amino acid sequence of its Reverse CC Transcriptase CC RNase and Integrase is CC phylogenetically grouped with representatives of the MDG3 CC lineage of other organisms. 
CC GYPSY29_AG, GYPSY30_AG, GYPSY31_AG, GYPSY32_AG, GYPSY33_AG, CC GYPSY34_AG, CC GYPSY35_AG, GYPSY36_AG and GYPSY38_AG are other CC members of this same lineage in Anopheles gambiae. CC The GYPSY37-I_AG consensus was reconstructed after multiple CC alignment of 5 copies. CC The consensus encodes the 1432-aa GYPSY37_AGp gag-pol like CC polyprotein CC (pos. 69-4364). CC The sequence of the LTRs flanking GYPSY37-I_AG is deposited as CC GYPSY37-LTR_AG. XX FH Key Location/Qualifiers FT CDS 69..4364 FT /product="GYPSY37_AGp" FT /translation="MTTIEELCENLTKIGLARKCQQLGLATSGGKQEMAKR FT IIDHTTMATNDENDGDHQNREIREAVNAREQNAVLQNDDGVSENDDLHTHN FT DDNDDREDGGARAKCSRNDDDERYDDDGVGDNDDGGNDDYNDDDDDEDNDD FT DDDDDALLFQTAIRTSTPAARKERKFHGTSMVYSFRDMEESIDTFGADEGE FT DVRLWLKQLEMISRSARWNNEQMLIMCRKKLTGTARRFVFSLRDASSYSVV FT KKALIKEFAPFVRATDVHRALANRKKKPTETMRDYVYEMQRIALPIELDEP FT SLCEYIVDGVTDDEFYRSTLYEANTVQRLKEKLSIFEKATKNTKVAKKNRH FT DDNADVKKERQRFEGKCKGEVKRKCFNCGGTTHVASECPKKNDGPKCFRCN FT DFGHLSKECPKTQKTNKKDNDARINVVKAAVNEKLSVSVELFGQNIKAVVD FT TGSDISLMRFDLFEELKSNHIQMHESNLKVRGYGGGVSSVRGKTTISATID FT AEIFDIQFYVVPCEAIDSPMLIGMEFLSSVDYSITPEGVTIKKYRTDETVG FT IETKWIRRIAGYIDENELLVPTKYREEVMTLIEDYKPQKNMTDRNQLTITL FT SDLNVVCENPRRLALLEKEVVRKQIDEWLDEGIIQPSQSEYASPIVVVPKK FT DGSYRVCVDYRELNKKIVRDKFPMPNVEEQVDQLAEARVYTTLDLKNSYFH FT VPVDESSRKYTAFVADSGQYEFLRAPFGLCISGSGFGRFINDVLREFIRDG FT TVLAFVDDIIIPSKTEEDGLNAMKRVFEVAAKAGLTFNWKKCVFLQRRVEY FT LGYTIYDGKIEPAPAKIEKLKQFPQPTTVKQLQRFYGLASYFRKFVPSFAG FT IARPLSELLKKDRFTELNDEAMNSFLCLKDILAAYPVLRIFRADGDVELHT FT DASKTAIAGILMQRAEDDGKFHPCYYFSRLTSSAEKNYHSFELEALAVVES FT VRKFRCYLLGRTFKIVTDCMAFKDSVKKKKLNARIAKYVLALSEFDYVMEH FT RPGEKMPHVDALSRANVMIISTTILSKIRMAQNKDDRAKAIIATLERGDTV FT DKFILNNGVIYENDGDKRRLYVPKSMEIDIIRSAHEQGHFGVRKTKERINA FT DYFIVGLDEKIKSCIDTCVPCIIGEKKRGKPEGELQPIPKGDVPLDTLHVD FT HLGPMPSTKKSYGYILTVIDAFTKFVWLFTTKSTTAEEVVKKLQVITSTFG FT NPRRIISDRGSAFTSGHFTRFCENEGIEHHTIATGVPRGNGQVERVHRIII FT PMLTKLSMEKPEEWFKHVARVQKCLNNSWQRTINMTPFELMTGIKMRTKED FT 
AVLHELLSREIQNDFTEGRDELRRTAKRNIEKMQEENCKYYNLRRRPSRQF FT KIGDLVAIPKTQYGVGQKFKPRFYGPYEITRILDNDRYEVKKLDEETEGPK FT KTLTAGSCIKTWILPGRK" XX SQ Sequence 4382 BP; 1440 A; 777 C; 1103 G; 1062 T; 0 other; tgggggctcg tccgggaacg ggctaagtac ggtgataaag tgaaacgacg aaaaaccaag 60 agtgaataat gacgacaatc gaagagctgt gcgaaaacct tacgaagatc ggcctcgcac 120 gcaagtgcca gcagcttggt ttggcgacgt ccggcggtaa gcaggagatg gccaaaagga 180 ttattgacca tacgacgatg gccacaaacg acgagaacga cggtgatcat cagaatcgtg 240 aaattcgtga agccgtgaac gctagagagc agaacgccgt gctgcaaaac gacgacggtg 300 tgagtgagaa cgacgactta cacacacaca acgacgacaa cgacgatcga gaagatggcg 360 gcgcgcgtgc aaaatgttca cgcaacgacg acgacgaacg atacgatgat gatggtgttg 420 gtgacaacga cgacggcggc aacgatgatt acaacgatga tgacgacgat gaagacaacg 480 atgatgacga tgatgacgac gcattattgt tccaaacggc gatcagaact tcgaccccag 540 cggcaaggaa ggaacggaaa tttcatggaa catcgatggt ttattctttc cgcgacatgg 600 aagaaagcat cgacactttt ggtgctgatg aaggtgagga cgtacgatta tggctaaaac 660 aacttgaaat gatttctagg tcggcacgat ggaataatga gcaaatgctg attatgtgta 720 gaaaaaaact gacaggtacg gcgagacgtt ttgtgttttc cttacgcgat gctagcagtt 780 atagtgttgt gaagaaagcg ctaatcaaag agttcgctcc ttttgttcga gcaactgacg 840 tgcatagggc attagctaat cggaagaaga agccaacgga gacgatgcgt gattacgtgt 900 atgaaatgca gcgcattgct ttgcctattg agttagatga acctagttta tgtgagtata 960 ttgtggatgg cgttactgac gacgaatttt atcgctccac attatatgaa gccaatacgg 1020 tacaaaggct gaaagagaaa ctaagcattt ttgagaaggc aacaaagaat accaaggtag 1080 ctaagaagaa tcgacacgat gacaatgctg acgttaagaa agaaaggcaa cgctttgaag 1140 ggaagtgtaa aggtgaagtg aaacgtaaat gtttcaattg cggcggtacc acacatgttg 1200 cttcagagtg tccgaagaaa aacgatggtc cgaagtgctt tagatgcaac gattttggac 1260 atctttccaa agagtgtccc aaaacacaaa agacgaacaa gaaagataat gacgcacgaa 1320 tcaatgtagt gaaagcggct gttaatgaaa aactcagcgt atcagtggaa ttgtttggac 1380 agaacatcaa agcagttgtg gatacaggta gcgatatttc tctaatgaga tttgatcttt 1440 ttgaagaatt gaaaagtaac cacattcaga tgcatgaatc gaacttgaag gtacgaggat 1500 acggtggcgg cgtaagctca gtacgtggta aaacgacgat 
atctgcaaca atcgacgcag 1560 agatttttga tattcaattc tatgtcgttc catgtgaagc cattgattca ccaatgttga 1620 tcggaatgga atttcttagt tcagttgact attctatcac tccagaagga gtaacgataa 1680 agaaatatcg aacagacgag acagttggta tagagacgaa atggattcgg cgtattgcag 1740 gatacattga tgaaaatgaa ttattggttc caaccaagta tcgcgaagaa gtaatgacgc 1800 taattgaaga ctataagcca cagaagaata tgactgatag aaatcagctg acgattactt 1860 tatcagattt gaatgttgtt tgtgagaacc ctcgacgatt ggctctgttg gagaaagagg 1920 tagttagaaa gcaaatcgat gaatggcttg acgaaggaat catacaacca tcgcaaagtg 1980 aatatgcaag tcctattgtc gttgtgccaa agaaggatgg ttcgtaccgc gtttgtgttg 2040 actaccgtga attgaataag aagatcgtgc gtgacaaatt tccaatgcca aatgttgaag 2100 agcaagtcga tcagctcgct gaagctcgag tttacaccac tctcgactta aagaattcct 2160 attttcacgt accagttgat gaaagtagca ggaaatatac ggcattcgtg gcggatagtg 2220 gtcagtatga atttttaaga gcaccatttg gactgtgcat tagtggaagc ggttttggta 2280 gatttattaa cgatgtgctt cgagaattta ttcgagatgg aacagttttg gcgttcgtag 2340 acgatataat cattccgtct aaaactgaag aagatgggct gaatgcgatg aaacgtgtat 2400 ttgaagtagc agcgaaggct ggattgacgt ttaactggaa aaagtgcgta tttctacaac 2460 gacgggtaga gtacttgggg tacacaattt acgatggaaa aatagaacca gcacccgcga 2520 agatcgaaaa gttgaagcag tttccacaac cgacgaccgt gaagcaactg caacgattct 2580 atgggttagc aagctatttt cgaaagtttg tcccatcatt tgctggtatc gcacgaccgt 2640 tatcggaatt attgaagaaa gatcgtttta ctgaattgaa cgacgaagcg atgaattctt 2700 tcctatgttt aaaggatata cttgcagctt atccagtact acgcattttt agagctgacg 2760 gagatgtaga attgcataca gatgctagta aaacagcgat agcaggcatt ttaatgcaac 2820 gagcagaaga cgatggtaaa tttcatccgt gttattactt tagtcgactc acaagtagtg 2880 ctgagaagaa ttatcattcg ttcgagttgg aggcattagc tgtagtagag tcagtacgga 2940 agttcagatg ctatttgctt ggacgtacat ttaagatcgt gactgattgc atggcattca 3000 aggactcagt taagaagaag aagttgaacg caagaattgc caagtacgta ttagctcttt 3060 ctgaattcga ttacgtaatg gaacatcgac caggagaaaa gatgccgcac gttgacgcat 3120 tgtcaagagc taatgtgatg atcatttcaa caaccatctt atcgaagata cgaatggctc 3180 agaataagga tgaccgagct aaggctatta tagcaacatt ggaacgagga 
gatacggttg 3240 acaaatttat tttgaacaat ggcgttattt acgaaaacga tggcgataaa cgacgacttt 3300 acgttccaaa atcaatggag atagacatca taaggtctgc tcatgaacaa ggacattttg 3360 gcgtccgcaa aacaaaagaa cgcattaacg cagattattt tattgttgga ttagacgaaa 3420 agattaagag ttgcattgac acgtgtgttc catgtatcat cggcgaaaag aaaagaggta 3480 aaccagaagg tgaactacaa ccgataccta aaggagacgt accattggat acgttacacg 3540 ttgaccactt aggaccgatg ccgtcgacaa agaagtcata tggatatatt cttacagtca 3600 tagacgcttt cacaaagttt gtctggttgt ttacgactaa atcaacaact gcagaagagg 3660 tcgtaaagaa gcttcaggtg atcacgagta cgtttggcaa cccacgacga atcatcagtg 3720 atcgaggttc tgccttcacg tcaggacatt tcaccaggtt ttgtgaaaac gaaggcattg 3780 agcatcacac gatagcaacg ggagttccgc gaggaaatgg acaagttgag cgtgtccatc 3840 gtatcatcat cccaatgctt acgaagttgt ccatggagaa accggaggaa tggtttaagc 3900 acgtagcgcg agttcaaaaa tgtttgaaca atagctggca gagaacaatt aacatgacac 3960 cgtttgagct aatgaccggt atcaagatgc gtacgaagga agatgctgta ctacacgaac 4020 tactatcacg cgagatacag aatgatttca ctgaaggaag agatgagttg cgcaggacag 4080 cgaagcgtaa tattgaaaaa atgcaagagg agaattgtaa gtactacaat ctgcgtagaa 4140 gaccctcacg acagtttaag atcggagatc ttgtagctat accgaagacg cagtatggag 4200 tagggcaaaa gtttaaaccg cgcttttatg gtccgtatga aatcacgcgt attttagata 4260 acgatcgata tgaagtaaag aagttagatg aggagacaga agggcctaag aaaacattaa 4320 cagcaggaag ttgtattaag acgtggatac ttccggggcg gaagtaatgt caggaaaggc 4380 cg 4382 // ID HELITRON2_AG repbase; DNA; ANG; 6365 BP. XX AC . XX DT 29-JAN-2003 (Rel. 8, Created) DT 29-JAN-2003 (Rel. 8, Last updated, Version 1) XX DE HELITRON2_AG, a rolling-circle DNA transposon - a consensus DE sequence. XX KW Helitron; DNA transposon; Transposable Element; HELITRON class; KW HELITRON2N_AG; HELITRON2_AG; Rep/helicase. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-6365 RA Kapitonov V.V. 
and Jurka J.; RT "HELITRON2_AG."; RL Direct Submission to Repbase Update (26-DEC-2002). XX DR [1] (Consensus) XX CC HELITRON2_AG is a rolling-circle DNA transposon. The consensus CC sequence encodes well-preserved remnants of the ~1600-aa CC Rep/Helicase CC protein (positions 1448-5992). There is not enough sequence data CC to reconstruct it precisely. There is only a 57% nucleotide CC identity CC between HELITRON2_AG and HELITRON1_AG. However, HELITRON2_AG is CC 75% identical to HELITRON1_DM from D. melanogaster. CC Nonautonomous elements derived from HELITRON2_AG are deposited in CC Repbase as HELITRON2N_AG. XX SQ Sequence 6365 BP; 2074 A; 1231 C; 1181 G; 1879 T; 0 other; tctatatata taaaaatctc gtgtcacggt gtttgttgcg agcaacctcc gaaacggctg 60 gatcattttc aacgaaactt tgcacacacc ttatggtggt atgataatag gttttaagac 120 tcacatcatt ttcagaagtt gcactaaacc aaggaccgta aagatatcag gtcaagttat 180 aagcccgttt tgacagctag gtggaagaag aatcgacgaa gaaaactgac agttagtttt 240 ccgcgtttgg cgaacgcttc aagttctcga aggaaaattt ccattttaaa tcacgggcag 300 tgtcctaaca aactaaaata ttcacccaaa ttgatctcca ccaaatatta accatgcaaa 360 acgtaaacaa tccatcaata catatacaag cggaaaacca tctgtcaaaa tcggagccca 420 ggggggggat cgccaaaacc gctataacta cacctcatta tacattccac cttgcactaa 480 acttaattta ttagcaattt tctaactccc atacaaagga catgattgtt gttgttgtat 540 acagcacttt gacagtcagt gtgaagcgcg gtttatacgg acgtcacacg aagcgtaaac 600 tttggcgtaa actgaatttt agccatacaa ccctgaaatg ttttgacagt aagtttacgc 660 tctcccatac aaatcaagcg taaacttttt gagtttacgc ctcgtgtgac gtccgtatta 720 cactcggtgt aaggtgcggt ttacactcgc tgtgaaatgt ctagcggttt acgatgctgt 780 cacaaatatc cataatgaca aaacagagat ggagtacttt agtttgggaa taaagaatta 840 aaaaacacat gctatttcgc tttgtcctat ttgctgtttg atgttccaaa acgaacaaat 900 atgcttgctt taataattgt aacataaaca tgtttggtta ctgatggagt tcaacatcaa 960 ggcaagtgta aacaacattt ttcgttatgg tgttttttct tcggtctgcc cggctgcggt 1020 cttacccggc gctctggcac caatcgaatc tggcaccaat cgaacataca aaatgtatgg 1080 gatttgatat gttcgatttg ctatcaaaca aatcgaacat gtcaaattcc
atacattttt 1140 catgttcgat gtcgtgtttg attggtgcca gagctccacc tttatgtgta catggggcta 1200 ttgaatgagt agcagttttc tagcatacaa gtgacatttg agagcttaga agcaaactca 1260 cgttaatact tttgtttgtg caatttgagt atcaaaactg tgtaacaaca taaaaatttt 1320 aatcgatata tcggaaaaaa cacgtatgta atatgaagtt ccttcaaaaa aatattatta 1380 tccctttatt gatatcatta ttcctttaaa attctgtttt tcattaattg gcgttttata 1440 gtaaacaatg ccagcaagga ggaatatagg acgaagaacg aatacagcag caagtaagag 1500 aaggagcaga actgaaagaa catctgatca aatagatcgt gacaatttaa gtttgcgtga 1560 aggaatgtca cagttacgcg catctcagtc acaacacaat cgaatagggc gaaatcaaca 1620 acgacgatta gaaagagcac aaacgcgccg gttcgtaact aacatccgca gaacatatga 1680 ccaacaacga cagcaagagc atcgagcttt ttcatattct tccttccatc gaattgcatt 1740 tcaatacgag cctgacattg catattataa tcattcaaaa gttttaatag gctccatgga 1800 tcaggaatgt cgcttctgcc atgcacttaa attcaaacat gagccagctg gaatgtgttg 1860 cgcttctgga aaagtttatt tgccatcact tgagacacca ccggaaccgt tgaaaggttt 1920 acttacaggt acagattctg attctgccat gtttctgaaa tccatccgaa aattctattc 1980 gtgtttccaa atgacctctt ttggagcgac cgaaatagtt tccaacgctt cttcaaatgg 2040 tcaaatattc aattctacgt ttaaaattaa aggccaagtt taccacaaag tagggtcgtt 2100 attaccaatg ccaattacta cccctaattt tttacaaatt tatttcatgg gtggtgacga 2160 taacaatcgt gtaaatacgc gttgtagttt cactaacctc agttcgatca gtgccagacg 2220 cattgtctat gaattgggcg cccttttgaa tgagcataat gttttattga attcttttaa 2280 atcacatatg cacgaattaa caagtgataa ttatgctatt gtcattaatc cggataaaac 2340 cccatccgga gaacatgtgc gtagattcaa tgcacctgtc atgaatgacg ttgctggaat 2400 cgttatcggc gaccgcacgt ctacgcgaga aatcgttatc cgaagaagga acaacaatct 2460 ggagttcatt tcggatacac atcgctccta tgatgctctc caatatccac tcatcttctg 2520 gaaagggcaa gacggatact gtattaatat taaacaacgg gatccaatta caggtaatgt 2580 cttactaaaa cagcaacata tttttattct acaaactgta tgtttctttt tatatacatt 2640 ttgttatagg aaacgaaaca aaaagttagc tctatgaatt tttattcata tcgtttgatg 2700 gttagacgca acgaggataa cttaattctt cgatgtcgtg aactattcca acaatttatt 2760 gttgacatgt atgcaaaggt agagaacgaa agactacgat atctacgtca caatcaaaag 
2820 aaactgcgag caaaagaata tatccattag cgagacgcta ttatgagcga tgtaaacagt 2880 gccgatattg gtgataatgt tattctacca tcatcgtatg tgggtagtcc acgtcatatg 2940 caagagtata taaaggatgc aatgacattc gtacgggaat aatgaagacc atgtttattc 3000 atcactttca catgtaatcc aaagtgggaa gagattacat gtttgctttt gcccggacaa 3060 aatgcaacac accgccatga aataacagca cgagtgttca agcaaaaatt gaaatcttta 3120 atgaacctaa taacaaaaat ggatgtattt ggacctacac gttgctggat gtactcagtt 3180 gaatggcaaa agcgaggatt acctcatgct catattttaa tttggttagt cgataaaata 3240 cgtcctgaga caatgatagc ttgatatcgg cggaaattcc cgatccgtca agagatcagc 3300 tactctttga tattgttacg actaacatga ttcatggtcc atgcggcgct cttaattctt 3360 tatcgccttg catggcagaa ggaaaatata caaaacgttt tcctaaacaa ttcaccaatg 3420 agacaatcac taatgtcgat ggatatccga tatatagaag aagaaatact gaaaacggag 3480 ggcattccta cacacataaa attaaccaag actcattcat tgatattgac aatcgatggg 3540 ttgtgccata ttcgccactt ttaacaaaaa catttaacgc tcatatcaat gttgaattct 3600 gcagttcggt gaagagcatc aaatacatct gtaaatacat caacaaaggc tctgatatgg 3660 ccgtattcaa tattcaaaat actgaagtaa atactgctgg atcaattacc aacgacgaat 3720 taacgggcta tcaaattggt cgatacatca gttccaatga agccgtctgg cgtatatttg 3780 gttttcaaat tcatgaacgg tatccggctg ttgttcattt agccgtccat ctcgaaaacg 3840 gtcaacgcgt tttttttctg aggaaaatgc aattgaacgt gctacaaatc caccgagaac 3900 aacactaact gcattttttg aactatgtaa ccgtacggat gtattaggtg cctttgctcg 3960 gactctccac tattccgaag taccgcggtt ttttacatgg catcaaacaa aacaatggat 4020 gccccgtaaa caaggcatac cagtagatgc atgtcctggc ctgtttaaat caaatacctt 4080 aggccgtgtt tataccataa atccaaagca gatagagtgc ttctacctac gacttttgtt 4140 gataaatgtt attggcccct tatcgtttga aaatatacgt acagtaaatg gccaacaaca 4200 ttcaacatat aaagatgcat gtcttgcatt gggcttgctc gaaaatgaaa accattggca 4260 cgatatgatg gctgaagcta ctcttgattg tacagcaacc cagatacgac ttctttttgc 4320 taaagtatta tatgaattat atttttatat aaggattata tgactgacga tatactgcat 4380 agaataagga tgtcccacaa tgacccaacg atgccataca gcgattatat gtataatgag 4440 gcattgatcg ctattgaaga tctatgtatt atcattgcta atttatccct tcatcatttt 4500 
gggttaaatt caccaaatcg agctgcatct gatgtagtgg acactgaagt taatcgtgaa 4560 ctgcaataca atattacagc aatggaagac attgttgctc gcaacgttcc actgttgaat 4620 gatgaacaat caatgattta cgaacaaatc atgctggcag tatcacaggg agagggtggt 4680 ctctttttct tggatgtccc aggtggcact ggaaaaactt ttctaatttc gttaattctt 4740 gccaaaatac gatctaataa tgatatcgcg ttagccgttg catcatctgg aattgcagca 4800 acattattag aaggaggaag aacagcgcat tcagcattta aattaccgtt gaacattcat 4860 aataacccag ctgcagtgtg caatataaaa aaaaaacaat cgtctatggc caaagttctg 4920 caaaactgta aaattatcat atgggatgaa tgtactatgg ctcacaaaaa ttcactggaa 4980 gcacttaata gaacactcaa agatttaaga aataacaatc aattctttgg aggtatttta 5040 ctactcctat cgggtgattt caggcaaaca cttccagtga ttccacgatc aacatatgct 5100 gatgagatca acgcttgctt gaagttgtcg ccattatggc ataattttca aaaggtacag 5160 cttagcaaaa acgtgcgcgt tgaaatactt caggattcat cagctatgac tttctctgat 5220 caattgttag atattggcaa cggaaaagtt cctttccatg gcaatactgg ttgcattaaa 5280 atgccatcaa atttttgtac agtcgtcgat tcctagcaga cactcataga ttgcatattt 5340 cccgatttaa aaactcaata tgtacatcaa tcgtggcttg cagaaagagc tattcttgca 5400 gcaaaaaatg tcgacgtcga agaattaaac ttccaaatac aagatgcatt gcctggtgat 5460 ttggtttcct acaaatcatt tgatacagtt tgtgattccc acgaagctgt taattatccg 5520 acagagttat tgaactcttt aaatttacca ggcatggctc cacatgtact cagattaaaa 5580 gttggatcac cggtaatact gctaagaaac ttgaatccac cacgtttggg taatggcacg 5640 cgactggtaa ttaaaaaatt aaagaaaaat attatagaag caacaattct aaatggcaaa 5700 tttgttaatg aaaatgtgct gctaccacgc attccaatgt tctcaactga ttcaacaatt 5760 gagtttaagc gcgtgcaatt tccaatcaat ctggcttttg caatgacaat taataaatcc 5820 cgaggtcaaa cgatgtctgt ttgcggtttg gacttacgaa catcgtgttt ttcgcatgga 5880 caattatatg tggcatgttc tcgcgtgggt aaaccttcca atctatacat actagctaaa 5940 gacagattaa ccaaaaatat cgtccattca ttagcgcttc gagatttgca ttagagattc 6000 atagtagtta tatgtgtgtg acaagtatgt aattatttgc atgtaaataa atcaatctac 6060 gtaattctac actgttttac tgttccaaat tatgttctta aacatattat ttgagtaatt 6120 aacgacaatg acatcaatga aaacaaataa aatgaataaa aaattaaatg tgaatttaaa 6180 cacctatatt 
tgcaacttaa cggactggca gacgaaggcg cgagaccgtg agcggtttcg 6240 gacactcctg aggcaggcca agaccgcaaa gcggttgtag tgccggataa gtaatttgca 6300 acttcaaata cttaatcggc aactaataaa atgtggggta aaactaggtt taccgggcca 6360 gctag 6365 // ID GYPSY62-I_AG repbase; DNA; ANG; 4382 BP. XX AC . XX DT 09-JUL-2004 (Rel. 9.06, Created) DT 09-JUL-2004 (Rel. 9.06, Last updated, Version 1) XX DE GYPSY62-I_AG is an internal portion of retrotransposon GYPSY62_AG DE - a consensus sequence. XX KW Gypsy; LTR Retrotransposon; Transposable Element; 5-bp TSD gag; KW AP protease; GYPSY62-I_AG; GYPSY62-LTR_AG; Gypsy clade; RNase-H; KW integrase GYPSY62_AG; mag lineage; reverse transcriptase. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-4382 RA Tubio M.J., Costas C.J. and Naveira F.H.; RT "GYPSY62_AG, a member of the mag lineage of the Ty3/gypsy group RT of LTR retrotransposons in Anopheles gambiae."; RL Repbase Reports 4(6), 163-163 (2004). XX DR [1] (Consensus) XX CC GYPSY62_AG is a family of Gypsy-like LTR retrotransposons that, CC according to the amino acid sequence of its reverse CC transcriptase, RNase and integrase domains is phylogenetically CC grouped with representatives of the mag lineage of other CC organisms. GYPSY19_AG, GYPSY20_AG, GYPSY21_AG, GYPSY22_AG, CC GYPSY23_AG, GYPSY24_AG, GYPSY25_AG, GYPSY26_AG, GYPSY27_AG, CC GYPSY28_AG, GYPSY55_AG, GYPSY56_AG, GYPSY57_AG, GYPSY58_AG, CC GYPSY59_AG, GYPSY60_AG, GYPSY61_AG, GYPSY63_AG, GYPSY64_AG, CC GYPSY65_AG, GYPSY66_AG, GYPSY67_AG, GYPSY68_AG and GYPSY69_AG, CC are other members of this same lineage in Anopheles gambiae. CC The GYPSY62-I_AG consensus was reconstructed after multiple CC alignment of 3 copies. The consensus encodes the 1441-aa CC GYPSY62_AGP gag-pol like polyprotein (pos. 179-4560). The CC sequence of the LTRs flanking GYPSY62-I is deposited as CC GYPSY62-LTR_AG. 
CC GYPSY62_AGP: CC MAELQQAILLMTELLQKLAQPNNTEQTLESLATNISEFSFDPENGITFEKWYSRYTDLFESDAK CC NLDDSAKVRLLLRKLDTPSHVRYVNFILPKLPKDVDFAGTVKILSQMFGTHTSIFNKRYQCLQL CC VKSEAEDIISYAGKVNRSCEDFDFKNMNIDQFKCLVFVSGLKGHAYADTRPRLLSRIECETAEA CC PVNLQTLINEYQRLVNLKEDTSMIERQSSSKQAVHVVQEKGSFHHPHTSKPEGKLPRTPCWQCG CC QMHFVRNCPFSEHQCKQCNRIGHKEGYCGCYSKRQTVPGGGEKKTSNQSPSNQPSHQQTNQPAN CC RQNNPRKVNARGVYIVNHIANHSSNRKYVPTLINGVATRLQLDTASDITVISKQTWNNLGNPSI CC IKPTIQAINASGKPLHLMGEFQCDVSINGKTARGRCFVTTTANLNLLGIDWIDLFKLWSVPLDS CC VCNQVTTKSIDQQISEFRAKHANVFNDSLGHCRKLKVKLFLKPNAKPIFCPKRPVPFNTIPLVD CC AELTRLQSLGILEPVDFSEWAAPIVAVRKPNGRVRICADYSTGLNEALEPNHYPLPTPEEIFAQ CC LNGSTVFSIVDLSDAYLQLEVDDDSKKLLTINTHRGLFRFNRLAPGVKSAPGVFQRVVDGMIAD CC IPGVRSFIDDVIVFGKDISSHATSLNLLFQRLKEYGFHVKAEKCHFFQSQLGYLGHIVDKQGIR CC PDPEKVKAIAALPPPTNVPELRSYLGAVNFYGRFVRNIHELRHPMDQLLKKDVKWQWTSACQQA CC FDQFKRTLQSNLLLMHYDPKLPIIVAADASSTGIGAVIFHQFPDGSMKAVQHASRTLAPAERNY CC GQPEKEALALVYAVTKFHKYLLGRHFTLLTDHKPLLSIFGSKKGIPLHTANRLQRWALTMLNYD CC FEIQHVSTNEFGCADLLSRLIDQTIQPEEEYVIAALSLEEDLVSILADTTEKVPVSFAALQKAT CC TTDATLQSVAQHIRDGWPSCSKNLSAAVQPYFLRRESLSMIDGCVMFGNRVIVPAQFRRRILQQ CC FHRGHPGIVRMKAIARSFVYWPGIDNEIEDFVKRCNPCSTAGKTPTKTTLESWPVPTKPWSRIH CC IDYAGPVDGVYFLIIADPYSKWPEVYMTKTTTAKATIKLLKQSFATFGVPEVIVSDNGTQFSSH CC EFLTFCTSEGIRHIRIAPYHPQSNGLAERFVDTLKRSLRKIRSGGETLEEALVTFLQVYRSTPS CC SDLGGKSPAELMFNRPLRTVSALLRPTTSEDGHTVRRDRTLQNDAFNRKHGAVKRSFQHGESVY CC VKVHQANSWQWKPGTIIEKIGSVNYNIFLEDQQRLIRSHANQLKSRFTETSQHTDQSTPLSIFF CC DNFGLQFPVEVPSTEIYVDVDSQDEEAVAVSQSNSFLTEDISEDEHENSDGSMVEPVDNEQTPV CC PVELSEQATSTMDRPRRTIQLPARFKPYWMFKP. 
XX SQ Sequence 4382 BP; 1246 A; 1136 C; 1001 G; 999 T; 0 other; atttggcgac gaggattttg aaattttgaa gcacaacgaa agaagcgatg gcagaactac 60 agcaagcaat ccttttgatg acggaactcc tgcagaagct ggcacagccg aacaatacgg 120 agcaaactct cgaatccttg gcaaccaaca tcagtgaatt ttcgttcgac ccggaaaatg 180 gaatcacttt cgaaaagtgg tacagccggt atacggatct tttcgaatcg gatgccaaga 240 atttagacga ttccgccaag gtacgcctgc tgttgaggaa gttggacaca ccatcccacg 300 tcaggtacgt caacttcatc ttgccgaagc ttcccaaaga cgtcgatttt gctggcaccg 360 tcaagatact atcgcaaatg ttcggcacac acacctccat ttttaacaag cggtaccagt 420 gcctccagct agtaaaatct gaagcagagg acatcattag ctacgcaggg aaggtgaatc 480 gttcatgcga agatttcgac ttcaaaaata tgaacatcga tcagttcaag tgtttggtgt 540 ttgtgagtgg ccttaagggt catgcctacg ccgacacgcg accaaggctt ctttcccgga 600 ttgaatgtga aacagcagaa gctcccgtga atcttcagac cttgattaac gaatatcaac 660 gactcgttaa tctcaaggag gacacatcga tgattgagcg ccagtcaagc tcgaaacaag 720 cggttcacgt tgtccaggag aagggaagtt ttcatcatcc acacacttca aaaccggaag 780 gtaagctgcc tcgcacccct tgttggcaat gcggacaaat gcattttgtc cggaactgtc 840 cgttttcgga gcaccagtgc aaacagtgca accgaattgg ccacaaggaa ggctattgcg 900 gctgttattc caaacggcag acagttccag gcggagggga aaagaaaaca tcgaatcagt 960 caccatcgaa tcagccatct catcagcaaa cgaatcagcc agcaaataga caaaacaacc 1020 ctcggaaagt caacgctaga ggagtataca tcgtgaacca catcgctaac cattccagca 1080 accgaaagta cgtgcctact ttgatcaacg gcgtggctac caggctccaa ctggacacag 1140 caagcgatat cacggtgata tcaaagcaaa catggaacaa cttgggtaac ccttcaatca 1200 tcaaaccgac gatccaagca atcaacgcgt cgggcaaacc acttcatctg atgggtgagt 1260 tccagtgcga cgtcagcatc aacggaaaaa ctgctagagg cagatgtttc gtcacgacga 1320 ctgccaacct caaccttctc ggcatcgact ggattgacct gttcaagctg tggtcagttc 1380 ctctcgactc cgtttgcaat caagtaacga cgaagtcgat cgatcagcag attagtgaat 1440 ttcgagcaaa gcatgcaaat gttttcaatg attcactggg acactgcagg aaattgaagg 1500 taaagctttt tctcaaacca aatgctaaac ctattttctg tccaaagcgc ccagttcctt 1560 tcaacaccat tccattggtc gatgctgaac tcacacggtt acaatcattg ggcatcctcg 1620 aacctgtcga tttctccgaa tgggctgctc 
ccatcgtggc agtgcgcaaa cctaacggtc 1680 gagtgcgcat atgtgcagac tactctacgg ggctaaacga ggcgttggaa ccaaatcact 1740 atccgcttcc gacgccggag gaaattttcg cccaactcaa cggcagcacc gttttcagca 1800 tcgtcgatct ttctgatgcg tatctgcagc ttgaggtgga tgacgactcc aagaagctgc 1860 tcacaatcaa cacgcatcga ggattgttcc ggttcaatcg cctggctccc ggagtgaaat 1920 ctgcgccggg cgtattccaa cgtgtggtgg atggaatgat agcagacata cctggtgtcc 1980 gctctttcat cgatgatgtc atcgtgtttg gaaaggacat cagctcacac gcaacgtcac 2040 tcaacctact cttccagcgg ctcaaagagt atggttttca cgtgaaggct gagaagtgtc 2100 atttctttca atcacaactc ggctatttgg ggcacattgt ggacaaacaa ggcattcgtc 2160 cagaccccga gaaagtgaag gctatcgctg cactcccacc gccaaccaac gttcccgagc 2220 ttcgatcgta cctgggtgca gtcaacttct acggtaggtt cgtgcgcaac attcacgaat 2280 tgcggcatcc aatggaccag ctgttgaaga aggacgtgaa atggcagtgg acatcggcat 2340 gccaacaggc tttcgaccag ttcaaaagga cgcttcaatc aaacctgctg ctaatgcatt 2400 acgaccctaa gcttccaatc atcgtggctg cggatgcatc aagcacaggc atcggggcag 2460 tcatctttca ccagttccct gatggcagca tgaaggcagt acaacacgct tcgaggacgc 2520 tcgcacctgc ggagcgtaat tatggacaac cagagaagga agcgcttgcg ttggtctatg 2580 cagtgaccaa attccacaag tatctgttgg gacgccattt cacactgctg actgaccaca 2640 agccactact ttccatcttc ggttctaaga aggggattcc cctacacact gcgaaccgac 2700 tacaacggtg ggctctcacg atgctgaatt acgacttcga gattcagcat gtatccacca 2760 acgaattcgg atgtgcagat ctgttgtccc gactgatcga ccagaccatc caacccgaag 2820 aagaatatgt tatcgctgcg ctgagccttg aggaagattt ggtaagcatt cttgctgaca 2880 caacagaaaa ggttcctgtt tcattcgcag cattgcaaaa ggctacaaca accgacgcca 2940 cactccaatc tgttgctcaa cacatacgcg acggatggcc cagctgctcc aagaaccttt 3000 ccgctgcagt tcaaccgtat tttctacgac gggaatcgct cagcatgatt gacggatgtg 3060 tcatgttcgg caacagagtc atcgttccgg cccagtttcg acgaaggatc ctgcaacagt 3120 ttcatcgagg tcatccggga attgttcgga tgaaggcaat tgcacgaagc tttgtgtact 3180 ggcctggcat cgataacgaa attgaggatt tcgtaaaacg ctgcaatccg tgttcgactg 3240 ctggcaagac tcccaccaag accacgttgg aatcttggcc agtacctacc aaaccatggt 3300 cgaggatcca tatcgactac gcaggtccgg tagatggagt 
ttacttcctg atcatagcgg 3360 atccctactc caagtggcct gaggtataca tgaccaaaac aacaacagcg aaagccacga 3420 tcaaattgct gaagcaatca ttcgcaacat tcggagttcc ggaagtaatc gtttcggaca 3480 acggcaccca gttttcaagc cacgaattcc tgacgttctg cactagtgaa ggtattcgac 3540 acattcgaat tgctccatac catccacaat ctaacggtct tgctgaaaga tttgtggaca 3600 cgttgaaaag aagccttcga aaaattcgtt cggggggaga gacgctcgaa gaagctctag 3660 tcactttctt acaagtgtac cgctccactc cttccagtga tctgggtgga aaatccccgg 3720 ctgaattgat gttcaacaga cctcttcgta ctgtttccgc attgctacgg ccaaccacaa 3780 gtgaagatgg tcacacagtg cgaagagata gaacattgca gaacgatgcg ttcaaccgca 3840 agcacggggc tgtcaagaga agttttcagc acggagaatc ggtttacgta aaggttcatc 3900 aagcaaattc ttggcaatgg aaacctggca ccataattga aaagatcggt tctgtcaact 3960 acaacatttt ccttgaggat caacaacgac tgataagatc gcatgccaac cagctcaagt 4020 cgcgcttcac cgaaaccagc caacataccg atcagtctac tccactatct atctttttcg 4080 acaatttcgg cctacaattt cctgttgaag taccatcaac cgaaatttat gtagatgttg 4140 attcacagga tgaagaagca gtagccgtca gccagagcaa ctccttctta acagaagata 4200 tttcagaaga tgaacacgaa aacagcgacg gttcaatggt cgaaccagta gacaacgaac 4260 aaactcctgt accagtagaa ctaagcgaac aggcaacatc cacaatggac agaccgcgac 4320 gaacaatcca acttcctgct cgtttcaagc cctattggat gttcaaacct taagggggga 4380 aa 4382 // ID HELITRON1_AG repbase; DNA; ANG; 6666 BP. XX AC . XX DT 11-NOV-2002 (Rel. 7.1, Created) DT 20-MAY-2005 (Rel. 10.06, Last updated, Version 2) XX DE HELITRON1_AG, a rolling-circle DNA transposon - a consensus DE sequence. XX KW Helitron; DNA transposon; Transposable Element; HELITRON class; KW HELITRON1_AG; Rep/helicase. XX NM HELITRON1_AG. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-6666 RA Kapitonov V.V. and Jurka J.; RT "HELITRON1_AG, a rolling-circle DNA transposon from African RT malaria mosquito."; RL Repbase Reports 2(10), 8-8 (2002). 
XX DR [1] (Consensus) XX CC HELITRON1_AG is a first example of HELITRONs (a rolling-circle CC DNA transposons) present in insects. The A. gambiae genome CC contains many families of HELITRONs. HELITRON1_AF is one of them. CC A defective copy of a HELITRON-like element is also present CC in the D. melanogaster genome (AE002840, positions 9827-9264). CC HELITRON1_AG has all basic hallmarks of HELITRONs: it is CC flanked by 5'-TC and CTAG-3', is inserted into the AT target CC site and encodes a 1792-aa HELITRON-like Rep/Helicase protein, CC called AGHEL1p. XX FH Key Location/Qualifiers FT CDS 820..6195 FT /product="AGHEL1p" FT /note="HELITRON-like Rep/Helicase protein" FT /translation="MQPSVSDIVSQNLTPPEETPNEKKARLQRKRQALYRA FT KKRLPGAAPTLAAVQDDQQAVPSTLAGSSLSAASAILQQQQRRNTDDELLN FT EIAGRYSTPPDETPPEKKARLARKRQALYRAKKRLAGAAPTVAAVQDDQQA FT VPSTSAGTSLSGAPAILQQQQLRRNIDDDLLNEIAGRYSTPPDETPPEKKA FT RLARKRQALYNARKRMVQTPNVATVVNRAPAPASAPVPAPAPAPTPALVDP FT TAAVAIRVPIHQQLQTYVRHLRDHEADQQRQFIARRRMTGHLRVAHGNHES FT YSCPSLHTLAPRVPCNFCSALKWPGEPPSVCCNSGQVVLPPFPEPPAELRQ FT LFDVPQFLLNIRRYNNAFAFTSIGASIRGNDPVRQDLRVGHGGIYNYRIQG FT ALCHRIGSLAVLPGRPPCYAQLYFYDSSSEGQYDEMLNARAKAYGSELNRE FT ILAILQRVFSTHNPIAEMFKHAYERMSVRDDLRLCIHSRIPGIDQRRYNAP FT TADEVGGVFVCDDTTGATEHTRDIVLQHRATGYLQPVFDTNQLADALQYPI FT LFPRGETGWTYGMPKVQRRTRQQQQPSRPTSSSNGAAHIDENDDAAEDPNP FT ELDGAERSANQITPREYAAYRIAWRENAQTLMHRAGRLFQQYCVDQYCKIE FT MQRLKYLREHQAQLRTEAYTGFMDLAGAEGTIADLHPDNLPNVAVEIPVVQ FT EPPVAPRRSIIDPPSNLSRTGTRVILGPSFVGSNRYMRAQYQDAMAIVRAL FT GKPDLFITVTCNPKWPEITQCLLPRQQAPDRPDVIVRVFRLKLKAILNDLT FT MGALGIEVARIHVIEFQKRGLPHAHILVILAEEDKPQTPADYDKIVSAELP FT NPATSSQLFETVQSCMMHGPCGAANPAAPCMKDGTCEKGFPKSFCEQTRSM FT DNGYPQYRRRNNGRSVTVKGIELDNRYVVPYNPWFTHKYNCHINVEVCTSI FT SSVKYLYKYVYKGHDRLSVTLAVGNDEIQQHIDARYLSPTDSCWRIMRFEL FT QAKTHTVVTLPIHLENQQNVFFRANETVSCVLNRGNHTMLTRFFQLAAHDN FT FARALLYHDVPTYYRYAKPTANQRLPWQEPGTKHWIRRIRTGHKTVIGRMV FT SCSMQLMERYCLRLLLCYRKGPTSYEDLRTVNGSVCETFQQAAINEGLLED FT DSEWDRALEEAATYRMPAQLRHFFALILSSGMPQNPRTLWESYASEMSEDF FT 
HHRNRDRYTTEESDLNKLLRDVEHFRALSDIDRYLRGTTPSKVLTSSPGMP FT QLSEYEHVQAHIMDDDDVNEFIIAERSYLITDLDATLATVHQLNDEQRMVY FT ETVTAAIDRQLATAASQANAGDQRLFFLDGPGGTGKSFLVEKILAHVRRCG FT EIALATAASGIAALLLTGGKTVHSTFKLPLDLNNHSTCSITVQSKRAEMLR FT QTALIVWDEASMSSRFALEAVDRTLQDITGVQLPFGGKVVLLSGDFRQILP FT IVPKGTDAQIINECIKKSTLWPLFRSLQLRDNMRVRTAPNANQASELRDFA FT NLLLRIGEGRHDTFAGLDPSLAKIPHDMIVPHTANPTNDLNTLIDKIYPDM FT QRHFQHPSFFSDRAILSPLNVDVASVNNLVLDRIPGPEQEYRSVDTLVNPE FT EHEHLQLPSEYLNTLNVSGIPVHRLRLKRFAPVLLLRNLNSDMGLCNGTRL FT QIVGLKRNCLHAKILTGTRRGEDVLLPRIFCDSNDKGHPFQIRRKQFPVQV FT CFAMTINKSQGQSLHHLGLYLPQDVFAHGQLYVALSRVTSRANIAVLIPNP FT KRADEEGVSTSNIVYREVVDR" XX SQ Sequence 6666 BP; 1706 A; 1761 C; 1639 G; 1560 T; 0 other; tctatatata aattttcgtt aacgctgacg ctcgatgtca cactttgcgt tacgctgttt 60 ctatctcgcc cgttctgcct tcgctgccgt ataacaccat cacgctcgca cgaccgcacg 120 acgaataccc ctaccatcgt gagcacgacc aacgctgcag gggtgctcaa ccgtttcctc 180 gctttccgaa ctcgaccgag cagtgcgcac actatcgact cgcccttgcg agcccctgcc 240 agcgcatgcg cgtacgagca agcgagcgag cgagcgaacg atcgagttgg ctacgcgata 300 tgatatattt tttgacaaaa cgaatatcgt ttattaatac ggttataggt gcgatttaca 360 cttttttaag ttgatttcca gtttttttag tttagtttga gtattaggtt aatttcttta 420 ggtgcgcatg gtgaagtata gtgtagtgta gtttagaaaa gtgttgtgta ttgtagtgta 480 gtttagtacg cagggcagtg tagatttagt gtaaggagat tattttcgta acgtacgttt 540 cgtgtacgtt ttgtggttca atttgttgtt cctgcgtgga acgtgtgttc ctgtgttcct 600 gtgttcctgt gtttctgtgt tccggcgttt tcttgcaatt ctgtttttgc tacgatatgg 660 agcataatga ggaaagtgtg ggcaacggtt gccctatgat tgacatcatg tcaggtataa 720 tatttcctac tttcttttat tcacagcact tatctaaaat tctttttttt tgttttacaa 780 ttttcaacag ccccagttac agaaccctat ccacaaacca tgcagccctc cgtaagtgat 840 atcgtctcac agaatctcac tccaccggaa gaaactccga acgaaaagaa agcaaggctt 900 caaaggaagc gacaggcgtt gtacagggcc aagaagcgcc tcccgggtgc agctcctacg 960 ctggcagctg tacaggacga ccaacaggct gtaccttcaa ctttggccgg gtcatcctta 1020 tcagcagctt ctgcaatcct gcaacaacaa caacgaagga atacggacga cgaattgctc 1080 aatgaaatcg ctggacgata ctccacccca ccggatgaaa 
caccacccga gaagaaggca 1140 aggctcgcca gaaagcgaca agcgctgtac agggccaaga agcgcctcgc gggagcagct 1200 cctacggtgg cagctgtaca ggacgaccaa caggctgtac cgtctacttc ggccgggaca 1260 tccttatcag gagctccagc aatcttgcaa caacaacaac tacgaaggaa tattgacgat 1320 gatttgctca atgaaatcgc tggacggtac tccaccccac cggatgaaac accacccgag 1380 aagaaggcaa ggctcgccag aaagcgacag gcactgtaca atgccaggaa gcgcatggtg 1440 caaactccga atgtcgcaac tgtggtcaac cgcgccccag caccagcatc agcaccagta 1500 ccagcaccag caccagcacc aacaccagca ttggtagacc caactgcagc agttgctatc 1560 cgagtgccaa tacatcagca gctacaaaca tacgtgcgcc atcttcgcga ccatgaagca 1620 gatcagcagc gacaattcat agcccggcgt cgtatgaccg gccatctgag agtagcacat 1680 gggaaccatg agtcgtacag ttgtccctcg cttcatacgc ttgcaccacg cgtaccttgc 1740 aatttctgtt ccgctctcaa gtggcctgga gagccgccca gtgtgtgctg caacagtgga 1800 caagttgtgc ttccaccgtt tcccgaaccg ccagcagagt tgcgccaact atttgacgtg 1860 ccgcaatttc tgctcaacat ccggcgatac aataatgcgt ttgccttcac ctctatcggt 1920 gcctcaatac gaggaaatga cccagtacgc caggatctac gagtcggtca cggtggaatt 1980 tacaactatc gcatacaagg tgcgctttgc catcgaatcg gttcgctcgc tgtgcttcca 2040 ggacgtccac catgctacgc acagctgtac ttttatgatt caagctcgga gggtcagtac 2100 gacgaaatgc tgaacgctcg ggccaaagca tatggtagtg agctgaatcg tgaaatattg 2160 gccattttgc aacgtgtttt ttcgactcac aatccgattg cagagatgtt taagcatgcg 2220 tacgaacgga tgtctgttcg agatgatctg cggctttgca tccactcgcg tataccaggc 2280 atcgatcaac ggcgttacaa tgcaccgact gcagatgaag ttggtggcgt gttcgtgtgc 2340 gatgatacta ccggagccac ggaacataca cgcgatattg tgctgcagca tcgtgccaca 2400 ggatacttgc agcctgtgtt tgataccaac cagttggcag atgcactcca atacccgatt 2460 ctgtttccgc ggggtgaaac cggttggact tacggtatgc caaaggtaca gcgccgaacc 2520 agacagcagc aacaaccaag cagaccaaca agttcgagca acggagcggc gcacatagat 2580 gaaaatgatg atgctgctga ggatcctaat cctgagttgg atggagcaga gcgttcagca 2640 aatcaaataa cgccacgcga atatgccgca tatcgcatag cctggcggga aaacgctcaa 2700 acgcttatgc atcgtgccgg acgattgttt cagcagtact gtgtagacca gtactgtaag 2760 atagagatgc agcggttgaa atatttgcga gagcaccagg cacaacttcg 
cactgaggcg 2820 tacacgggct ttatggatct tgccggcgct gaaggtacca tcgcggatct tcatcccgat 2880 aatttgccta atgtggctgt ggaaatccct gtagtgcaag aaccacctgt tgcaccccgt 2940 cgttccatca tcgatccgcc ctcgaacctg tcgcgtacgg gcacgcgtgt tattctcgga 3000 ccatcatttg tcggtagcaa tcgatatatg cgagcgcaat atcaagacgc aatggctatt 3060 gttcgagctt tgggtaaacc ggacctgttt atcaccgtca cgtgcaatcc gaagtggccg 3120 gaaatcacac agtgcctact tccacgtcag caggccccgg atcgacccga tgttatcgta 3180 cgtgtctttc ggttaaagct gaaagccata ctgaatgacc taaccatggg agctctcggg 3240 attgaggtag ctcgcattca cgtgatcgag ttccaaaaac gtggccttcc ccatgcacac 3300 atactcgtga ttctcgccga ggaagacaaa ccgcagaccc cggcagacta cgacaagatc 3360 gtgtctgccg aacttcctaa tcctgcaacg tcgtcgcaac tgttcgagac ggtacagagc 3420 tgtatgatgc atggaccgtg tggggctgcc aatcctgccg caccttgcat gaaggatggt 3480 acatgcgaaa aagggttccc aaagtcattc tgcgaacaga cgcgcagcat ggataatggc 3540 tatccacagt accgtcgtcg taacaatggg cgcagtgtga cggtgaaagg aatcgagctg 3600 gacaaccggt acgtcgtgcc ctacaaccca tggttcacgc ataagtacaa ttgccacatc 3660 aacgttgagg tgtgtacttc gatcagcagc gtgaaatatt tgtacaaata cgtgtacaag 3720 gggcacgacc gtctgagcgt taccctggcg gtaggcaacg atgaaattca gcagcatatc 3780 gatgcacgct acctttcacc gacggacagc tgctggcgaa taatgcgctt tgagttgcaa 3840 gcaaaaacgc acaccgttgt tacactgcct atccatctgg aaaatcaaca gaacgtgttt 3900 ttccgggcaa acgaaaccgt ttcgtgcgtg ttaaaccgtg gtaaccatac catgttgaca 3960 cgattcttcc agctggcggc acatgacaac tttgccagag cgctgctgta ccacgatgtc 4020 ccgacgtact accggtacgc gaaaccaact gcgaaccagc gactaccgtg gcaggaaccg 4080 ggaacaaagc actggatccg tcgcatacgt accgggcaca agacagtgat cggccgaatg 4140 gtgtcctgta gcatgcagct tatggaacgc tactgcttgc ggttgcttct ttgctaccgc 4200 aagggtccaa catcgtacga ggatctgcgt actgttaacg gttcggtgtg cgaaacgttt 4260 cagcaggctg ccatcaacga agggctgctt gaggatgatt ccgagtggga tcgtgctcta 4320 gaagaagcgg ccacttatcg tatgcccgcc cagttgcgcc atttcttcgc actcatcctt 4380 tcgtctggga tgccacaaaa cccgcgcacc ctgtgggaaa gctatgcgag tgaaatgagt 4440 gaagattttc accaccgcaa tcgcgaccgg tacacgacgg aggaatccga cctgaataag 
4500 ctgctacgtg atgttgaaca cttccgggcg ctgagtgaca tcgatcgcta cctgcgcggt 4560 acgacgccat cgaaggtttt gaccagttcc ccaggaatgc ctcagctttc ggagtatgaa 4620 cacgttcagg cacacatcat ggacgatgat gatgtcaacg agttcattat cgctgaacga 4680 tcgtatctga tcacggatct cgatgctacc cttgccaccg tacatcagct caacgacgaa 4740 caacgcatgg tgtacgaaac ggtcacggcg gcaatcgatc gtcaattagc gacggcggca 4800 agccaagcga acgctggtga ccaacggtta ttcttcctgg atggccctgg tggaacgggt 4860 aaatcttttt tggtagaaaa gatactggca cacgtccgtc gctgcggaga aattgcgctc 4920 gcaactgcag caagcggcat agcagcactg ttgcttacag gagggaaaac agtacactcc 4980 acgtttaagt tgccgctgga cttaaacaat cattccacct gtagcattac ggttcagtcg 5040 aaacgggccg aaatgcttcg acaaacagca ctgatcgttt gggatgaggc gtcgatgagc 5100 agtcggtttg ctctcgaagc agtcgatcgg accctgcagg acataacggg tgtgcagctt 5160 cctttcggcg gtaaggtggt gctgctgtcc ggtgactttc ggcaaatttt accgatcgta 5220 ccgaagggca cggatgcaca aatcatcaac gagtgcatca agaagagcac attatggccc 5280 ctgtttaggt cgctacaatt gcgcgataac atgcgggtac gcacggcacc aaacgcgaac 5340 caagccagtg agttgcgaga ttttgccaac cttctgcttc gtatcggtga aggacggcac 5400 gatacgtttg caggactgga tccatcgttg gcaaaaatac cgcacgatat gattgtgccc 5460 catactgcga atccgacaaa cgaccttaac accctgatcg acaaaatcta cccggacatg 5520 caacggcact tccaacatcc gtcattcttt tcggatcggg ctattctgtc gccgcttaac 5580 gtggatgttg ccagcgtgaa caacctggtc ctagaccgaa ttcctggacc ggaacaggag 5640 taccgttcgg tcgatacatt ggtcaacccg gaggaacacg agcatctgca acttccttcc 5700 gaatacttga acacactcaa tgtcagcggc atcccagtgc atcgcttgcg gctgaaacga 5760 tttgcaccag tacttttatt gcgcaatctt aattccgaca tgggattgtg taacggtacg 5820 cgtttgcaaa ttgtaggcct aaagcgaaac tgtttacacg ctaaaattct gacaggcacg 5880 cggagaggcg aagacgtcct gcttccacgg atcttctgtg acagcaacga taagggtcac 5940 ccgttccaga tccgccgtaa acagtttccg gtgcaagtgt gctttgcgat gaccatcaac 6000 aagtcgcaag ggcaatcgct tcaccatttg ggcctatatc tgccgcaaga tgttttcgcc 6060 catggccagc tatacgttgc actctcgcgg gtgacatcac gagcgaacat tgctgtgctg 6120 ataccgaacc cgaaacgcgc tgacgaggaa ggtgtctcca caagcaacat cgtctaccga 6180 
gaggtcgttg acagatgatt cttctactga accagagtgt gacgtaatta ctgaaaacag 6240 aaaattagac attacataaa tataaatatc cattctttca gtttgaaccg gctattcact 6300 aacaatacat ctaccagaac ctagaacaag acggcagcaa aatacaaaca atacatgtgt 6360 acatcttcct tggctatcct tactgcattt tcgtttcatg cagcgcctgc aaccaaaacg 6420 atcacgtaaa cgacgggaca atcaattgaa cttccatgct ggaaggtgga gctttacttc 6480 atgcaccacg ttccatgcgc acctgacttt gactgaacac cccacatttc caggctctcc 6540 agcaatcttt ccttggaatg ccatgagact tccaatttcc acgaggaacg aggtggcatg 6600 ctccagttgt ttctgtgcag cctggagttt ccgtgcgtag cactgtaacc gggcctctta 6660 agctag 6666 // ID RETRO33_AG_LTR repbase; DNA; ANG; 479 BP. XX AC . XX DT 06-FEB-2003 (Rel. 8.01, Created) DT 06-FEB-2003 (Rel. 8.01, Last updated, Version 1) XX DE Anopheles gambiae long terminal repeat from RETRO33_AG DE retrotransposon - a consensus. XX KW BEL; LTR Retrotransposon; Transposable Element; KW Long terminal repeat; RETRO33_AG_I; RETRO33_AG_LTR; ROO; KW retrotransposon. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-479 RA Jurka J. and Drazkiewicz A.; RT "RETRO33_AG_LTR: LTR retrotransposon from Anopheles gambiae."; RL Repbase Reports 2(12), 9-9 (2002). XX DR [1] (Consensus) XX CC Related to BEL and ROO from Drosophila melanogaster. 5 bp target CC site duplication. 
XX SQ Sequence 479 BP; 150 A; 102 C; 98 G; 129 T; 0 other; tgttggattt tgagataggc agaacacacc cgcaagttag gcaatgaatt aataacgccc 60 atgcataggc aacacgcatt acatgtattt tcccattaga cagaaatccc atttaaacca 120 gttcaaatcc agaatccagt tggaatagtg ccaacttaac aagaattaaa caagaaaatg 180 tatgtatcgc taacggagca cgtaggctcc tgataagaac accccagaaa ttaccaataa 240 atcgagttag aattccagtg taattactta gttcttcctc caagtttttg gtgagcattg 300 ctcttattgg gaaattcgat tctcattaag agtggcaaga cctctttggc aaaatccgtt 360 gttagatagg cggaagtttg catcggggtg ccagcctttt tcaaggcttt tgctcgggag 420 tactgctccc ctttctaaaa atacagtcca cgagacggag taggccgtag aaacctaca 479 // ID Mariner1_AG repbase; DNA; ANG; 943 BP. XX AC . XX DT 28-FEB-2009 (Rel. 14.02, Created) DT 01-MAR-2009 (Rel. 14.02, Last updated, Version 1) XX DE Mariner DNA transposon - a consensus sequence. XX KW Mariner/Tc1; DNA transposon; Transposable Element; Mariner1_AG. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-943 RA Jurka J.; RT "Putative mariner/Tc1-like DNA transposons from African malaria RT mosquito."; RL Repbase Reports 9(2), 645-645 (2009). XX DR [1] (Consensus) XX CC TA TSD. 
XX FH Key Location/Qualifiers FT CDS 141..821 FT /product="Mariner1_AG_1p" FT /translation="MPKKSTLWKRITCIMMIRAGRDNRDIMQCVQCSLNTV FT KAIRSELGENSDDDEAVAARKPHSQRRDCVRTDAFLADLQARVMENPGIGI FT RPLAREMGVADEYISMLDTVVKPWITRVANGRPYVFQQDSAPYHTASKTIK FT WLAANFNDFTAPNVWPPSSPDLNPMDYFVWGAVERDTNRTSSNTKAELMAK FT IRSVFAALPRETVARACSRFRRRVEAVIEAEGGYFE*" XX SQ Sequence 943 BP; 257 A; 229 C; 259 G; 198 T; 0 other; tacagggcgt tcgggataat tttgacagtt acatttttgt ttataactcg tgatcgagtt 60 ggcataggaa aacaagcaat acgtcattcg acagctaaaa gtgtacggaa caagcagcga 120 aaacatcccg tgataaaaaa atgccaaaaa agagtacatt gtggaagcgg atcacatgca 180 tcatgatgat ccgcgccgga cgcgacaacc gcgacatcat gcagtgcgtg cagtgttcgc 240 tgaacacggt gaaggcgatc cggagtgagc tgggagaaaa cagcgacgac gacgaagcgg 300 tggcagcacg aaaaccacat tctcagcggc gggattgcgt gcggaccgac gctttcctgg 360 ctgacctcca ggcgcgcgtg atggaaaatc cggggatcgg aattcggccg ttggcacgtg 420 agatgggggt ggcggacgag tacatttcca tgctggacac cgtcgtgaag ccttggatca 480 cgagagtggc caacggcaga ccgtacgtgt tccagcagga ttccgctccg taccatacag 540 cctccaaaac gataaagtgg ttggcggcca atttcaacga cttcaccgcg ccgaatgtgt 600 ggcctcccag ctccccggat cttaatccaa tggattattt cgtgtggggc gcggtggaac 660 gggacaccaa cagaacctcc agcaacacca aggcggagct gatggcgaaa atcaggtccg 720 ttttcgcggc cctaccccgc gaaactgtcg ccagggcttg ttcccggttc cggagacggg 780 tggaggccgt aatagaagcc gagggcggat attttgaata aaatgaataa aatacatggt 840 aagctactat ttctaaaaca aaaaaaaatt ccccgatgat taattttgcc ataatttttt 900 tttctccgta ggtgactgtc aaaattatcc cgaacgccct gta 943 // ID RTE-1_AG repbase; DNA; ANG; 3314 BP. XX AC . XX DT 13-DEC-2002 (Rel. 7.11, Created) DT 29-OCT-2010 (Rel. 15.11, Last updated, Version 2) XX DE RTE-1_AG is a RTE-like non-LTR retrotransposon - a consensus DE sequence. XX KW RTE; Non-LTR Retrotransposon; Transposable Element; KW reverse transcriptase; endonuclease; RTE clade; AGRP1; RTE-1_AG; KW Ag-JAMMIN-2. XX NM RTE-1_AG. 
XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1522-2432 RA Reiss A.R.; RT "A study of repetitive DNA elements in the malaria vector, RT Anopheles gambiae."; RL Thesis (1991). XX RN [2] RP 1522-2432 RA Reiss A.R., MacIntyre J.R. and Hagedorn H.H.; RT "A repetitive element of the Malaria vector, Anopheles gambiae."; RL Unpublished (1993). XX RN [3] RP 1-3314 RA Kapitonov V.V. and Jurka J.; RT "RTE-1_AG, a family of RTE-like non-LTR retrotransposons from RT African malaria mosquito."; RL Repbase Reports 2(11), 24-23 (2002). XX RN [4] RP 1-3314 RA Biedler J. and Tu Z.; RT "Non-LTR Retrotransposons in the African Malaria Mosquito, RT Anopheles gambiae: Unprecedented Diversity and Evidence of Recent RT Activity."; RL Molecular Biology and Evolution 20(11), 1811-1825 (2003). XX DR [3] (Consensus) XX CC RTE-1_AG is a family of RTE-like non-LTR retrotransposons. CC The RTE-1_AG consensus sequence was reconstructed based on CC multiple alignment of ~100 copies identified in the CC sequenced portion of the genome. Given the ~2% divergence CC of these copies from the consensus sequence, transposition of CC RTE-1_AG occurred less than 1 million years ago. The CC RTE-1_AG family is composed of several subfamilies. CC The consensus sequence encodes a 972-aa RTE-1_AGp protein CC (positions 373-3288), which is composed of the AP endonuclease CC and reverse transcriptase domains. The 3' terminus is composed CC of the TAAG microsatellite. CC This is the same element as Ag-JAMMIN-2 in [4]. 
XX FH Key Location/Qualifiers FT CDS 373..3288 FT /product="RTE-1_AGp" FT /translation="MGSWNVRTLSEAGALKKLDDALATLSMDLVALQEIRW FT LGNGVHNRRGKHCYDIYYSCHDRHRVLGTGFAVGPRLKPAIMDFKAINDRL FT CTLRMRGKFFNISLINVHAPTEDKEEEEKDLFYGRLARIVDACPRHDLIII FT LGDFNAKVGREPMYRQYTGCHSLHEHSNDNGSRLVQFAAANNLVVGSTKFA FT RKKIHKITWAHPGGESFNQIDHVLISRRRQSSLLNVRTYRGANIDSDHYLV FT GLVIRCRIARPRANGGGENTQARLNTDSLRDIAVQQEFKTALEESLLPEDR FT YETTSERWNALKTKIINCARNILPPRRGNTKSGWFDDECRQVTERKNTAYR FT AMQQRHRTRACAEEYSRLRREEKRVHRSKKHALEEQNMRELEQTREAYGPT FT RKFYQAIAGHRNNVVPKVTCCRNKDGDLVSNQPEVLSRWAQYFDELLNDQF FT SEQLEAPLADNVMLLPPSIEETRKAIRRLKNNKAPGTDGIAAELVKNGGAR FT LENEIHQIVTEVWDSESMPCDWNLGIIYPIYKKGDRLDCNNYRGITVLNTA FT YKIFSLILQDRLVPHVEEIVGNYQRGFRNGKSTTDQIFTMRQILEKMAEYK FT NDTYHLFIDFKAAYDSIARVKLYDAMSSFGIPAKLIRLVRMTMTNVTCQVR FT VDGKLSGPFATTKGLRQGDGLACLLFNLALERAIRDSRVETTGTIFYKSTQ FT ILAYADDIDIIGLRLSYVAEAYQGIEQAAENLGLQINEAKTKLMVATSADL FT PINNPNLRRRDVQIGERTFEVVPEFTYLGSKVSNDNSMEVELRARMLAANR FT SFYSLKKQFTSKNLSRRTKLGLYSTYIVPVLTYASETWTLSKSDEALLAAF FT ERKMLRRILGPVCVEGQWRSRYNDELYEMYGDLTVVQRIKLARLRWAGHVV FT RMETDDPARKVFLGRPQGQRRRGRPKLRWQDGVEASAIKAGITDWQTKARD FT RERFRTLLRQAKTAKRL" XX SQ Sequence 3314 BP; 968 A; 825 C; 878 G; 643 T; 0 other; tgctctgtaa tgggatggga tccgaagccc cattgaggga taatacaggc tctcccatcc 60 aactcctatt ccgacacgtc ctcgtcgtgc agagtggtaa catgtgtcac cacatatcca 120 agcttgggta cacgggcttg acccaacccc ttgggcggat ggtggcatat ggcgaaccag 180 gaggggggtg ggtatcccgg gaaactgggt gcctgaacgt cgggggggca acccgacgct 240 aaacaaaacg gtcacgtggg cgcggggcca gtcacccagt cccatgaatc aacgactacg 300 gaaagcacca gaaattttcg aaacggacct aacgataccg accctacgca atgcccccgg 360 actactctaa aaatgggctc atggaacgta cgcactctaa gcgaagccgg agccttgaaa 420 aaacttgatg atgccctagc cacactgagc atggacctcg tagctctaca agagattcgg 480 tggctaggga acggtgtgca caacaggcgt ggtaagcatt gctacgacat atactacagc 540 tgccacgacc gccaccgcgt gctcggaacg ggtttcgccg taggtccccg gttgaaaccc 600 gcaatcatgg atttcaaggc tataaacgat aggctatgca ccctgcgcat gcgaggcaaa 660 ttctttaata taagcctcat 
aaacgttcac gcccctaccg aagataaaga ggaagaggag 720 aaggaccttt tttacggccg cctcgctaga attgtagatg cgtgccccag gcatgacctc 780 ataatcatcc tgggggactt caacgcaaaa gtcggtaggg agccaatgta ccgccaatac 840 actggctgtc acagtctgca tgagcacagt aacgataatg gtagtagatt ggtccagttc 900 gccgcagcga acaatctggt tgtaggaagt accaaatttg cgcgcaagaa aatccacaag 960 attacgtggg cgcacccggg tggagaatcc ttcaaccaga tcgaccacgt gttaataagc 1020 cgccgacgac agtcgagtct gttaaatgtc agaacatatc gaggagccaa tatcgattcc 1080 gatcactact tggttggctt agtgatacgt tgtagaatcg cccgcccccg cgccaatggg 1140 ggcggagaaa acacgcaggc tcggctcaac acggactctc taagggacat tgctgtccaa 1200 caggaattca aaaccgcttt agaagagtct ctactaccag aagacagata cgaaactacg 1260 agcgagaggt ggaacgctct aaaaacaaaa ataataaact gtgcaagaaa tatactccca 1320 ccacgtcgtg gcaacaccaa atctggctgg ttcgacgatg aatgcagaca agtgaccgaa 1380 cgtaagaata ctgcataccg agcaatgcag caacggcata gaacgcgggc atgcgcagag 1440 gaatattcac ggcttagacg cgaagagaaa cgagttcacc gctccaagaa gcatgctttg 1500 gaagagcaaa acatgcggga actcgagcaa accagagagg cgtacggacc gacacgaaag 1560 ttttaccaag cgatagcagg tcaccgaaac aacgttgtac ctaaggtaac ctgctgtcgc 1620 aacaaggatg gagatctggt cagtaaccag ccagaggtcc tctcgcggtg ggctcagtac 1680 tttgatgaat tactcaatga ccagtttagc gaacagctag aagcgccact agcagataat 1740 gtcatgctac tgccacctag catagaagaa acacgaaagg ctatccgtcg gctgaaaaat 1800 aacaaggcac ccggaaccga cggaattgca gccgaactgg tcaagaatgg aggtgcacga 1860 ctagaaaacg agattcatca aattgttact gaggtgtggg atagcgaatc gatgccttgt 1920 gattggaatc tcggcatcat ctaccccata tacaagaagg gagacaggtt ggactgcaac 1980 aactacaggg gtattacggt gttgaatacc gcctataaaa tattctccct gatccttcag 2040 gatcgccttg tcccgcacgt cgaagagata gtaggaaact atcaaagagg attccgaaac 2100 ggaaaatcaa ccactgatca gatcttcacc atgcggcaga tcttggagaa gatggctgaa 2160 tacaaaaacg acacatacca tctcttcata gacttcaaag ccgcatacga tagcatagcc 2220 agggtaaaac tgtacgacgc tatgagctca tttggaatcc cggccaaact gataaggcta 2280 gttagaatga ctatgaccaa cgtcacatgc caggtgaggg tggatggaaa actctcagga 2340 ccttttgcta ccaccaaggg tctgcgccag 
ggggacgggc ttgcctgtct cctattcaac 2400 ttggcgctag agagggccat ccgcgactcg agggtggaga ctacgggaac catcttctat 2460 aagtcaaccc agatcctggc atacgctgat gatatagaca tcattggtct gcggctctcc 2520 tatgtagcag aagcctacca agggattgag caggcggcag agaacctcgg attgcagata 2580 aacgaggcaa agaccaaact gatggtggca acatcagcgg acctaccaat aaataatcca 2640 aatctacgta ggcgtgatgt acagataggt gaacgcactt ttgaagtcgt cccagaattc 2700 acctatcttg ggtcaaaggt cagcaacgac aacagtatgg aagttgagtt gcgcgcaagg 2760 atgctggctg ccaaccggtc attctacagc ctgaaaaagc agttcacctc aaagaacctg 2820 tcgcgacgga cgaagctggg actatatagt acctatatag taccagtact cacatacgcc 2880 tctgagacat ggacactgtc caaatctgac gaagccctct tagccgcgtt cgagaggaag 2940 atgctcagaa ggatacttgg ccccgtatgt gtggaaggac aatggaggag ccgctataat 3000 gacgagctat acgagatgta cggcgacctc actgtcgtac agcgtattaa gctcgccagg 3060 ctccggtggg ctggccatgt tgtacgcatg gaaacggacg acccagcccg taaagtcttt 3120 ttaggccgtc cacaaggaca gaggaggcgt ggtaggccca aattgaggtg gcaagatggc 3180 gtggaggcgt ccgccattaa ggccgggata acggactggc agacgaaggc gcgagaccgt 3240 gagcggtttc ggacactcct gaggcaggcc aagaccgcaa agcggttgta gcgccggata 3300 agtaagtaag taag 3314 // ID Tc1-1_AG repbase; DNA; ANG; 1413 BP. XX AC . XX DT 08-MAY-2003 (Rel. 8.04, Created) DT 08-MAY-2003 (Rel. 8.04, Last updated, Version 1) XX DE Tc1-1_AG is an autonomous DNA transposon - a consensus sequence. XX KW Mariner/Tc1; DNA transposon; Transposable Element; Tc1-1_AG; Topi; KW Autonomous DNA transposon; mariner/Tc1 superfamily. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RA Grossman L.G., Cornel J.A., Salazar-Rafferty C., Robertson M.H. RA and Collins H.F.; RT "Tsessebe, Topi and Tiang: Three distinct subfamilies of Tc1-like RT transposable elements in the malaria vector, Anopheles gambiae."; RL Direct Submission to Genbank (JAN-1999). XX RN [2] RP 1-1413 RA Kapitonov V.V. 
and Jurka J.; RT "Tc1-1_AG, a family of autonomous mariner/Tc1-like DNA RT transposons from African malaria mosquito."; RL Repbase Reports 3(4), 84-84 (2003). XX DR [1] (Consensus) XX CC There are ~50 copies of Tc1-1_AG in the genome. CC They are ~98% identical to the consensus sequence. CC This element has 24-bp terminal inverted repeats. CC This transposon is inserted preferentially into the GTAC target CC site (the TA target site duplication). CC Tc1-1_AG encodes the 332-aa Tc1-1_AGp transposase (pos. CC 272-1267). CC RADPFKTCTRIKQELGLQVSAKTVSRRLHAAGFCARRPRKVRKLxPHHVEARIRFAEEHLAASIFWWSKI CC IFSDESRINLDGSDGIKYVWRFPNQAYHPKNTIKTLSHGGGHVMVWGCFSWHGTGPLFRINGTLNSEGYR CC KILSRKMLPYARQQFGDEEHYIFQHDNDSKHTSRTVKCYLANQDVQVLPWPALSPDLNPIENLWSTLKRQ CC LKNQPARSADDLWTRCKVMWERIxRSECRNLIGDMAKRCQEVIANNGHQIDR. XX FH Key Location/Qualifiers FT CDS 272..481 FT /product="Tc1-1_AGp" FT /translation="MGRGKHCTPEERKHIQGLYRENVPIKTICKAFGRSRT FT FVDNAIRSEATGKSTGRPRKTTADVDAQIVEMI" XX SQ Sequence 1413 BP; 378 A; 327 C; 347 G; 352 T; 9 other; cactggtgga catwaaaata ggaaaaaaaa attttktgtg ttttttttgc rtacactagg 60 tgtgtcatcg ccaacagttc gaggcgtact gaatttgcaa tagccgawtt ttgcctttta 120 tactctcatt tttggtgcgc tacttatctg agttttgagt gccggcagcg ttttgcggca 180 gtataaaagg agagccaaac cgtgttgygc ttcattcttc catcagtggt caaggtgaaa 240 ggttgctatt taaaaattcc acaaaatcat catgggtcgc ggaaagcact gcacaccaga 300 ggagcgaaaa cacattcagg ggctgtatcg tgagaacgtg ccaatcaaga caatctgcaa 360 ggcgttcggc cgttcgcgga cgtttgtgga caacgccatt cgtagcgagg ctacgggtaa 420 atcgacaggt cgtccgcgaa aaacaacggc cgacgttgac gcgcagattg ttgagatgat 480 cagggcggat cctttcaaaa cttgcactcg tatcaagcag gagcttggtt tgcaagtttc 540 ggcgaaaacg gtgtctcgcc gtttacacgc cgctgggttc tgtgcccgga gaccgmggaa 600 ggttcgtaag ctgcwgccgc accacgtaga agcgcgcatt cggtttgccg aagaacattt 660 agctgcatcc atcttttggt ggagcaaaat cattttttcg gatgagtcca gaatcaatct 720 ggatggttcg gacggcatta aatacgtctg gcgctttcct aatcaggcgt atcatccgaa 780 aaatacgata aaaaccctaa gtcacggagg cggccacgta atggtgtggg 
gttgcttctc 840 ctggcacggc acgggtcctt tgttccgtat caacgggaca ctgaactcgg aagggtatag 900 aaaaatactt agtcgtaaga tgttgccata cgcccgacaa caattcggag acgaagagca 960 ttacatcttt cagcatgata acgactctaa gcacacatcg cgaacagtta aatgttattt 1020 ggcaaaccag gatgtgcaag ttctaccgtg gcctgcgttg agtcctgacc tcaacccgat 1080 tgaaaatctg tggtcaactc tcaagcgtca gcttaagaac cagcctgcac gttcagccga 1140 tgatctatgg acacgctgca aggttatgtg ggaacgcata smcagaagcg aatgccgtaa 1200 tctcatcggc gatatggcca aacgctgtca ggaagtgata gcgaataacg gtcaccagat 1260 tgaccgttag aatgtgtttt cgctcagtgg aacaccgcaa cacctcctcc caacaaccac 1320 ttaaacagtc cttttcaggg ctaccaaata tttctgacaa gaactatgtc tttcggtcac 1380 agaaaaaagt tcctattttt atgtccacca gtg 1413 // ID piggyBac1_AG repbase; DNA; ANG; 3338 BP. XX AC . XX DT 03-APR-2003 (Rel. 8.03, Created) DT 03-APR-2003 (Rel. 8.03, Last updated, Version 1) XX DE piggyBac1_AG is an autonomous DNA transposon - a consensus DE sequence. XX KW piggyBac; DNA transposon; Transposable Element; Nonautonomous; KW nonautonomous DNA transposon; piggyBac superfamily; piggyBac1_AG; KW piggyBacN1_AG. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-3338 RA Kapitonov V.V. and Jurka J.; RT "piggyBac1_AG: a family of autonomous piggyBac-like DNA RT transposons from African malaria mosquito."; RL Repbase Reports 3(3), 65-65 (2003). XX DR [1] (Consensus) XX CC There are 5 copies of piggyBacN9_AG in the genome, CC they are ~98% identical to the consensus sequence. CC This family is characterized by the TTAA target site CC duplications, CC and by the 17-bp terminal inverted repeats (1 mismatch). CC The genome harbors several families of piggyBac-like elements. CC An ancestral form of piggyBac1_AG was directly related to CC the piggyBacN1_AG nonautonomous family. CC piggyBac1_AG encodes the 552-aa piggyBac1_AGp transposase CC (positions 1165-2820). 
XX FH Key Location/Qualifiers FT CDS 1165..2820 FT /product="piggyBac1_AGp" FT /translation="MLKKYNYQSIILSKLIVMTITQNLSNFQAQHITHDGT FT IWSSQPTIRRKILAHNILRSSQSGPTRKTEGLSIIKTFKLLMSDEMIDIIV FT RETNRKAQQIYEREQVTNSAKSASMHTWKTLTTSEFEAYLGILLLAGVMRS FT NYVHSTELWKTSSHPIFRATMSLQRFRSINRFIRFDDGRTREVRKEMDKSA FT AISDIFAMLNRNLQACYVAGSHVTVDEQLYPYRGGTGFTQYIPSKPAKYGI FT KVWWVCDAVTSYPIKGQIYTGLAPSGQRERHQGERVVKDLCRIFRGSGRSV FT ICDIFFTSYNLASSLMSDFKLALLGTVNKRRTFVPPMFANPHGREIQSTLY FT GFSENISICSYIPKKNKSVVMLSTMHYDKDVQGPKEKPAMIIDYNKFKGGV FT DNMDKCLSEYSTKRKTNRWPLAFFFNILDIAAFAAFKIYKENNLQNCQSTD FT YRRMFLQQLSEQLTMPEIQRRSENVQIMRHFGPRSGVESMLGGPLIALSKK FT NTTNDVEERDVSGRIKHKGNCYLCVKKRSTRKSCSTCKKPICTLHSVIKTS FT CKSCD" XX SQ Sequence 3338 BP; 1162 A; 564 C; 593 G; 962 T; 57 other; ccgtcatgtg tacgaccgcc tacccgggta agttttgcat ctttatttca aaatatcttt 60 gaattgagaa gtcgtatcaa ctccaaattt ttaaaatgcc tcagaaaaga tgttttctaa 120 ccatgtttaa aaaataaaaa atttcaaaaa tatttgaatt tattttttat tatttaactt 180 tatataccct tcacatgtac aaccgcctac ccgggtaggc catacatttt gtatggaagg 240 atgatgtttt tcttggttat aagggtatat attttgagtt acttgtaaga tcagaaagaa 300 aaccgtctca taggttcata ataatgttta ctaacatatc gwagtaaaat tagaatttta 360 aaaatcctrt maatattttt ttcaatcaaa aatatgttgt aactccaaca ccccaaccgg 420 ccttgccrca agtttkgagt agatgcttcg aatttttatt gcttgttgtg taacaatata 480 ttgaaaataa cacaaaaatc attttmaaat gatawmaacy astastrayw wacgaaatak 540 aatwckmtgg ctgaaaktty acacacacat tcatattttt rttatagtgt twctgcgraa 600 acactctctg gcctacgcta gttttctgmc gtgtctctct gcrtctgcct gcacmagcaa 660 attgcgtatt tttggctcat ttcggtagta ggyarrkwar kygaacaatg aaaaaccaag 720 attctcggaa aatattgaac acaatcacaa tcagacaaga cttcacgaga cggcacgtgc 780 agctcgatcg cacaaagaat attgttcact tcatgttcaa aaacgaatcg agaaggttgt 840 gttagtacag tgatcaagag attggttgaa taatttgcgt taacaatctt tcgaaagaag 900 tgagtagtgg aaaaatatag tgaatgtatt gttgagatcg aaatcgtttt ttatcacatt 960 ttttactgtt tactctagaa atggaggaaa ttgaatggct aactccagag gaactttcaa 1020 atcttactga agaagaaaga acagcttata ctcaacaaaa aatacatgaa 
tttatggaat 1080 acggtagcga agatgaaaat atgatttctg cagtagtaga agaagatgaa attcctgtat 1140 ttgatcaagt gatcatcgat tctgatgctg aagaagtaca actaccagag tataatcctc 1200 tcgaaactga tagtgatgac gattacgcag aatctttcca actttcaagc gcaacatata 1260 acacacgatg gaactatttg gtcaagtcaa ccaaccatac gccgtaaaat attagcccac 1320 aatattttac ggtcttcaca gtcaggtcca acgcgcaaaa cggaaggact tagcataata 1380 aagaccttca agctgttgat gtctgacgaa atgattgata tcattgttcg ggaaacaaat 1440 aggaaagcgc aacaaattta tgaacgcgag caagtcacaa actcagcgaa atcagcaagc 1500 atgcatacat ggaaaacatt aaccacatca gaattcgaag catatcttgg gattttatta 1560 ctagccggag ttatgcgttc taattacgta cactctacgg aactatggaa aacatcttca 1620 catccaatat ttcgtgcaac aatgagtctc caacgatttc ggtcgatcaa tcgtttcatt 1680 cgctttgacg acggacgaac gagagaagtt cgaaaggaaa tggacaaatc agcagcaata 1740 tcggatatat ttgcaatgtt gaatagaaat ttacaagcat gctatgttgc aggttcgcat 1800 gtaaccgttg atgaacagtt atatccttat cgtggtggca caggatttac ccagtatata 1860 ccatctaaac cagcgaagta tgggataaaa gtgtggtggg tatgtgacgc agtcacatca 1920 taccccatca aaggccaaat atatacaggg ttagcaccat ctggacaaag ggaaagacac 1980 caaggtgagc gtgtagtaaa agatttatgt agaatatttc gcggaagtgg tcgtagcgta 2040 atttgtgata tttttttcac aagttacaac ttggcctcat cattaatgtc agatttcaaa 2100 ctagcacttt taggaacagt aaacaaacga cgaacgtttg ttccgccaat gtttgctaac 2160 cctcacggta gggaaatcca atcgacatta tatggcttta gcgaaaacat ttccatttgt 2220 tcctatattc ctaagaagaa caagtccgtt gttatgttgt caacaatgca ttatgacaaa 2280 gatgtccaag gtccaaaaga aaaacctgca atgataattg attataataa attcaaggga 2340 ggagtcgaca acatggacaa atgcctttct gaatattcta ccaagaggaa gacgaataga 2400 tggcctttag catttttttt caacattttg gacatagctg catttgctgc tttcaaaata 2460 tataaggaaa ataatttgca aaactgccaa tctacggatt atagacgcat gtttttgcaa 2520 caattgtccg aacaactaac gatgcccgag attcagagac gttcggaaaa tgtgcaaata 2580 atgcgacact ttggaccacg cagcggagtg gaaagcatgt taggaggacc attgatagca 2640 ttatctaaaa aaaatactac taacgatgta gaagaacgtg atgtgtccgg acgaatcaaa 2700 cacaaaggca attgctatct atgtgtaaaa aaacgatcaa cgagaaaatc gtgttctact 
2760 tgtaaaaagc caatttgtac gctgcacagt gttattaaaa ctagttgtaa atcgtgtgat 2820 taaaaagatg racttttatg taaaaggaat tatattggca tttgttyrrm awyaaaaact 2880 tgaaataaay atatgaaaat taaaaaaaaa acataataar catatcttat agatcgtcaa 2940 catatakttt tatacaacaa ggaaaacaac caaaagatgt atactttttg aaawtttawg 3000 atttagattt aacttttgaa agraacgtaa rtamaactta gaaaaagaca taaacatcca 3060 ctcaaacttg mggcaaggcc ggttrgggtt ttggagttac aacatatttt tgattgaaaa 3120 atwtwttgat aggattttta aaattctawt tttwctacgm tatgtttagt aaacattatt 3180 ataaacctat gagacagttt tcattcaaat ttaacaaata actcaaaaga tatacgcttt 3240 taaccacgaa aaacatcatt cctccacaca aaatgtatgg cctacccagg taggcggttg 3300 tacggttacg tagagttttt tggtcgtaca caagacgg 3338 // ID MARINERN7_AG repbase; DNA; ANG; 379 BP. XX AC . XX DT 12-MAR-2003 (Rel. 8.02, Created) DT 12-MAR-2003 (Rel. 8.02, Last updated, Version 1) XX DE MARINERN7_AG is a nonautonomous DNA transposon - a consensus DE sequence. XX KW Mariner/Tc1; DNA transposon; Transposable Element; Nonautonomous; KW MARINERN7_AG; nonautonomous DNA transposon; KW mariner/Tc1 superfamily. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-379 RA Kapitonov V.V. and Jurka J.; RT "MARINERN7_AG: a family of nonautonomous mariner/Tc1-like DNA RT transposons from African malaria mosquito."; RL Repbase Reports 3(2), 24-24 (2003). XX DR [1] (Consensus) XX CC There are ~500 copies of MARINERN7_AG in the genome (multiple CC subfamilies), they are ~95% identical to the consensus sequence. CC MARINERN7_AG copies are flanked by the TA target site CC duplications. CC This element has 23-bp terminal inverted repeats. CC Classification: a nonautonomous Mariner/Tc1-like DNA transposon. 
XX SQ Sequence 379 BP; 120 A; 82 C; 71 G; 106 T; 0 other; cagtagaacg tcgattatcc gggcagctcg ggaccggacg gttgccggtt aatcgatttg 60 cacggataat ggtccaagaa atgtcaaatt catataaaaa ttataaaatt cactagtttt 120 atgattaatt caccttgaat caatcgatta atcgttagta gaatattatt ttaatcaata 180 actataggtt taacaactgt tcatgaacaa aaatgaattt cgaaaatacc actagacact 240 acagagctgc acaaaaaact ggttgcccac ttgactatcg ctgcaaatgt tcgccgtacg 300 tttgaacagc tgtcacattt atgcgcacgg ttaagccgcc cgccggttaa tccgcccccg 360 gataatcgac gttctactg 379 // ID GYPSY15-I_AG repbase; DNA; ANG; 5806 BP. XX AC . XX DT 03-OCT-2003 (Rel. 8.09, Created) DT 03-OCT-2003 (Rel. 8.09, Last updated, Version 1) XX DE GYPSY15-I_AG is an internal portion of retrotransposon GYPSY15_AG DE - a consensus sequence. XX KW Gypsy; LTR Retrotransposon; Transposable Element; 4-bp TSD gag; KW AP protease; GYPSY15-I_AG; GYPSY15-LTR_AG; Gypsy clade; RNase-H; KW integrase GYPSY15_AG; mdg1 lineage; reverse transcriptase. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-5806 RA Tubio M.J., Costas C.J. and Naveira F.H.; RT "GYPSY15_AG, a member of the mdg1 lineage of the Ty3/gypsy group RT of LTR retrotransposons."; RL Repbase Reports 3(9), 171-171 (2003). XX DR [1] (Consensus) XX CC GYPSY15_AG is a family of Gypsy-like LTR retrotransposons CC that, according to the amino acid sequence of its ORF2, is CC phylogenetically grouped with Drosophila representatives CC of the mdg1 lineage. CC GYPSY8_AG, GYPSY9_AG, GYPSY10_AG, GYPSY11_AG, GYPSY12_AG, CC GYPSY13_AG, CC GYPSY14_AG, GYPSY16_AG, and GYPSY17_AG are other CC members of this same lineage in Anopheles gambiae. CC The GYPSY15-I_AG consensus was reconstructed after multiple CC alignment of 6 copies. CC The consensus encodes the 424-aa GYPSY15_AG1p gag-like protein CC (pos. 857-2128) and the 1223-aa GYPSY15_AG2p (pos. 2089-5757). 
CC The sequence of the LTRs flanking GYPSY15-I_AG is deposited as CC GYPSY15-LTR_AG. XX FH Key Location/Qualifiers FT CDS 2089..5757 FT /product="GYPSY15_AG2p" FT /translation="FFRPLSSIRKYPILTINTDADNFVKVKIEIAKEIYST FT LIIDTGATVSVLKASKLKPGCKINTSKKLTLISSSDHESETLGTAMTTIHF FT GDYSIIHEFHIIEDVESIFSDGLLGKDFIKHRCIVDYVNWMIYFSSDNGLI FT SHPIEDNVNGNYILPKRSEVVRKITIPNLTEDSIILSQEIQPGVFCGNTIV FT SKRNQYIKFINTTDKDVSFEIKSYTPEIEPLRDYEQLQRKPNTARERIEKI FT HNKIRIENIPQIAREELENLITKFSDIFCLEDEPVSTNNFYTQEISLKDNI FT PFYIPNYKQIHSQSEEMQSQVEKMLKSNIIEHSVSSYNSPILLLVPKKSVE FT GKKKWRLVVDFRQLNKKILPDKFPLPRIDTILDQLGRAKYFSTLDLMSGFH FT QIKLDKNSRKYTAFSTPTGHYQFTRMPFGLNISPNSFQRMMAIAMAGLTPE FT LAFVYIDDIIVTGCSARHHISNLGEVFDRLRKYNLKLNAEKCCFFTTEVTY FT LGHKITDKGIYPDDAKFDTIKNFPIPTNADEARRFVAFCNYYRKFVQNFDK FT IAKPINHLIKKDVKFAWTSECQAAFDTLKQSLLSPTILQYPDFKKQFIITT FT DASDMACGAVLSQITDGNDLPVAFASKSFTPGEKNKPIIEKELTAIHWAIN FT YFKPYVYGPKFIVRTDHRPLAYLFGMKNPTSKLTRMRLDLEEFDFEIEYLA FT GKANVAADALSRIILNSDDLKASIPKSKTILMVNTRAMVKKNNVKTDTNKD FT KPIATTGTDHPAMWKTDRPSEVRKVLKICTQRNKNNVEFIIYNHSYGKALG FT KFLLRKDVNGSQALEFALLEMCKIAKQYGRNKLAWSEEDHLFKEYSQQTIK FT EIANRAITKFEIILFTPTRWITTEKDRLRIISDYHMTPSGGHIGQYRLYQK FT IREKYKWKNMKFDIKKYVRNCKACIVNKTTRHTKEDTVVTTTPTKPFNIIS FT IDTVGPLTKTNKNNRYAITIQCDLTKYIVVIPIHNKEANTIARALVENFIL FT TFGTFIELKSDQGLEYNNEILHKISEILKIKQTFSTAYHPQTIGSLERNHR FT CLNEYLRSYTNEHHDDWDDWTKFYEFVYNTTEHTDTNYTPYELVFGRKANL FT PQDIFKTKIEPVYNIEQYYFEMKYKLQKSNEIARENLIKAKNRRQQILNKD FT TVPLIIKIGDQVYLENENRKKLDPVYIGPFTVVRDQGPNCVIQNNTTKKTS FT TVHKNRLIKYTGE" FT CDS 857..2128 FT /product="GYPSY15_AG1p" FT /translation="GCINMQKLFEKIEILDRIYDQVKQLNRCYRLCALTTL FT RDNTKEIYDEIQELLRKHESSIKDEILTKLVKKSRYVYYEINKCIKIHFER FT HPDSLNTTLSENQFDITIETKPDKMADIMELIKITTSLISKYDGNEKDLKG FT VVSNLNVLKKIVKPENKETVIELVLGRLTGKARIVVGETPTSIEDIVSKLQ FT DRCSIKVTPEIVVSKMDNTKQTGTIEDFGSVIEKLTQQLEEAYIAEEIAPE FT VARKKATKSGISALSYGLKDGETKIIMRSSKFETLHEAIEQAVKLELEDRT FT KKGKNDQTKILYSNATRNNRGYGNNYQGRNNFNKFSNNNRYQTQIPPRFPP FT ARYGQNNNRNNNNFRYNNNNTNNNRNQHANRQNYSNRNQYVQSNRNNSNWQ FT 
NNRAPIHNTVTAEEQNNFLGHSQVSENTLY" XX SQ Sequence 5806 BP; 2404 A; 981 C; 1012 G; 1409 T; 0 other; tggcgaccgt gaccttaatc tgcaatcgac gggcagaagt ataaacaaat tatttgtgat 60 acgtacaacg gtaacgtaca gcagtgtcgt gctaacgcat caaatagtgt aaaaaaaaag 120 tgcgtttttg accataaccg gtatacgacg gaaaaaagtg tcaaatgtgc aaatttgtat 180 tgaacaacgc agctggcaca agtgacccat atataaaaaa aaaaacaatc gctatgaatg 240 agcgaccaaa agcaaataca gtgtggtgta caaaatgaca actcatcgaa aatataagtg 300 atgaaccgtt tgaagggcga aaaaaaagtg atgtacagca atacaaaaaa ataaagtgtt 360 gtgacaacct tcctcaacgt gcaaaaatgc aacaacatta taaataaacc cttaagtgat 420 cagtttgcaa actagaacaa tcttgaagtg attaataacg caacgaaccc aaaccaacgt 480 aagcggaaat aacgatgtca cagtgctgtg aaaccataag cagaagtgat gttcatctga 540 aaggtgatag aatgaatggc gagactacgg tatacgagat gtgactagtg tgattgaatg 600 gaacaaccgt tcccaacgtg gaagacgaaa agatactgcg agctcactgc ccaatctgga 660 tcggcagacg agaatggaca agtatcccaa ccccgacaag acatcgagcg acaactgatg 720 acaccgtaga acttcacatc aagcaccaat taaacaccga gcaccggtaa cgagcgttac 780 accatgagca gcaacaacaa caaatacatg ctatgtattt aaacaaggta ccgtaaatat 840 gcaattttat tgatgaggct gcataaatat gcaaaaacta ttcgagaaaa tagaaatatt 900 agatagaatt tatgatcagg tgaaacagct gaacagatgt tataggcttt gcgcgttaac 960 aacattaagg gataatacta aggaaatata cgacgaaata caagaactcc tacgaaagca 1020 cgaatcgtct attaaagatg aaatattaac aaagctagtt aaaaagagta gatacgtata 1080 ctacgaaata aataagtgca taaaaataca cttcgagaga catccagatt cgttaaatac 1140 gacattatca gagaaccaat ttgacataac aatagaaacg aaacctgaca aaatggctga 1200 cattatggaa ttgattaaaa tcaccacttc tctcatatca aagtatgatg gtaatgagaa 1260 ggatttgaaa ggtgtggtgt caaacttaaa tgtattaaag aaaatagtaa agccagaaaa 1320 taaagaaaca gtaatagaac tggtactagg acgtctaaca ggaaaagcac gaattgtagt 1380 aggggaaacc ccaacttcaa ttgaagatat agttagcaaa ttacaagaca ggtgcagcat 1440 aaaggtaaca ccagagatcg tagtatcaaa aatggacaat actaaacaga ctggaacgat 1500 agaagatttt ggaagcgtta ttgaaaaatt aacgcaacaa ctagaagagg catacattgc 1560 agaagagata gcgccagaag tagctagaaa aaaggcaact aaatcaggaa tcagtgcatt 1620 
gagttatgga cttaaagatg gcgagaccaa aattataatg agatcgagta aattcgaaac 1680 cttgcatgaa gcaatagagc aagcagtaaa gttggagcta gaagatagaa cgaaaaaggg 1740 aaagaatgat cagacaaaga tcctatattc aaacgctacc aggaacaata gagggtatgg 1800 aaacaactac cagggaagga acaatttcaa caaattctca aataataata gatatcagac 1860 acaaatccca cccaggttcc cacccgcaag atatggacag aacaacaaca gaaataacaa 1920 taacttcaga tataacaaca ataacactaa caacaacaga aatcagcatg caaatcgaca 1980 aaattattcc aacagaaatc agtacgtaca atcaaataga aataatagca attggcaaaa 2040 taatcgagcg cctattcata acacagtaac agccgaagaa cagaataatt ttttaggcca 2100 ctctcaagta tcagaaaata ccctatacta accataaaca ctgatgcaga taattttgtt 2160 aaagttaaaa tagaaattgc aaaggaaatc tatagcacac tcatcataga tacaggagca 2220 accgtatccg tacttaaagc tagtaaatta aaaccaggtt gcaagatcaa tacatctaaa 2280 aaattaacat tgataagctc tagtgaccat gaatcagaga ctttaggtac tgctatgaca 2340 acaattcact ttggtgatta ttccattata cacgaatttc atataataga agacgtagaa 2400 tccatttttt ctgacggact gttaggaaaa gactttataa agcacagatg tattgttgat 2460 tatgttaatt ggatgatata cttctcatct gataacggat tgatttcaca tccaatagaa 2520 gacaatgtaa atggaaatta tattttacca aaacgaagtg aagtagtacg aaaaataacc 2580 ataccaaact tgacagagga ttcaatcatc ttatcacagg aaatccaacc aggagtattt 2640 tgcggaaaca caatagtttc aaaacgtaat cagtatatca aattcattaa taccacagat 2700 aaagatgttt cttttgaaat aaaatcctat acaccagaaa tcgaaccttt aagagattat 2760 gagcagctac agagaaaacc aaacacagct agggaacgaa ttgagaaaat tcataacaaa 2820 attcgcatag aaaatattcc acaaatagca agagaagaat tagaaaattt gatcacaaaa 2880 ttctcggata tattttgttt agaagatgaa ccggtctcta ctaacaattt ttatacccag 2940 gaaatttcat tgaaagataa cattcctttt tatataccaa attataaaca aatacattca 3000 caaagtgagg aaatgcaatc gcaggtagaa aagatgttaa aaagtaacat tattgaacat 3060 tctgtttcgt catataattc accgatacta ctattagtac caaagaaatc agttgaaggc 3120 aagaagaaat ggcgtttagt tgtggatttt cggcagttaa acaagaaaat tttaccagac 3180 aaattccctt taccccgcat agacacgata ctagatcagt taggaagagc caaatatttc 3240 agcacattgg atttgatgtc agggtttcat caaatcaagc ttgataaaaa ttccagaaaa 3300 tatacagctt 
tttccacgcc tacaggccac tatcagttta caagaatgcc atttggactc 3360 aacattagcc caaacagttt tcaaagaatg atggctatcg ctatggctgg tttaacacca 3420 gagctagcat ttgtatatat agatgatatt atagttactg gatgcagtgc acggcatcat 3480 atcagtaatt taggtgaagt ttttgatagg ctaagaaagt ataaccttaa actaaatgca 3540 gagaaatgtt gtttctttac aacagaagta acgtatttag gtcataaaat aacagataaa 3600 ggaatctatc cggacgacgc gaagtttgat acgattaaaa acttcccgat tcctactaat 3660 gctgatgaag caagacgttt tgtcgcattt tgtaattatt atcgtaaatt tgtacagaat 3720 tttgataaga tagctaaacc aattaatcat ttgattaaga aagacgttaa gtttgcatgg 3780 acttcagaat gtcaagcagc tttcgataca ttgaaacaaa gcttactctc acccacaatt 3840 ttacaatatc cagattttaa aaagcaattc ataattacga cagacgcatc ggatatggca 3900 tgtggtgcag tgttatcaca aataacagat ggaaacgatt taccagtcgc gtttgcgagt 3960 aaaagtttta caccaggaga gaagaataag ccaataatcg agaaagagct tacagctata 4020 cattgggcaa ttaattattt taaaccttat gtatatggtc caaaatttat agttagaaca 4080 gatcatagac cattagcata cttatttggt atgaaaaatc ctacttctaa acttactaga 4140 atgagactag atttagaaga atttgacttt gaaatagaat atttagcagg taaagctaat 4200 gttgcggcag acgcactatc aagaataatc cttaactcgg atgacctaaa ggcatcaata 4260 ccaaaatcca aaacgatttt aatggttaat acgagagcca tggttaagaa aaataacgtg 4320 aaaactgata caaacaaaga taaaccaatc gcaacaacag ggactgatca ccccgcgatg 4380 tggaaaacag atagaccttc agaagtgaga aaggtattga aaatatgtac gcagagaaat 4440 aagaacaacg ttgaattcat aatatacaac cattcatatg gtaaagcact aggaaaattt 4500 cttttgagaa aagatgtaaa tggaagtcaa gcattagagt ttgctcttct agaaatgtgc 4560 aaaatcgcga aacaatatgg aagaaacaag ctagcatggt cagaagaaga ccacttattc 4620 aaagaatatt cccaacaaac tattaaggaa atcgccaaca gagctattac caagtttgaa 4680 ataatcctgt ttactccaac tagatggata acaacagaga aagataggct gagaataatt 4740 tcagattatc atatgacccc ttcgggagga catataggcc agtacagact gtaccagaaa 4800 ataagggaaa aatacaaatg gaaaaatatg aaatttgata tcaagaaata cgtacgaaat 4860 tgtaaggcat gcatagttaa taagacgact agacatacta aagaagacac agttgtaact 4920 acaacaccga caaaaccttt taatataatt tcaatcgaca cagtaggacc tctaacaaaa 4980 actaacaaaa acaacaggta 
tgcaataacc atacaatgtg acttaacgaa atacatcgta 5040 gtaataccta tccataacaa agaagcaaat actatagcaa gagcattggt agaaaacttt 5100 attcttacat ttggaacatt tatagaatta aaatcagatc aagggctaga atataacaat 5160 gaaatattac acaaaatctc agaaatctta aaaattaaac agacttttag cacagcttac 5220 cacccacaga caataggatc attagaaaga aatcatagat gtctaaacga atacctaaga 5280 agttatacaa acgaacatca tgatgactgg gatgattgga caaaatttta cgaatttgtt 5340 tacaatacaa cagaacacac tgacacaaac tacacaccat acgaactggt attcggaaga 5400 aaagcgaatt taccacaaga tatattcaaa acaaaaatag aaccagttta taatattgaa 5460 caatattatt tcgaaatgaa atataaactc caaaaatcaa acgaaatagc tagagaaaat 5520 ttgataaaag caaagaatag aagacagcaa atcttaaata aagatacagt accactcatt 5580 ataaaaatag gagatcaggt atatttggaa aatgaaaaca gaaaaaaatt agatccagtc 5640 tacattggac ctttcacagt agtaagggac caagggccta attgcgtaat acaaaacaat 5700 acaacaaaga aaacctctac agtacacaaa aacagactaa ttaagtacac aggagaataa 5760 cttcaatcat tgtaattcat tacgttattc tattaaaggg gggagg 5806 // ID BEL11-I_AG repbase; DNA; ANG; 5296 BP. XX AC . XX DT 03-APR-2003 (Rel. 8.03, Created) DT 03-APR-2003 (Rel. 8.03, Last updated, Version 1) XX DE BEL11-I_AG is an internal portion of the BEL11_AG LTR DE retrotransposon - a consensus sequence. XX KW BEL; LTR Retrotransposon; Transposable Element; 5-bp TSD; KW BEL11-I_AG; BEL11-LTR_AG; BEL11_AG; Bel clade; RING Zn-finger; KW integrase; peptidase; reverse transcriptase. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-5296 RA Kapitonov V.V., Pavlicek A., Drazkiewicz A. and Jurka J.; RT "BEL11_AG, a family of Bel/Pao-like LTR retrotransposons from RT African malaria mosquito."; RL Repbase Reports 3(3), 29-29 (2003). XX DR [1] (Consensus) XX CC BEL11_AG is a young family of Bel/Pao-like LTR retrotransposons. CC BEL11-I_AG, an internal portion of BEL1_AG is flanked by CC BEL11-LTR_AG CC LTRs. 
The BEL11-I_AG consensus sequence was reconstructed based CC on CC multiple alignment of 10 copies; they are less than 1% divergent CC from CC the consensus sequence. CC The consensus sequence encodes a 1710-aa BEL11_AGp Bel-like CC protein CC (pos. 137-5266). CC BEL11_AGp is composed of the peptidase A16 (pos. 128-300), RING CC Zn-finger (pos. 317-388), reverse transcriptase (pos. 765-888) CC and CC integrase (pos. 1418-1577) domains. XX FH Key Location/Qualifiers FT CDS 137..5266 FT /product="BEL11_AGp" FT /translation="MSKQDKLRYKELKRHQYIDSINRVKEFLKTFTSEQQN FT QVSTRLDRLEKIWESFETVQEDIEDLEISEEGVATNARIRAEMEETYLYAK FT AQLRSMLPIPAAAEVVSVAANAASSSSSRVKLPLIALPEFAGNFDAWLTFH FT DTYVSLIHSSTDITAIEKFHYLRASLKEEAANLIQSISVTSENYDLAWSTI FT VKRYSNPIILRKKHIRSLISLPKMKETGAVALNRLVDDFRRHVKILEQLKE FT PVKSFSSILIELMADKLDDETLRVWEEAHADEDPTFTDMMAFLEKRIRVLE FT TLAIEKCGAVPKKPIKTKVSLHAATTHTNNVPVCVMCKKNGHSIASCNVFK FT GTNTQERMRVVSEKRLCRNCLKAGHLAHACASKYNCQQCSQRHHTLLHAHE FT ENSSVLVGETSSSSTMALASSKKSAVNAILSTVVLVVVDAYGKEHLARALL FT DNGSQPNAISEHLCQLLRLPRKPASVSIAGVDSTTTNAKHIVCTEVRSRIY FT HYRQAMNFLVLKKVTQNIPSTSFSTAAVGVPSNYVLADPDFGTARRVDMII FT GAAYFYSLLRGGQVHLPNQRNVLIDTVFGWLVAGDTPTFHESQSQTTISCH FT MMEATDKLQEQLERFWKVEELAITSLSPVEQQCEQYFKQTTNRDHTGRYVV FT RMPKHHDYAQMLGDSKAAAQKRFRLLEQRLAKDKHLKQQYDDFMREYVTLG FT HMFPVPVEEDSMAAVHYLPHHPVVKESSTTTKVRVVFDGSAKTTTGHSLND FT VLHVGPVVQDELLSLVVRFRKYKVAVIADIEKMYRQVSMHPDDRRLQRIFW FT RFQETEVVQTFELATVTYGLAPSSFLATRTLLQLAEDEGAPYPLATEAVKK FT NLYVDDLISGAESIEQAIQLRDELTSLMSKGGFRFRKWCSNELSVLDGLTP FT DLLGTTASHEFEATANVKTLGICWEPPNDVFRFTIAIPDVRPYTKRTVLST FT IAQLYDPLGLLSPIIVQAKILLQELWANKLGWDDELPRQLCDKWEEFCEQL FT PMLARFKIPRFALTPNYNYVELHCFADASEAAYGACAYLRSQSIDGTTQVT FT LLASKSRVAPLKPLTIPRLELCAALLAARLQQKLISAIDMAVNETHMWSDS FT TITLQWLAAPPRTWKTFIANRVGEIQAATNGCIWHHVPGIENPADMLSRGV FT SAELLLESNMWMHGPDWLMNDSSCWPSKSYGQQHFTDDELERKGNVVLTAQ FT VVEPDPLLLRYSSFRTLVHVTAYCMRFCHIARGKEQRETSNLSVDEIQNAK FT IVLVKMVQRQVFPDELRQLRKKQKLAGGSPLKLLHPFIDKDGVIRVGGRLG FT HADLPFCVKHPIVIPGYHPFTQLLLRQQHEKVMHGGITSTLSAIREEFWPL FT 
NGRRAVRSTIRACYRCNRANPVPIQQPMGQLPLSRVTANEAFVCTGVDYCG FT PIMLKPVHRKAAPQKAYLCIFVCMSTKAVHLELVGDLSTSGFLKALDRFIF FT RRNKPNHIYSDNGTNFVGAKNALHQVYQMLHDEAQNRQINNYLAEEGIEWH FT LIPPRAPNFGGLWEAAVKVAKKLLVRQLGVSLLSYEDLATVLIKIEGCMNS FT RPLTPLSNDPNDLSALTPSHFLIKGMMRPPPETDIRDVPTNRLDQYQRLQK FT YAQHFWQRWRTEYLHELAQQQRRNPPEQQVSIGDIVIIKDEQLPPARWPLA FT RIVEVHPGQDGIVRVVTLKTASGVLKRPSSKICLLECSREF" XX SQ Sequence 5296 BP; 1460 A; 1239 C; 1299 G; 1298 T; 0 other; ttttggtgcg tagtgaccag gattttgata aatccttcgg tttttgtgaa aaccggcgta 60 cagaagcgtt tgttcataaa gcagtgctgc ggcactcgat ctgtattagt gtttgtgtgt 120 gtgagccttt tgcaaaatgt cgaagcagga caagttgcgg tacaaggagc tcaagcggca 180 ccaatacatt gactccatca atcgtgtgaa agagttctta aaaacgttca caagtgaaca 240 acaaaatcaa gtgtcgacgc gactcgatcg tttggaaaag atttgggaat cttttgaaac 300 agtgcaagaa gacatcgaag atttggaaat ctccgaggaa ggcgttgcaa ctaatgcgcg 360 tattagagca gaaatggagg aaacgtattt gtacgcaaag gcgcaattgc gtagtatgtt 420 gcccattccg gcagccgctg aagttgtgtc tgttgctgca aatgctgctt catcttcttc 480 ttcaagagtg aagctcccat tgatcgcact gccagagttc gcaggaaatt tcgatgcgtg 540 gttaacgttc cacgacacat acgtctcact catacactca tcgacggaca taacggcaat 600 cgaaaaattc cactaccttc gagcttcact caaagaagaa gctgcgaatt taatacaatc 660 catttcggtt acgagtgaga attatgattt ggcatggagt acgatcgtca aacgttattc 720 caacccaatc attttgcgta agaagcatat tcgatcgctc atatcgcttc ctaagatgaa 780 agaaacggga gcagtggcgc tcaaccgttt ggttgacgat tttcgacggc atgtcaaaat 840 cctagaacaa ttgaaagaac ccgtgaaatc gttcagttca attctcatcg agttgatggc 900 ggataagctg gatgatgaaa cactccgtgt gtgggaagaa gcccatgccg atgaagatcc 960 tacatttacg gacatgatgg cgtttttgga aaaacgtata agagtgttgg aaacactggc 1020 aatagagaag tgtggtgcag ttcccaaaaa accaataaag acaaaagtat cgttgcatgc 1080 agctacaact cataccaaca acgtaccagt gtgtgtgatg tgcaaaaaga acgggcacag 1140 tatagcgtcg tgcaatgtgt tcaaaggcac taatacacaa gaacgcatga gagtggtgag 1200 tgagaaaagg ctgtgcagaa attgcttgaa agcaggacat ttggcccatg cgtgtgcgtc 1260 caaatacaat tgccagcagt gttctcagcg tcaccacaca ctacttcatg ctcacgaaga 1320 aaacagcagt 
gtattagtgg gtgagacttc tagctcttca acaatggcgt tggcatcgtc 1380 gaagaaatcc gccgttaacg ctatactctc tacagtggta ttggttgttg tcgatgcata 1440 cggcaaagaa cacttagcgc gagcattgct ggacaacgga tcgcagccga acgcgatcag 1500 tgaacatctt tgtcagcttt tacgactacc acgaaagccc gctagcgttt caattgctgg 1560 tgtcgacagc actaccacca atgcaaagca catagtatgt acagaagtgc gatctcggat 1620 ttaccactac cgacaagcaa tgaatttcct tgtgttgaag aaagtaacgc agaacattcc 1680 ttcaacgtcg ttttctactg ctgccgtcgg cgttccttcg aactacgttc tggccgatcc 1740 agatttcggg accgcgcggc gcgtggatat gatcatcggt gcagcatatt tctattcgtt 1800 gctgcgtggt ggacaagtgc atttgccaaa ccagcgaaac gttctcatcg acacggtgtt 1860 tggctggctc gtagcaggag atacaccgac ctttcatgaa tcgcaatcgc aaacaacaat 1920 tagttgccac atgatggagg caaccgacaa actacaagaa cagctggagc gattttggaa 1980 ggtcgaagag cttgctataa catcattgtc tcctgttgaa caacagtgcg agcagtactt 2040 caagcagacg acgaatcgag atcacaccgg cagatacgtc gttcgcatgc cgaaacacca 2100 cgactacgct cagatgcttg gcgattcgaa ggctgcagcc cagaagcgct ttcggttgtt 2160 ggagcagagg ctggctaaag acaagcatct gaagcagcag tacgatgact tcatgcgaga 2220 atacgtgacg ctgggtcaca tgtttcctgt gccggttgaa gaggacagca tggctgcggt 2280 tcactacttg ccgcatcatc cggtggtgaa agagtccagc acgacgacca aggtgcgtgt 2340 ggttttcgac ggctcggcga agacaaccac ggggcattct ctaaacgatg tcttgcatgt 2400 aggaccagtc gtgcaagatg agctgctgtc tctcgtcgtg cgattccgca agtataaggt 2460 ggcggtgatc gccgacatcg agaaaatgta tcgccaggtg agtatgcatc ccgatgaccg 2520 acgtttacaa cgtatttttt ggcgctttca ggaaacagaa gttgtgcaaa cttttgagtt 2580 ggcaacggtg acgtatggtc tggctccatc gtcattccta gcaacacgta cgctacttca 2640 actagctgag gatgaaggcg ctccttaccc tttggcaact gaagccgtaa agaagaactt 2700 gtacgtggac gatctgatct ccggcgcaga aagcattgag caagcaattc aacttcgtga 2760 cgaactgacc agtctcatga gtaagggagg tttcaggttc cgaaaatggt gctcaaacga 2820 gttgagtgtg cttgatgggt tgacacctga tctgcttgga acaacagcat cccatgaatt 2880 cgaagcaacc gcaaatgtca agacgcttgg catatgttgg gaaccaccaa acgatgtatt 2940 ccgcttcacg attgctatcc ctgatgtacg accctacacg aaacgtacag tgctatctac 3000 gattgcccaa ctgtacgatc 
cgcttggctt gctatcgcct atcatcgtgc aagcaaaaat 3060 cctcttacag gaactttggg caaacaaact cggttgggat gacgaattgc cgcggcaatt 3120 gtgtgacaaa tgggaagagt tttgcgaaca gctccccatg ctagctcgtt tcaagatccc 3180 gagatttgct ttgacaccca actataacta tgtagagctg cattgttttg cagacgcatc 3240 agaagcagct tatggtgcgt gtgcctacct gagatcgcaa agcatcgacg gcacaaccca 3300 agtaacgctg ctagcttcta aatcgagagt ggctcctctc aaaccactta ccatccctag 3360 actggaacta tgcgcagcct tgctagctgc cagattacag cagaaactga tatcagccat 3420 tgacatggca gtaaacgaaa cacatatgtg gtccgattca accatcacgc tgcaatggct 3480 tgcagcacca cctagaacgt ggaaaacttt catcgcaaac cgagtaggag agatacaagc 3540 tgctaccaat ggatgcattt ggcatcatgt gccagggatc gagaaccctg ccgacatgct 3600 atccagaggt gtttctgcgg aattgctttt ggaaagcaac atgtggatgc atggaccaga 3660 ttggctgatg aacgatagct cgtgctggcc cagcaaatcg tatggacaac agcacttcac 3720 tgatgatgag ctggaaagaa agggtaacgt tgtgttaact gcccaagtag tcgagcccga 3780 cccattgctc ctacgatact cctcattcag aacgttggtt catgtaactg catattgcat 3840 gcgattttgc cacattgcgc gtggtaaaga acaacgcgaa acgagcaatc tctctgtgga 3900 tgagattcaa aatgctaaaa tcgttttagt aaagatggta cagcgacaag tatttcccga 3960 tgaactacga caactgcgta agaaacaaaa gcttgctggt ggatccccac tcaagctact 4020 ccatccattc attgacaagg atggtgtcat acgtgttggt ggcagacttg gacatgccga 4080 tttgccattc tgtgtgaagc atccgatcgt cattcctggg tatcatccat ttacccaatt 4140 gctgttgagg cagcaacatg agaaggtgat gcatggtggc atcacatcaa cactttcagc 4200 cattcgcgag gagttttggc cattgaatgg caggagagcg gttcgatcta ccatccgagc 4260 atgttatcgc tgcaaccgag ccaatcctgt tccaattcag caaccgatgg gacagctacc 4320 gctttctcga gtcactgcaa acgaagcatt tgtctgtaca ggtgtggatt actgtgggcc 4380 gataatgctg aagcctgttc atcgcaaagc agctcctcaa aaggcgtacc tatgcatttt 4440 tgtatgcatg agcaccaaag cagtccattt ggagcttgtg ggtgacctaa gtacatcagg 4500 gttcctgaag gctttagacc gtttcatctt ccgacgaaac aagccgaacc atatctattc 4560 ggataatggt acaaatttcg tcggcgcaaa gaacgcactt caccaagtct accagatgct 4620 gcatgacgaa gctcaaaacc gtcaaatcaa taactatcta gcagaagaag gaattgaatg 4680 gcaccttatt ccacctcgtg caccaaactt 
cggtgggctt tgggaagccg ccgtgaaggt 4740 ggccaagaag cttttggtca ggcagttagg tgtctcgcta ctatcttatg aggatctggc 4800 aacagtgctg atcaaaatcg aaggctgcat gaattctcgt ccgttgacgc cgctttcgaa 4860 tgaccctaac gatttgtcag ctttaacacc gagtcatttt ctcatcaagg gaatgatgcg 4920 tccacctcca gaaactgaca tacgggatgt cccgaccaat cgactcgacc agtatcagcg 4980 gttgcagaag tacgctcaac atttctggca gcgctggcgt acagagtacc ttcatgagct 5040 tgctcagcaa cagcgacgta atccaccaga acaacaagtc tctatcggag acatcgtcat 5100 tatcaaggat gaacagctcc cacccgctcg ttggcccttg gctcggatcg tggaagtaca 5160 ccctgggcag gatgggattg tgcgtgttgt taccttaaaa actgcctctg gggtattgaa 5220 gagaccttcg tctaagatat gtttgttaga atgttcacga gaattttgaa aacttagttg 5280 ttcaaggggg ccggta 5296 // ID AgaP4MITE559 repbase; DNA; ANG; 559 BP. XX AC DQ301483; XX DT 22-AUG-2006 (Rel. 13.07, Created) DT 31-JUL-2008 (Rel. 13.07, Last updated, Version 1) XX DE Anopheles gambiae str. PEST clone AgaP4MITE559 P MITE, complete DE sequence. XX KW P; DNA transposon; Transposable Element; Nonautonomous; KW AgaP4MITE559. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-559 RA Quesneville H., Nouaud D. and Anxolabehere D.; RT "P elements and MITE relatives in the whole genome sequence of RT Anopheles gambiae."; RL BMC Genomics 7(1), 214-214 (2006). XX RN [2] RP 1-559 RA Quesneville H., Nouaud D. and Anxolabehere D.; RT "Direct Submission."; RL Direct Submission to Genbank (30-NOV-2004)Dynamique du Genome et RL Evolution, Institut Jacques Monod - CNRS - Universites Paris 6 RL Paris 7, 2 place Jussieu, Paris 75252, France. XX DR EMBL/GenBank/DDBJ; DQ301483; Positions 1 559. 
XX SQ Sequence 559 BP; 190 A; 82 C; 112 G; 175 T; 0 other; caaagtctat ataatacaga ggtcgagtcg tccagaacat ttgatcaaaa agcgtgggaa 60 agttttgaca gctggatgaa aaagcttcaa tagtttcatc ttagcataca ttgaggtttt 120 agttgttgta catgcagtgg tctgaaagca tttttggcca tttttacatc gtttgtttgt 180 aatattgtac tttaaaaagt tgacgaatat taagaaaata tagaattata gtgaaattat 240 caattacaac aagatcagtg cccgaaatta cgacgaaatc ttgtgcagct caagagggtc 300 aaagctaaga ggccagattt ctgattgtgg tcatttttaa gtgatgaatt actgaaaaaa 360 ctactccatg ccacgttcat gagaaagaaa tgtatagtta aactgagttt tgagcaaaaa 420 atgcacaatt agccctaaaa agtgtatggt tgtttggaat cgttagtgaa cgctttttca 480 tctaactgtc aaaaataagc tttcaaaatg gtggaccaaa atcgagctat gcatcgacct 540 ctgtattata tggactttg 559 // ID GYPSY64-I_AG repbase; DNA; ANG; 4359 BP. XX AC . XX DT 09-JUL-2004 (Rel. 9.06, Created) DT 27-JUL-2005 (Rel. 10.08, Last updated, Version 2) XX DE GYPSY64-I_AG is an internal portion of retrotransposon GYPSY64_AG DE - a consensus sequence. XX KW Gypsy; LTR Retrotransposon; Transposable Element; 5-bp TSD gag; KW AP protease; GYPSY64-I_AG; GYPSY64-LTR_AG; Gypsy clade; RNase-H; KW integrase GYPSY64_AG; mag lineage; reverse transcriptase. XX NM GYPSY64-I_AG. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-4359 RA Tubio M.J., Costas C.J. and Naveira F.H.; RT "GYPSY64_AG, a member of the mag lineage of the Ty3/gypsy group RT of LTR retrotransposons in Anopheles gambiae."; RL Repbase Reports 4(6), 167-167 (2004). XX DR [1] (Consensus) XX CC GYPSY64_AG is a family of Gypsy-like LTR retrotransposons that, CC according to the amino acid sequence of its reverse CC transcriptase, RNase and integrase domains is phylogenetically CC grouped with representatives of the mag lineage of other CC organisms. 
GYPSY19_AG, GYPSY20_AG, GYPSY21_AG, GYPSY22_AG, CC GYPSY23_AG, GYPSY24_AG, GYPSY25_AG, GYPSY26_AG, GYPSY27_AG, CC GYPSY28_AG, GYPSY55_AG, GYPSY56_AG, GYPSY57_AG, GYPSY58_AG, CC GYPSY59_AG, GYPSY60_AG, GYPSY61_AG, GYPSY62_AG, GYPSY63_AG, CC GYPSY65_AG, GYPSY66_AG, GYPSY67_AG, GYPSY68_AG and GYPSY69_AG, CC are other members of this same lineage in Anopheles gambiae. CC The GYPSY64-I_AG consensus was reconstructed after multiple CC alignment of 4 copies. The consensus encodes the 1424-aa CC GYPSY64_AGP gag-pol like polyprotein (pos. 74-4345). The CC sequence of the LTRs flanking GYPSY64-I is deposited as CC GYPSY64-LTR_AG. XX FH Key Location/Qualifiers FT CDS 74..4345 FT /product="GYPSY64_AGP" FT /note="gag-pol" FT /translation="MEGAQEKDSAAGGGESSKNVSEAILKILANQQSLMSS FT MAQQLQLTNQSIQKFTHVEVVLDSLSSNMSEFVYDKENGYTFDAWYSRYSE FT LFDRDACNLDNAGKVRLLLRKLSPQDHERYNSFILPKLAREFTFEQTVKKL FT KSLFGATISTFRRRYNCLQMTKDDVDDYLSYSCKVNKSCVDFKLSELTEEQ FT FKCLIYVCGLKSSSDAEIRMRLINKLNEAQDITLQQIVEQCNSLVNLKQDT FT VLVEQPSSVQYVANKGSSQQQRHPSGGNKQQDHPRTPCWSCGAMHFHKDCP FT SRNHKCKDCGKIGHSEGYCACFTSMATTSAPAQKAPWKKQYKRKQHQGQTS FT KIVTVNHVTQRRKFVSVHLNNIPHRLQIDTGSDITIISHQAWKRIGSPAVK FT PATCNARTASGDPLQLAAELECSITINNVTKQGKCFVTDPNVNLNVLGIDM FT MDLFGLWNEPITAFCNQICTTKTTNIAELRSRYPDVFNDKMGLYNKTAVQL FT TLKGTPTPVFRAKRPVAYMMEAVVEDELHRLESLGIIKKVDFSDWAAPIVV FT VRKPNGTVRICADFSTGLNNVLESNNYPLPLPEDIFVKMANCVIFSHIDLS FT DAYLQVPVDEASQPFLTINTHKGLFQFTRLSPGIKSAPGAFQKLMDTMLAG FT LNSTTGYLDDILVGGRNEDEHQQNLHLVLNRLRDYGFTVRIEKCNFNMRQV FT KYLGQILDAQGIRPDPDKIAPIVSMPPPHDIPTLRSYLGAINYYGKYVQEM FT RTLRQPMDQLLKAGMKFHWSTACQRSFDRFREILQSPLLLTHYNPKMEIIV FT SADASNVGLGARIAHKFPDGSIKAIYHVSRSLTSAESNYSQIEKEALALIF FT AVTRFHRMIYGRRFILETDHKPLLAIFGAKKGIPTYTANRLQRWALTLLLY FT DFSINYISTDSFGHADVLSRLINRHVRPDEEMVIANLTFEKSIRSILNESL FT QAVPLSFKTIQNTTKNDDTLQQIIKFIKEGWPPKTIINDPKILQFYQRRDG FT LSVVADCIMYGERLVVPPSSRESVLKQLHKGHPGIERMRSIARQYVYWPNV FT DEDVAHIVKSCIECSSVAKTDRKTTLESWPVPEKAWQRLHLDYAGPVNGYY FT YLILVDAYSKWPEVMRTKDITTTATLRMLRNIFARHGQPETLVTDNGTQFT FT 
SNMFETFCEHYSIVHLKTAPFHPQSNGLAERFVDTFKRALKKITAGGETLE FT EAIDTFLLCYRSTPCRSSPEGKSPAEHIYKRPIRTALELLRPPSSSHKVHD FT NKQEKQFNLKHGAKKRHYSPQDLVWAKVYHNNKWSWAHGQVIEQIGSVLYN FT VWLSSTRKLIRSHCNQLRSRHEAEVSQQEQLATTDVQIPLAILLDNCGLND FT EVERETTTSTTLPSEMLADLAPPRQRSRVGSTRNNNQPPVPTRQSSRQRVP FT PTRYDAYHLY" XX SQ Sequence 4359 BP; 1371 A; 1013 C; 942 G; 1033 T; 0 other; gtggcgacga ggcggtagaa gtttgcaaaa aaaccagcga aaattttccc gggaccgtgt 60 gttatcatcg acaatggaag gtgctcaaga aaaagattct gcagcaggcg gaggagagtc 120 atcaaagaat gtgtcggaag cgatactgaa aattctcgcc aaccagcaga gcctcatgtc 180 ttctatggca caacagcttc aattgacgaa tcaatccata caaaagttca cacatgtgga 240 ggttgtactt gactcattat caagtaatat gtcagagttt gtctacgaca aagaaaacgg 300 atacactttc gacgcctggt attcccgtta cagcgaactt ttcgatcggg atgcttgcaa 360 ccttgacaac gcgggaaaag tgaggttact tttacgcaaa ctgagcccac aagatcacga 420 gcggtacaat agtttcatat tgccaaaact agctcgcgaa ttcacattcg aacaaacagt 480 gaaaaagcta aaatccctgt ttggcgctac catctctacg tttcgacgca gatacaattg 540 tcttcaaatg acaaaagatg acgtagatga ttatctttca tattcctgta aagtgaacaa 600 atcctgtgtt gattttaaac tttccgagtt aactgaggaa cagtttaaat gcttgatcta 660 cgtatgtgga cttaagtcaa gcagcgatgc agagattcgt atgaggctga tcaacaaact 720 gaacgaagca caggacatca cgctccaaca aatcgtcgaa cagtgcaaca gtctcgttaa 780 cctcaaacag gacactgtgc ttgtagagca accatcgtca gtgcagtacg ttgctaacaa 840 aggttcatca cagcaacaac gtcatcccag cggaggaaac aaacagcagg atcatcctcg 900 tactccttgc tggtcttgcg gtgcaatgca cttccacaaa gattgtccga gtcgaaacca 960 caaatgcaaa gattgtggta aaattggaca ttccgaggga tactgcgcct gtttcacatc 1020 aatggcgacc accagtgctc cagcgcagaa ggcaccgtgg aagaagcagt acaagcggaa 1080 gcaacatcag ggacagacat cgaaaatagt gacagtcaac catgtcacgc aaagaaggaa 1140 gttcgtctcc gttcatctca acaacattcc tcatcgactg caaattgaca cgggatcgga 1200 catcaccatc atctcacatc aggcatggaa gcgtatcggt tctccggcag tcaaaccagc 1260 cacttgcaat gctaggacag cgtcgggcga tccgttgcaa ttagcggcgg agctggagtg 1320 cagcatcacc atcaataacg ttacgaaaca gggtaagtgt tttgtaactg atccgaatgt 1380 caatcttaac gttttaggga ttgatatgat 
ggaccttttt ggactgtgga acgagccaat 1440 cacagcgttc tgcaaccaga tctgcaccac gaagacgaca aacatagcag aactacggtc 1500 tcgttatcca gacgtcttca acgacaaaat ggggttgtac aacaagacag cagtacaact 1560 tacgttgaag ggcacgccta caccagtatt tcgtgcaaag agacccgttg cgtacatgat 1620 ggaagctgtt gttgaagatg agctgcatcg tctggaaagt cttggcatca tcaaaaaagt 1680 ggacttttct gactgggcgg cacccatcgt cgtggtacga aaaccgaacg gcaccgttcg 1740 tatttgtgcg gatttctcga cggggttgaa caacgtgctg gagtcgaata attatccttt 1800 accactgcca gaggatatct tcgtgaaaat ggctaactgc gtcattttca gccatattga 1860 tttgtcggac gcctacctac aagtacccgt agacgaagca agccaaccat tcctaaccat 1920 caacacccac aagggactgt ttcaattcac acgattgtca cccggcatca aatcagcgcc 1980 aggggcattt caaaagttga tggatacgat gctcgctggg ctcaatagca ccacagggta 2040 cctggacgac atattagtag gtggacgaaa cgaagatgag catcagcaaa acttacatct 2100 cgtgctaaac cgtttgcgag attacggatt taccgtacgc attgaaaaat gtaatttcaa 2160 tatgcgccaa gtcaaatatc tgggacaaat ccttgatgca caaggaatcc ggccagatcc 2220 agataaaata gcaccaattg tgagcatgcc accgccgcac gacattccaa cgctgcgatc 2280 ataccttgga gccataaatt attatggcaa atatgtccaa gaaatgcgca cactccgtca 2340 gcccatggat caacttttga aggcaggtat gaaatttcat tggtccacag catgccaaag 2400 atcattcgat cgttttcgag aaattttaca atctccatta ctgctaacgc attacaatcc 2460 aaaaatggag ataatagtat ctgcagacgc ttcaaacgta ggattgggtg ctcgcattgc 2520 tcacaagttt cctgatggat caataaaagc catttaccat gtgtcgcgta gcttaacatc 2580 agctgaaagt aactatagcc aaattgaaaa agaggcgctg gctttgatat ttgcggttac 2640 acgctttcac agaatgattt atgggcgtcg attcattctg gaaacagatc acaaaccttt 2700 attggctatt tttggtgcaa agaagggtat accaacgtat acagcaaatc gtttacaacg 2760 atgggcatta actcttctgc tctacgattt ttcgattaac tatatctcta cagatagttt 2820 tggtcacgcc gacgtattat cacgtcttat caatcgacat gtgcggccag atgaagagat 2880 ggtcatagct aatctcactt tcgaaaaaag tattcggagc atcttgaacg aatcactgca 2940 agcagtccca ttgtcgttca agacaattca aaatacaacc aaaaatgatg ataccttaca 3000 acaaatcatc aagttcatta aggaaggttg gccacctaaa accatcataa acgatcctaa 3060 aatcctacaa ttttatcaac gacgagatgg actgtctgtc 
gtagcagatt gtattatgta 3120 tggagagaga ttggtcgtgc ctcccagttc cagggaaagt gtcctcaagc agctacacaa 3180 gggacaccct ggtatcgaac gcatgcgctc gatcgcacgg cagtatgtgt actggccaaa 3240 cgtcgatgaa gacgttgcac atatcgtaaa atcgtgtatc gaatgttcta gtgttgcgaa 3300 aacagataga aaaacaactc ttgaatcctg gccagttccg gaaaaagcat ggcaaaggct 3360 acatctcgat tatgcagggc ctgttaacgg ttactactat ctgattctgg ttgacgcata 3420 ctctaagtgg ccagaagtga tgcgcactaa agacatcacc acaaccgcaa cattgcgcat 3480 gctccgcaat attttcgcaa gacacggaca acctgaaaca ttggtcactg ataatggtac 3540 acaatttacc agtaatatgt tcgaaacatt ttgtgagcac tatagtattg tgcatttgaa 3600 gactgctcca tttcatccgc agtcaaacgg actagcagaa agattcgtcg acacattcaa 3660 gagagccctt aaaaaaatta cagcaggggg ggaaacgtta gaagaagcaa tcgacacctt 3720 tctgctgtgc tatcgttcaa caccatgtcg tagttcaccg gaaggaaaat caccagctga 3780 gcacatttat aaaagaccaa tacggacagc tctcgaatta ttacgtccac cttcttcatc 3840 gcacaaagtc catgataaca aacaagaaaa gcagttcaat ctcaaacacg gagcaaagaa 3900 gcggcattac tcccctcagg acttagtgtg ggctaaagtg taccataaca acaaatggag 3960 ttgggctcat gggcaagtta ttgagcaaat tggtagtgtg ctgtacaatg tatggttgtc 4020 atcaacgagg aagctcattc gatcgcattg taatcagcta cgaagtcgac atgaagcaga 4080 agtttcccag caagagcaac tagcaactac agacgttcag ataccgctgg cgatactcct 4140 cgataattgt ggtcttaacg atgaagtaga aagagaaacc acaacatcca caacactccc 4200 atcagaaatg ttggcagacc tggcaccacc acgtcaacga agccgtgttg ggtcaacacg 4260 taacaacaat caaccaccgg tacctacacg tcaatcatcc agacaacgcg taccaccaac 4320 cagatacgac gcgtatcatc tttactaaaa aaagggagg 4359 // ID BEL18-I_AG repbase; DNA; ANG; 6242 BP. XX AC . XX DT 08-MAY-2003 (Rel. 8.04, Created) DT 08-MAY-2003 (Rel. 8.04, Last updated, Version 1) XX DE BEL18-I_AG is an internal portion of the BEL18_AG LTR DE retrotransposon - a consensus sequence. XX KW BEL; LTR Retrotransposon; Transposable Element; 5-bp TSD; KW BEL18-I_AG; BEL18-LTR_AG; BEL18_AG; Bel clade; integrase; KW peptidase; reverse transcriptase. 
XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-6242 RA Kapitonov V.V. and Jurka J.; RT "BEL18_AG, a family of Bel/Pao-like LTR retrotransposons from RT African malaria mosquito."; RL Repbase Reports 3(4), 67-67 (2003). XX DR [1] (Consensus) XX CC BEL18_AG is a young family of Bel/Pao-like LTR retrotransposons. CC BEL18-I_AG, an internal portion of BEL18_AG is flanked by CC BEL18-LTR_AG CC LTRs. The BEL18-I_AG consensus sequence was reconstructed based CC on CC multiple alignment of 4 copies; they are ~1% divergent from CC the consensus sequence. CC The consensus sequence encodes one 1985-aa BEL18_AGp Bel-like CC protein CC (pos. 183-6138). CC BEL18_AGp is composed of the peptidase A16 (pos. 360-450), CC reverse transcriptase (pos. 970-1149) and CC integrase (pos. 1680-1830) domains. CC BEL18_AGp: CC MEHEHExKAPPAVAQTNESQQPETGALDHRPLAPAADSLHFRSFAPTGQDWTVNSSLEIVDRARSSTPNA CC PLGAGLLLRxPQQRLTSEVREAPASRYTPLANSTVADLGxLHRSSSETVPRLAHLPTTADVQQPAPSQGG CC VSAGTVPRAADGASGAAIEDMELxPQAQDGASAGTVLRTAVGASGAAISNVHQPTPSQTGTPGRTMADKL CC AESSEAVAALELRMRNLTRRKLELERRELEFEEQLILAESKRIGDGFISMIDDEGESNKMVDWSRRGDSE CC GHSSHHPNVSPSSSRTVPPGNAQQGNDSVVSQQRQPRSGEYLNIADGAVDTYPFRNSIGTTLMNVNQSQI CC LARKASCKELPYFSGKPEEWPIFIATYETSTAACGYSDEENTLRLQRALKGKALEAVQPCLLHASNLTSV CC IETLRMLYGRPEIIVHSLIHRIHQMPAPRIERLETVIDFGMAVRNMCATITASGLEEYKCNVALMHELVE CC KLPHALRLDWARHRMQSSSATLSEFGKWIETQVKAASLITLPSLEFRPERKLDHKSRTHHMNIHNATGLV CC SEGSQICLLCEATCVDLSQCEQFNRLNVNDRWATIRRLNACRKCLKIHTYGCKGKKACGKNGCEYLHHEL CC LHNPERHSKPNEYAVATASLNAHSEVTGDVFLKYIPVIVYGRDKAITTFAFFDSGSTGTFIEHSLIEELG CC LEGQPHPLCLKWTGNSERDETGSIRVSLLEISGVGQNSSVHIIPKVHTVQSLSLPAQTLAVTQLTKQYAH CC LQGLPIHAYENARPRLLIGIDNHHLVRPTRYAEGGKYEPVAAKTSLGWIVYGPRNKNSNSVNIQAIHTIH CC ICNCGADAEANLDAAVKNFFTLESLGIVKPLETLRSKDDERALGILNAESKFNGKHYESGLLWRFDDVKL CC PCSRDMAMRRYKCLQKRMSKDSVLAKAVIEKMRDYERQGYIRRLSPEELSMKGSRDWFLPIFPVFNANKP 
CC GKVRVVFDAAAKVQGVSLNTYLLNGPDLLAGLLSVLYKFREHRVAIVGDIKEMFFQVRMKPDDQRSQMIL CC WNENDFTGSEPDVYAVAVMTFGAACSPSTAQFIKNLNADRFADKYSRAVKCIKEEHYVDDLLASAETDEE CC IITLAEQVRHIHAEGGFEIRNWLSNSHRVTSHLQREASPEDKINMCSDDRTEKVLGMWWDTLTDTFTFKL CC SPKHDIQLLSGGRMPTKRDVLRTVMAIYDPMGIIANFLMYVKILMQEIWRAGLGWDDVLSGRLAEKWSVW CC VAVLPTISQVRVPRCYRQFTSVNAKIQLHVFCDASENGMAAVAFFRFNDGGIIECSLVGAKTRVAPIKFV CC SIPRLELQAAVIGARFAAAIIAQHRIAIERIFYWTDSRDVICWMRSDHRRFSQFVAFRVGELLETTSVHE CC WRWLSTKLNVADDGTKWQKVPTADPDSRWFRGPDFLWKPECEWPVPEQYPTDTKEELRLHMMHHNTNTKP CC WIQLERFSSWNRLLRSVVYVLRFVSFIVSKRTARTTGPLTQGELEKAEYAILRLAQRDAFSQEIRRLTDA CC RDASEQRCTWKSVLPKTSILAKLSPEVDDNGVLRMRGRLTNCPWVSESTKRPVILPRQHQVTALILADFH CC RRFQHINHHAAINAIRSKLYIPRLQAEFNRIRRTCQHCKNRDAKPEPPEMGNLPSERLAAYQKPFSFTGV CC DYFGPVTVAVGRRVEKRWGVLFTCLTTRGIHLEAAHSLTTSSCILAIRRFIARRGQPLEFISDNGTNFVG CC ASRELAEAWEAIDKQRLAEEFTTPRLAWKFIPPGGPHFGGCWERLVRSVKKAMSEIRMSRLPTDEVLTTA CC LTEIEAMLNSRPLTQVPLDSESELPLTPNHFLLGTANGEAPKAVFSDDIATLKTTWKVSEVMANLFWKKW CC VAYYLPTLTRRVKWHHQVRPIKEGDIVVIVDPNLPRNTWPKGRVVAVIQSKDGQVRRATVATSTGIYERP CC ATKIAVLDVQQENNTYPPETNSxQH. 
XX SQ Sequence 6242 BP; 1701 A; 1509 C; 1620 G; 1397 T; 15 other; ttttaaaaaa attccacgtt tggaaatttt acgcgtttac gcgagtgcgt gagtgaaccg 60 tgtcgcaaga cagttgaggc cgcgaaacac ctaggaacgc gaccaacaga cgatacgtcg 120 gacggcraga gaatagttga ggtgtacgtt acacattaca ccaacatacg caatttctca 180 ctatggaaca cgagcacgaa tygaaagcac ctccagcggt agcacaaacg aacgagtctc 240 agcaacctga aacgggagca ctggatcata ggccactygc accggccgcg gattcacttc 300 acttccgatc gttcgcgccg actggacaag attggacggt taactcgtct ttggagatcg 360 ttgaccgcgc aaggtcatcg accccgaacg caccattagg tgcaggattg ttgctgaggg 420 raccgcagca acggttgact tcggaggttc gcgaagcccc tgcatcgcgg tacacaccgc 480 ttgcaaactc tactgtggca gacctggggm ctttacatcg ctcatcctcg gaaaccgtgc 540 cacggttagc gcacctaccg accactgcgg acgtrcaaca accagcgcca tcgcagggag 600 gcgtttctgc aggaaccgtg ccgcgggcgg cggacggtgc atctggagcg gccatcgagg 660 acatggaact acsaccgcaa gcgcaggacg gcgcttctgc gggaaccgta ctgcggacgg 720 cggtcggcgc atctggggca gccatttcga acgttcatca gccgacgcca tcgcaaaccg 780 gcacacccgg gaggacgatg gcggacaagc ttgcggagtc gagcgaagct gttgccgcct 840 tggaacttcg gatgcgaaat ttgactcgac gaaagctgga gcttgagaga cgagagctgg 900 agtttgaaga gcagctgatc ttggctgaat cgaagaggat cggcgatggc ttcatcagca 960 tgattgacga cgaaggtgag tcaaacaaaa tggttgattg gtcgagacgt ggggatagtg 1020 aggggcatag ttcacaycat ccgaatgtgt caccttcttc atcgcgtact gtgcctccag 1080 gaaacgcaca acagggaaat gattcagtgg tcagtcagca gcgtcagcct agaagtggag 1140 aatatttaaa tattgctgat ggggcggttg atacttaccc gttccgtaac agcatcggaa 1200 ccacgttgat gaatgtcaac cagagtcaaa ttctggctcg aaaggcaagc tgcaaggagt 1260 taccgtattt ttccggcaaa cccgaggagt ggcccatttt tatcgccaca tatgaaacct 1320 cgacagctgc ttgtgggtac tccgatgaag agaacacttt aagactgcag cgggcgctta 1380 aaggcaaagc tcttgaagcc gttcaaccat gtttgcttca tgcctctaac cttactagtg 1440 tgatagagac gctgcgcatg ttatacggtc gcccggagat aattgtgcat tcgttaatcc 1500 accgcataca tcaaatgcct gcaccacgaa tcgagcgctt ggaaactgta atcgattttg 1560 ggatggcggt ccgaaatatg tgtgccacta taaccgcttc ggggctagaa gaatataagt 1620 gcaacgtggc attaatgcac gagttggtag 
aaaagttgcc tcatgcgctt cggctagact 1680 gggcacgaca ccgtatgcaa tcgagctccg cgacactctc ggagtttggc aagtggatag 1740 aaacccaagt taaggcagct agtctgataa cattgccatc cctagaattc aggccagaga 1800 gaaaactcga ccataagagt agaacgcatc acatgaacat acataacgct acggggttgg 1860 ttagtgaagg aagtcaaatt tgtctactgt gtgaggccac ttgtgttgac ctktcgcagt 1920 gtgagcagtt caacagacta aatgtgaatg atcgatgggc aactatacgc agactgaatg 1980 cttgtcggaa gtgtctgaaa atacacacct acggctgcaa agggaaaaag gcttgcggga 2040 aaaatggttg cgagtayctg caccacgaac tattacacaa tcccgagcgc cattcaaaac 2100 caaacgagta cgcggttgcc acagcatcac tgaacgcgca ttctgaggtt accggtgatg 2160 tatttttaaa atacatccct gtaatagttt acgggcgaga caaggccatc acgacctttg 2220 ccttttttga cagtggatct accggtacct tcatcgagca tagcctaatt gaggarctcg 2280 gtctggaagg acagccccat cctctgtgcc tgaagtggac aggtaactcg gaacgcgacg 2340 aaacgggttc aattcgcgtg tcacttttgg agatttcagg agtcgggcaa aacagcagtg 2400 tacatattat yccaaaggtg cacacggtgc aaagcctttc tctaccagcc cagactctgg 2460 cagtgacaca actgactaaa cagtacgcgc atctacaggg tttrccaatt catgcgtacg 2520 aaaacgctag gcctcgattg ctcattggta tcgacaacca tcatttggtt cgaccgactc 2580 gttacgcgga aggtgggaaa tatgagcccg ttgcggcaaa gacgtcattg ggatggatcg 2640 tatatggccc acgtaayaag aactccaaca gtgttaacat acaagcaata cataccatcc 2700 acatatgcaa ctgcggcgca gatgcagagg ccaatctgga cgcagcggta aaaaactttt 2760 ttacgctcga atcgctcggt atcgtcaaac cgttggaaac acttcgttca aaggatgatg 2820 agcgagcgtt ggggattctc aacgcggaat cgaaattcaa cggaaagcat tatgaatccg 2880 gattgctgtg gcgtttcgac gatgttaagt tgccatgctc tcgtgacatg gctatgagaa 2940 gatacaaatg cctgcaaaaa agaatgtcga aagattctgt attagctaag gcagtcattg 3000 agaaaatgag ggactatgaa aggcagggtt atatacgacg attatcgcct gaagaactgt 3060 cgatgaaagg ttcccgagat tggtttctgc ccatatttcc ggttttcaat gcgaacaaac 3120 caggcaaggt tcgagtagtt ttcgacgcag cggcgaaggt tcagggtgtt agcctcaaca 3180 catatttact aaatggacca gatctgttgg cagggcttct atccgtatta tataaattcc 3240 gtgagcatcg tgtggccata gttggggata ttaaggaaat gtttttccaa gtccgtatga 3300 aaccagatga tcaacgatcc cagatgattt tatggaacga 
gaacgacttt acaggtagcg 3360 aaccggatgt ctatgcagtt gctgttatga ctttcggtgc tgcgtgttcg ccgagtaccg 3420 cacagttcat taaaaatttg aacgcagacc gttttgctga caaatattca cgggctgtaa 3480 aatgcataaa ggaagaacat tatgtagacg atcttttggc cagcgctgaa actgacgaag 3540 aaatcatcac gttagctgaa caagtccggc atatccatgc tgaaggtggt ttcgaaatcc 3600 gaaactggct gtctaactca catcgtgtga catctcacct acaacgcgag gcatcacccg 3660 aagacaagat aaacatgtgc tccgatgatc gcaccgaaaa ggttttgggt atgtggtggg 3720 atacgttaac cgacacattt actttcaaac tatcccccaa acacgatata cagctgctat 3780 ctggaggacg gatgccaaca aaacgcgacg ttctgcgaac agtgatggcg atttacgatc 3840 caatgggaat catcgcgaat tttctcatgt acgtaaaaat tcttatgcag gaaatttggc 3900 gtgctggcct tggatgggat gatgtgttgt ccggtcgact agccgaaaaa tggagcgttt 3960 gggttgcggt attgccgact atttcacaag tacgtgttcc taggtgttat cgtcagttca 4020 cttcggtaaa tgcaaaaata cagcttcatg tattctgcga tgccagcgag aacgggatgg 4080 ctgcagtggc gtttttccgt ttcaacgatg ggggtattat cgaatgctct ctggtgggtg 4140 ccaaaactcg cgtagctccg attaagtttg tatccatccc gcgcctagaa ctacaggccg 4200 cagtcatcgg agcccgcttt gctgcagcga ttatcgccca acaccgaata gccatcgaac 4260 gcattttcta ctggaccgat tcacgtgacg tgatatgctg gatgcggtct gatcatcgcc 4320 gatttagtca gttcgttgct ttccgagtag gcgaactcct ggagacgact tctgtgcatg 4380 agtggcgctg gctatcaaca aagttaaacg tagctgatga tggaacgaag tggcagaagg 4440 taccaactgc agatcccgat agtcgctggt tccgcggacc ggacttcttg tggaagcccg 4500 agtgtgagtg gccagttccc gaacaatatc ccaccgacac gaaggaagag ttacggctac 4560 acatgatgca tcataacaca aacacgaagc catggattca gctcgagcgt ttttcatcct 4620 ggaatcgctt gttaaggtca gtagtgtacg tcttacgatt tgtctccttt atcgtttcta 4680 agagaacggc tagaacaaca ggaccgctaa cacaaggcga gcttgaaaaa gcagaatatg 4740 ccattcttcg cttggcacag agggatgcct tttcacagga gatacgacga ttaaccgacg 4800 cacgtgacgc aagcgaacaa cgttgcacat ggaaatccgt gttgccgaag acgagcatac 4860 ttgcaaagct gtcccccgag gtcgatgaca acggagtact gagaatgcgc ggacgcctga 4920 ccaattgtcc ttgggttagc gaatctacca agcggcccgt gatcctgcca cgacaacacc 4980 aggtgacagc actaatttta gcagattttc atcgtaggtt ccagcatata 
aaccatcacg 5040 cggcaataaa tgcgatacga agcaagctat acataccgag actgcaagcc gaattcaatc 5100 gtatccgtag gacatgtcag cactgcaaaa accgtgatgc caagccagag cctccagaga 5160 tggggaacct tccctccgag cgtcttgctg cttatcagaa gccgttctca ttcacgggcg 5220 tcgactattt tggcccggta accgtagcag ttggtcgacg agtcgaaaaa cgttggggcg 5280 ttctcttcac ctgcctcacc accaggggca tccatctaga ggctgcgcac tccctgacca 5340 catcatcatg catcttagct atacgtcgtt tcatcgctag gcgtgggcaa cctttggaat 5400 tcatcagcga caacggtacg aacttcgtcg gtgcatcgcg agagcttgct gaggcctggg 5460 aagcgataga caaacagcgt ttagcggagg agtttacaac accgcgtctc gcgtggaaat 5520 ttatcccacc tggtggacca cactttggag gctgctggga acgactagtg cgatccgtca 5580 agaaggcgat gagcgaaata cggatgtctc ggctaccaac cgacgaagtg ttaacgacgg 5640 cattaacgga gatcgaagcg atgcttaatt ctcgccctct tacacaggta ccacttgaca 5700 gcgaatccga gcttccttta accccgaacc actttctgct agggacagct aacggagagg 5760 caccgaaagc agtattcagt gacgacatcg ctaccctaaa aaccacatgg aaggtatcag 5820 aggttatggc caatctcttc tggaagaagt gggtagcata ctacttaccg accttaacgc 5880 gcagagttaa gtggcatcat caggttcgtc cgattaagga gggcgacatt gtagtgatcg 5940 tggatccaaa ccttccccga aacacttggc ccaagggacg ggtagtggcg gtcatccagt 6000 cgaaggacgg gcaagtccgg cgcgcaaccg tagctacaag caccggaatc tacgaacggc 6060 cggccacgaa gatagccgtg ttggatgtac aacaggaaaa taatacttac ccgccggaga 6120 ccaacagtcr acagcactaa gaataaccca cgaaatggta agcgaaaaca catggtcacg 6180 acacaacaac aataacgaac tgaagagtca agagcttccg ggcaattgac tgggcgggag 6240 aa 6242 // ID GYPSY12-LTR_AG repbase; DNA; ANG; 806 BP. XX AC . XX DT 03-OCT-2003 (Rel. 8.09, Created) DT 03-OCT-2003 (Rel. 8.09, Last updated, Version 1) XX DE GYPSY12-LTR_AG is an LTR of retrotransposon GYPSY12_AG - a DE consensus sequence. XX KW Gypsy; LTR Retrotransposon; Transposable Element; KW 4-bp TSD GYPSY12_AG; GYPSY12-I_AG; GYPSY12-LTR_AG; Gypsy clade; KW mdg1 lineage. 
XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-806 RA Tubio M.J., Costas C.J. and Naveira F.H.; RT "GYPSY12_AG, a member of the mdg1 lineage of the Ty3/gypsy group RT of LTR retrotransposons in Anopheles gambiae."; RL Repbase Reports 3(9), 167-167 (2003). XX DR [1] (Consensus) XX CC GYPSY12-LTR_AG is a long terminal repeat of GYPSY12_AG CC (its internal portion is deposited as GYPSY12-I_AG). XX SQ Sequence 806 BP; 321 A; 190 C; 133 G; 162 T; 0 other; tgtagcatgc acatgcacat gctatactgt ctataatcaa ttcacacgca attctcatca 60 cacttaataa acacaaattg tcaacaccac acaacacaca aaaaaggata aaataaccac 120 tcaagaaatt aacacaaccc aaaccacata aataacctta cgccaggaat tgtaagtcag 180 cagaaaaacc cttgtccaaa aacacttaaa aacacacaac ccgcacacag ccccgtaggt 240 gcgaaaacag cagaaaaaca ttacccagca caaaaacgtt tcaagtgcgt aaagtgtaaa 300 aaacacagca ttgcataata gcaaccaaca gttaataaat aatgaaatag gagacaccgt 360 acctcgtgtg tacaaacaga accgttaatc gtaagcgtag gaaacaaaac accacgtaag 420 cgccagaacc cagaactgac cttacagaat aattgcaacc atatgttaac cgaaaaacct 480 aaccctcagc ataaccgaca caaagcgaaa gtaaacgcaa aatcgaattg aagaaaggta 540 ccgtataaat aaagatgcaa gatgagacca ggacacacag gcagagacag ttagaatcac 600 agtcacgcac agtacgtaca gttgtaaagt ggtgctgcta agtcttggtc ggtcaagtct 660 cgatttgcaa acaaagtgaa atgtgtattt aattcagttg tgtttgtccg aaacctccac 720 catcgtgctt acaaatatac ctgaagagtg atgtctatag tatggcatcg ttgaactatc 780 cgtcgcgtcc gaaccaactt attaca 806 // ID MARINERN1_AG repbase; DNA; ANG; 1056 BP. XX AC . XX DT 13-DEC-2002 (Rel. 7.11, Created) DT 13-DEC-2002 (Rel. 7.11, Last updated, Version 1) XX DE MARINERN1_AG is a nonautonomous DNA transposon - a consensus DE sequence. XX KW Mariner/Tc1; DNA transposon; Transposable Element; Nonautonomous; KW MARINERN1_AG; nonautonomous DNA transposon; KW mariner/Tc1 superfamily. 
XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-1056 RA Kapitonov V.V. and Jurka J.; RT "MARINERN1_AG: a family of nonautonomous mariner/Tc1-like DNA RT transposons from African malaria mosquito."; RL Repbase Reports 2(11), 19-18 (2002). XX DR [1] (Consensus) XX CC There are several hundred copies of MARINERN1_AG in the genome. CC They are ~98% identical to the consensus sequence. CC MARINERN1_AG copies are flanked by 2-bp target site duplications. CC This element has 23-bp terminal inverted repeats. CC Putative classification: a nonautonomous Mariner/Tc1-like CC DNA transposon. XX SQ Sequence 1056 BP; 352 A; 172 C; 168 G; 363 T; 1 other; gggtaaatgt acctatagtg gtggtagtac caatagtggt gctattgctc taaaatgcgg 60 ttcatagggc aaattaaaag tatttaagtt atttaagtta gtgtgaaatg ttctttagcc 120 ttttacaaag ggtttaaaag ttaaatgggg ccattccgtt acttagtgac cgaaaaatcc 180 attattttgt gaaatgtgtc aacatttttc caaaatcctt aggatttcat aacgaaactc 240 atatttctgt taacgttcta aatgatcaca tacccacgca attactagtt tttcaacatg 300 actgttatgg aatgttattg ttttactaaa attatttgca ttttgaccat atttatgcat 360 ttttaagtgt tttttaccat agtgccatta taggtacaca aaacaggcat gttcctatag 420 tggttatagt tataaaatca ataaatacag gtcattttag attttttcga ggaaattttt 480 gacatgtcgt actgttttca ctaaaataag caacagaaat gagttttatk taggtaattt 540 accaaaaaca ggccaaaatt cgcttatctg gctgaatgaa aaattgactt caaatcaagc 600 acactcttcc gggtgttggt tgatgacagg tgtgatgcag tactttagtg aatagtttca 660 tcaatcaaat ttatacattt ttttacgatc aaattacgca agaaaccaat attatggcag 720 taatcgttgc tattcacacg aaaacatctt caaaactaag ttatattgat attatacact 780 gttttgaggc atccttgttt accaccacta taggaacaca atacgacgac tataggaacc 840 ataccaccat tataggtaca acgaagcaat cacaaaaata tgtatttttt cgtaaatttg 900 atgtgtttca tggtaaaaat ggttgcgatt gcgaagataa aacatattcc aattgctatg 960 tgttaaaata ttcagaaatt gtccttcctg acactttaaa tccattaaaa cacatcggta 1020 
cctctacaaa ttgcaccact attggtacat ttaccc 1056 // ID CR1-9_AG repbase; DNA; ANG; 4308 BP. XX AC . XX DT 28-FEB-2009 (Rel. 14.02, Created) DT 01-MAR-2009 (Rel. 14.02, Last updated, Version 2) XX DE CR1-like non-LTR retrotransposon - a consensus sequence. XX KW CR1; Non-LTR Retrotransposon; Transposable Element; CR1-9_AG. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-4308 RA Jurka J.; RT "CR1-like non-LTR retrotransposons from African malaria RT mosquito."; RL Repbase Reports 9(2), 635-635 (2009). XX DR [1] (Consensus) XX FH Key Location/Qualifiers FT CDS 141..4208 FT /product="CR1-9_AG_1p" FT /translation="MEGVCCDCSVPLESLECGIMCSFCDAVYHSACTKSPF FT TVIEEVKRTASLHWSCVGCSNALGNPHSKAVKTTGMQVGFQAALTAAVEVM FT KAALVPPVMHEIREGFASFAAAHQAPSEHFNVPPPDALPSGKRRRLFRDVV FT ASTEAIVVDNAPLYPVNTDTDNTHQRNRPSLPPIITGTDTTTTSIPTVSQL FT PRTNYMWLHLSRLAKTVTVDQVVSMVKSQLDTTDVIAFSLLKAGTTISSVS FT SLTFKVRIIAALRDKALTAGSWPVGLGVREFISLPQRPSHHRPDLTNTNTP FT IPVRHSPMHPVQLSPDMNCTLISPAPQKQLQPHQTPLKCTKLLFNTFFLSN FT NHTTVSSKGTVPLNLTSHLFTRNNNTSTTRSLTDDLAPTYTRIPLNVRSTS FT ITVYYQNVRGLRSKADEFRLTVLEADYDVVVLTETWLDPSLPTALLFGDAH FT CVYRCDRNAANSSLSRGGGVLIAVSSALRSHALPVLAQTLEFACICIQLPT FT YRLYIAAAYLPPNHSTDEGKINALLDTVNSICDTLGPNDRFVLVGDFNQPA FT LSWSPAQSNHETPFVFFEPHTSSVRSAQFVDGLHQNALYQLNNNTNSMGRI FT LDLAYGNWSSAATSSLIRVSEHPLLPIDCYHPPLEFDLEITASSMNHATTV FT TEIKLNYARADLTKLKRLICSFNQSFECSNYTSIDHATNDFSEFMRAALRE FT CAPIKKPHRGPPWGDRTLHALKKAKRAAYDYLRANRSTPCRLYYNGAHALY FT RRYNRVCYRRYIRQTERSLRKYPRRFWSFADNMRKSTGLPGTIRYKDNCAH FT SPLDICELFATRFEDSFISNITPPEAVASALVNTPADAIHVTLPIVTETLV FT TQKIERLKSSYSEGPDGIPATILKRCAVSVAPVLTKIFNESLRTGTFPSLW FT KVSWLTPIHKKGSKNEAVNYRGITSLAACAKVFELIIYEPLMASAGKYIST FT NQHGFMPKRSTTTNLMQFVSGCYKSIDDGMQVDAIYTDIKAAFDSVSHNIL FT LAKLDRLGLSPPLVKWMKSYLCGRSYAVRMGSHQSRSIGASSGVPQGSNLG FT PLLFLLYISDLCTALPDNSCLLYADDAKIYREIREPEDHLQLQATLTEFVS FT 
WCKRNALSVCIDKCAIISFSRSRAPALFNYTLDGQNLERVNCVKDLGVLLD FT AKLSFEEQIDQVVASGNRLLGLVINMTRELRDPMCFKTLYCALIRPLLEYA FT SIVWWPASTRALARLESIQRKATRFALRDWPRRLDYRTRCLLLGIPPLAER FT VEHTRLAFITGILNGTIDCPELLSRIHLYVPARILRRRPMLAVAETRTTFG FT SRNPLLCMCRLLNSANDIYEPGMTITELTSLLSVRNAFNNNNI*" XX SQ Sequence 4308 BP; 1122 A; 1153 C; 896 G; 1135 T; 2 other; gtcaaactgc cgcttagctg tgacttaagc tgtccctggt agtcctgcct tcaatcgttc 60 gtcacaacca ttcctgactt tgctgtattt cctgcgtttg ttcctgagtc ctgtggcatc 120 tatctgcaac ggcaactaaa atggagggag tttgttgtga ctgctctgtg ccacttgaat 180 cgcttgagtg tggtattatg tgctcgttct gtgatgctgt atatcactca gcgtgcacta 240 aatcaccttt cactgtgatc gaagaggtaa aacggacggc atctcttcat tggagctgtg 300 tagggtgttc gaatgctctt ggcaacccac atagtaaagc agtcaaaaca acgggtatgc 360 aggtcggctt ccaggcggct cttactgccg ctgtcgaggt gatgaaagct gctttagtcc 420 cgcctgtgat gcacgaaatc cgtgagggct ttgccagttt tgccgcggct caccaagctc 480 ctagcgaaca cttcaacgta ccgcccccag atgccctccc gagtggaaag cgtagaaggt 540 tgttccgtga cgttgttgca tccacggaag ccattgttgt tgacaatgct ccactatacc 600 ctgtaaatac agacaccgac aatactcacc aaagaaatcg cccttcacta ccaccaataa 660 taacaggaac cgataccacc accacgtcga tcccaaccgt gtcacaacta ccaagaacaa 720 actatatgtg gctacatcta tctaggttag ccaaaaccgt caccgttgac caggtagtgt 780 cgatggtaaa atcacagttg gacaccacgg acgtcatcgc ttttagtttg ctgaaagcgg 840 gaacaaccat tagctcggtt agttcactaa cttttaaggt tagaattatc gctgctcttc 900 gagataaagc acttaccgca ggatcttggc cggtcgggct cggtgtgcgt gagtttatct 960 cactcccgca aaggccgtct caccatcgtc ctgatcttac caacacaaac acacctatac 1020 ctgttcgaca tagcccgatg caccctgtac aattatcacc cgatatgaat tgtaccttaa 1080 tatcgcctgc tccacaaaaa caactacagc cacaccaaac accgttaaaa tgcaccaaac 1140 tactattcaa cacttttttc ttaagtaaya atcacacaac tgtttcctca aaaggcactg 1200 tgccactgaa tctgactagt catttgttta cacgaaataa caacacgtca acgacacgat 1260 ctctcaccga tgaccttgca ccwacttaca cacgcatacc actcaatgtc cgttccacgt 1320 ctataactgt atactatcag aatgtgcgag gtttacgatc caaagctgat gagttccgtt 1380 taactgtttt agaagcggac tatgatgttg tggtactcac 
tgagacttgg ctcgatccca 1440 gccttcccac agctctccta tttggcgatg cgcactgtgt ctatcgttgc gatagaaatg 1500 ccgcaaacag ctctctatct cgaggcggtg gagtactaat tgcggtgagt tccgctctcc 1560 gctctcatgc gcttccagtg cttgcgcaaa cacttgaatt cgcttgcatc tgcattcagc 1620 tcccgacgta ccgattatac attgctgctg cctaccttcc tcccaaccat agcacggacg 1680 aaggaaaaat aaatgcatta ctggacaccg tcaacagcat ttgcgacaca ctaggtccaa 1740 acgacagatt cgtgctcgtg ggagacttta accaacccgc tctttcctgg tcgcctgcgc 1800 aatcgaatca tgaaacccct tttgtgtttt ttgaacctca tactagttca gttcgtagcg 1860 cccaattcgt cgatggactg caccagaatg cgctttacca acttaataac aacactaact 1920 ctatggggcg cattttagac cttgcatacg ggaattggtc aagtgcagca acttcttcgc 1980 tcatacgcgt cagcgagcac ccgctgctgc caatagactg ctatcaccca ccgctcgaat 2040 ttgatttgga aatcacagca tcgtctatga atcacgctac cactgttacg gaaataaagc 2100 tcaattacgc acgtgctgac cttactaaac tcaagagact gatctgctcc ttcaaccaat 2160 cctttgaatg ctcaaactat acatccattg accatgccac gaacgatttc tcagagttta 2220 tgcgggccgc attgcgtgaa tgtgctccta ttaaaaagcc ccatcgcgga ccgccgtggg 2280 gtgaccgcac attacatgcc ctaaaaaaag caaagcgagc cgcttatgac tatttacgcg 2340 ctaatcgatc cacaccgtgt cggctttact acaatggtgc tcatgcactt tatcgacgtt 2400 ataatagagt ctgctatcgt cgatatatac gccagactga acgaagcttg cgaaagtacc 2460 ctcgccgctt ttggagcttt gccgacaaca tgcgcaagtc aactggcctc cccgggacta 2520 ttcggtataa ggataattgt gcacactcgc cattagatat ttgtgaactg ttcgccacgc 2580 gtttcgagga tagcttcatc tctaacatca cacctccgga ggctgttgct tccgcgctcg 2640 tgaacactcc ggcagatgca attcatgtca cgctacctat cgtcaccgag accctggtta 2700 cccaaaaaat cgagcgtcta aagtcgtcat actccgaggg gccagatggt atcccagcga 2760 caattctcaa gcgttgtgcc gtatcggtag ctcctgtttt aactaagatt tttaacgaat 2820 ctctgcgcac aggaaccttt ccatcactgt ggaaggtctc ttggctaaca cccatccaca 2880 aaaagggcag caaaaacgaa gcagtgaact atcgaggtat tacctcttta gctgcctgcg 2940 ccaaagtgtt cgagctaatc atctatgaac cattaatggc atctgccggc aaatacatca 3000 gcactaacca acatggattc atgcctaaga gatcaactac gacaaatctg atgcaattcg 3060 ttagcggctg ttataagtcc attgatgatg gtatgcaggt tgatgccatt 
tacacggaca 3120 ttaaggcggc ttttgacagt gtgtcccaca acattcttct ggcaaaactt gatagacttg 3180 gattatcacc acccctggta aagtggatga aatcgtattt atgtggtcgt tcatatgcag 3240 ttagaatggg ctcccaccaa tctagatcta tcggtgcctc ttccggcgta ccccagggca 3300 gcaacctggg gcctctgttg tttctgctct acataagcga cttgtgcacg gcactcccgg 3360 ataacagttg cctactctac gcagatgatg ctaaaatcta tcgtgaaatc cgtgagcccg 3420 aggaccatct tcaacttcaa gccactttga ctgaatttgt ctcctggtgc aagcgtaatg 3480 cactaagcgt ctgtatcgat aaatgcgcca taatctcatt cagccgttca agagctccag 3540 cgctgtttaa ttatacgctt gatggtcaga acctcgaaag agtgaactgt gtaaaggatc 3600 taggagtcct cctagatgct aaactctcct tcgaagaaca gatagaccaa gtagtcgcta 3660 gtggcaatcg tctacttggc ttggtcataa acatgacgcg cgagcttcgt gatcctatgt 3720 gtttcaagac actgtactgc gcacttatca gaccactgct ggaatatgct agcattgttt 3780 ggtggccagc atctactcgg gcacttgcgc ggctagagtc cattcaacgg aaggcaacgc 3840 gcttcgccct tcgtgattgg ccgcgtcgtc tcgactacag aactaggtgt ctgctgcttg 3900 gaatcccgcc tctcgccgaa cgtgttgagc acaccagact agcatttatc acgggaatct 3960 taaacggaac catcgactgt cccgagctgc tttcaaggat tcacctctat gttcctgcca 4020 gaatactccg tcgccgacca atgctggcag tcgctgaaac ccgaacaaca tttggctctc 4080 gcaaccccct tctttgtatg tgccgtctat taaattcagc taacgatatt tatgagcctg 4140 gaatgacgat aacggaactg acttcacttt taagtgttcg gaatgcgttc aacaataaca 4200 atatataaat gttctgttcg ttatctaatg taatgtattg taaacaaatt ttgactcgag 4260 agggcttcat agtccatcga ttaataaact aaactaaact aaactaaa 4308 // ID GYPSY55-LTR_AG repbase; DNA; ANG; 257 BP. XX AC . XX DT 09-JUL-2004 (Rel. 9.06, Created) DT 09-JUL-2004 (Rel. 9.06, Last updated, Version 1) XX DE GYPSY55-LTR_AG is an LTR of retrotransposon GYPSY55_AG - a DE consensus sequence. XX KW Gypsy; LTR Retrotransposon; Transposable Element; KW 5-bp TSD GYPSY55_AG; GYPSY55-I_AG; GYPSY55-LTR_AG; Gypsy clade; KW mag lineage. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. 
XX RN [1] RP 1-257 RA Tubio M.J., Costas C.J. and Naveira F.H.; RT "GYPSY55_AG, a member of the mag lineage of the Ty3/gypsy group RT of LTR retrotransposons in Anopheles gambiae."; RL Repbase Reports 4(6), 150-150 (2004). XX DR [1] (Consensus) XX CC GYPSY55-LTR_AG is a long terminal repeat of GYPSY55_AG (its CC internal portion is deposited as GYPSY55-I_AG). XX SQ Sequence 257 BP; 63 A; 64 C; 66 G; 64 T; 0 other; tgttgtggga acgcagcctc agcagcgcca gctgctcgac gtcgacattc gcctgtcgac 60 tattgaagca cgatcgtgtt gagtgcctga gtgttcccgt attggcgaac aaggaccgga 120 gccgcatgag agttgatacc gagcgctcaa gggcaacgga cggttgcaca cccacacacc 180 agtattcgcg tatcgtccta tgtgttattt taccgtgttt cctattaagt tttaaataaa 240 gtgtaagaat cacaaca 257 // ID GYPSY6-I_AG repbase; DNA; ANG; 4618 BP. XX AC . XX DT 08-MAY-2003 (Rel. 8.04, Created) DT 08-MAY-2003 (Rel. 8.04, Last updated, Version 2) XX DE GYPSY6-I_AG is an internal portion of the GYPSY6_AG LTR DE retrotransposon - a consensus sequence. XX KW Gypsy; LTR Retrotransposon; Transposable Element; 4-bp TSD; KW GYPSY6-I_AG; GYPSY6-LTR_AG; GYPSY6_AG; Gypsy clade; gag; KW integrase; reverse transcriptase. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-4618 RA Pavlicek A., Kapitonov V.V. and Jurka J.; RT "GYPSY6_AG, a family of LTR retrotransposons from African malaria RT mosquito."; RL Repbase Reports 3(4), 81-81 (2003). XX DR [1] (Consensus) XX CC GYPSY6_AG is a family of autonomous Gypsy-like LTR CC retrotransposons. CC GYPSY6-I_AG, an internal portion of GYPSY6_AG, is flanked by CC GYPSY6-LTR_AG LTRs. The GYPSY6-I_AG consensus sequence was CC reconstructed based on a multiple alignment of 3 copies; they are CC less than 1% divergent from the consensus sequence. Two elements CC are CC nearly 100% identical to the consensus and contain an intact ORF; CC thus CC the family appears to be active.
CC The consensus sequence encodes the 1511-aa GYPSY6_AGp protein CC (pos. 28-4560), composed of gag (zinc-finger, 317-353), reverse CC transcriptase (pos.690-859) and integrase (pos. 1207-1350) CC domains. XX FH Key Location/Qualifiers FT CDS 28..4560 FT /product="GYPSY6_AGp" FT /translation="MSKEDGKVLRELESVSCPNSEPATIGKIRALYNDVLS FT ITNTIESECIDQGDVDPKPTYSVTKEEASSAKPMHLSNHVEDEIAILKQRL FT EILELRSKIAVLENEKQASKINQPLLPEALKQIIPQFEDGVGVNHWLKVID FT HSSEMYGWDDNTTMLYASSRLVGAAKEWYNGCRSEICTLSEFTAGLKRAFP FT DNRNEAAIHKELMSVAKRRDETYTAFIYRVNAIGKSGNVSDEAILTYIING FT LARDRIYDNLVVRDYKDIYELMAHIQRCETLLKMRDTPSTSSYQRLENTRV FT NENQREKPNRAAQINRKDALASVRCYNCSSFGHYSGQCLKPRRTIGACFSC FT GQHDHQKLECANAAGKNNIPVFATNETTSERDSYADIEKIKAYQQVSVSLE FT TLNGVKKFHSYVYSLFDSGSPTSFISETFVPDGAAKKASPSGYKGLGNKPL FT SSLGHIVCNIGFRHKIISHRFIILSKSDMDWPIIIGRDALDKLNVHLYHSF FT YLYSKEKLLSLNNKTMNYLKPHMMERLISLGVLKNDSDDRLKRKMPLIKAI FT EENKYNLKEIPFHSLNHAILESSLSNRKKDDNSNQSFISDEFKYGETFAEM FT CAIDVNLPEHDFNIGSIVGIRERKQISSIIIDNYIKRQTGKIYKENHKMTI FT SLTHDTPIFCKPRKLSFAERNHVREIVKDLLQRKIIRQSNSAYASPIVLVK FT KKNGETRMCIDYRPINKITVRDNYPLPLIDTCIEHLGDKQYFTLLDLKNGF FT HQIDMAPASIKYTAFVTPDGQYEYVKMPFGLKNAPSEFQRCINSVLREFID FT NREIVVYLDDIIIASKDLHNHLRILSAVLQTIRKNGLELKLEKCMFVHDEL FT DYLGYHASSKGIRPSNYHIKAIENYPQPSNSKEVQRCLGLFSYFRRFVPSF FT STIAKPLSSLLKQNAPYNFDGPCISAFNTLKEMLIKAPVLAIYDPQKETEL FT HCDASSVGFGSVLLQKQFDGKFHPIAFFSKTASPAEAKLHSYELETLSVIY FT ALKRFHTYVHGVPIRIVTDCNSLVETLKNKNCSAKIARWSLFLENYDYTMQ FT YRPGCSMGHVDALSRQQIAVATQDLDIDSQLRIAQSRDQEIERLKSVIERG FT MVNGYTLHDNLVFRKLENGRLQFRVPRDMINNVIRSTHESIGHLGVDKCCT FT QISKNYWFPLMKTRVENFIKNCLKCIIYSAPARKNNRNLHSIPKIPLPFDT FT IHIDHLGPLPSLQSKKKYILVVIDAFTKFTKLYPTTNTSSREVCNALNQYF FT SYFSRPRRLISDRASSFTSTEFKQFILKHNITHVLTAVCSPQANGQVERVN FT RVIVPILSKLSDPIDHADWSSKLIAAEYALNNTVHSSTRISPSVLLYGIEQ FT KGPNINELTEYLKDNIPTTPRDLESIRVQASENIVNSQFTNETQFMKTHRP FT AIEFEVGEYVVIRNVDNSVNTNKKLIAKFRGPYVIHKRLPNDRYVVRDIEG FT FQQTQIPYDGVLESDKLRRWIASETDLASESLEVSPSDSLAGPDLG" XX SQ Sequence 4618 BP; 1568 A; 860 C; 899 G; 1291 T; 0 other; tctcagaagt 
gggattagtg accaaaaatg tccaaagaag acggaaaagt gttgcgtgag 60 ttggaatcag tcagttgtcc taactcggaa cccgcaacaa ttggcaaaat tcgtgcgctg 120 tacaacgatg tgctatcaat tactaacact atcgaaagcg aatgcattga tcagggagac 180 gttgacccta agccaacgta tagtgtaaca aaggaagaag catcgtcagc aaaacccatg 240 catttatcta accatgtgga agatgaaatt gctattctca agcaaaggct tgaaattttg 300 gaactacgaa gtaaaatagc ggtgcttgaa aatgaaaaac aagccagtaa gattaatcag 360 cctttgctac cagaagcgtt gaaacaaatc atcccgcagt tcgaagacgg cgttggtgtt 420 aaccattggt taaaagtgat cgatcacagc tccgagatgt atgggtggga cgacaataca 480 actatgcttt atgcaagtag ccgattagtc ggtgcagcca aagaatggta caatggttgt 540 cgtagcgaaa tatgtacctt gtccgaattt acagcgggat tgaagagagc ttttcctgac 600 aatcgaaatg aagcggctat acataaagaa cttatgagtg tcgctaaaag gagagacgag 660 acgtacaccg catttattta tcgagtgaat gctattggta aatcgggcaa tgtaagcgat 720 gaagctattt taacgtacat cataaatggc ttggcacgtg ataggattta cgataatctc 780 gtagtgcgtg actataaaga tatatatgaa ctgatggcgc acattcaacg atgcgaaact 840 cttcttaaaa tgcgtgatac accaagtacc tcgtcctacc aacgtttaga aaacacacgg 900 gttaatgaaa atcaacgtga gaaaccgaac agagcagcgc agatcaacag aaaggacgca 960 cttgctagtg tgcgttgcta caactgctct agttttgggc attattccgg ccagtgcctg 1020 aagcctcgac gaacgattgg tgcatgtttc tcgtgtggac agcacgatca tcaaaagctt 1080 gaatgcgcca atgcagccgg gaagaacaat atacctgttt ttgcaactaa tgaaactaca 1140 agcgagaggg atagctatgc cgatatagag aaaattaaag cgtaccaaca agtaagtgtt 1200 agtttagaaa cgttaaatgg ggttaagaag ttccattcat atgtttattc attgttcgat 1260 tctggcagcc cgacgagttt tatcagtgaa actttcgtcc cagatggagc cgcgaaaaag 1320 gcttctcctt ccggatacaa gggactaggg aataaaccac tttcgtcgct tggacacatt 1380 gtctgcaata ttggatttcg tcataaaatc atatctcatc ggtttataat cctttccaaa 1440 tcagacatgg attggccaat cataatcggt agagatgctt tggataagtt aaacgttcat 1500 ttgtatcact cattttattt gtatagtaaa gagaaattat tgagcttgaa taataaaaca 1560 atgaattact tgaagccaca tatgatggaa cgccttatct cattgggggt tttgaaaaat 1620 gattcggatg acaggttgaa gcgtaaaatg ccacttatta aagctataga agaaaacaaa 1680 tacaatctaa aagagatacc attccatagc 
ctaaaccacg cgattctaga aagttctctt 1740 tctaatcgga aaaaagatga taatagtaat caatccttca tttctgatga attcaaatat 1800 ggtgaaactt ttgctgaaat gtgtgctata gatgtcaatt tgccggaaca tgattttaat 1860 attgggtcta tagttggaat tcgggaaaga aagcaaatta gctctattat tatcgataac 1920 tatatcaaac gtcagacagg taaaatatac aaagaaaatc ataaaatgac gataagctta 1980 acacacgata cacctatttt ctgtaagcca cggaaacttt cttttgcgga acgaaatcat 2040 gtgcgtgaaa ttgttaaaga tcttcttcaa agaaaaataa tccgacaaag taactcagca 2100 tatgcttctc caattgtttt agttaaaaag aaaaatggtg aaacaaggat gtgtatcgat 2160 tacagaccaa ttaacaaaat aaccgtacga gataactatc cattaccttt gatagacaca 2220 tgtattgaac acttgggaga taaacaatat tttactttac tggaccttaa aaatggattc 2280 catcaaatag atatggcccc agcatcgata aagtatacag catttgtcac cccagatgga 2340 caatacgagt acgttaaaat gccgtttggt ttaaaaaatg ctccctcaga gtttcaaaga 2400 tgtatcaatt ccgtattacg ggaatttata gataaccgtg agattgtcgt ttacctcgat 2460 gatattatta tagcatcaaa ggacttacac aatcatcttc gaatattaag tgcagtttta 2520 caaacaattc gtaaaaatgg ccttgagtta aagttagaaa agtgtatgtt tgttcatgat 2580 gaactagact accttggcta ccatgcaagt agtaaaggaa tacgacctag caattatcat 2640 attaaagcaa ttgaaaacta tcctcagcct tcaaatagca aggaagtaca gagatgtctc 2700 ggactattct catattttcg aagatttgtg ccatctttct cgactattgc aaaaccttta 2760 agtagtttat taaaacaaaa tgcaccgtat aactttgatg gtccctgcat tagtgcattc 2820 aatacactga aagaaatgtt gataaaagca cctgtgttag ctatttacga tccgcaaaaa 2880 gaaacagagc tgcattgcga tgctagttca gtaggatttg ggtcagtact tctacagaaa 2940 caatttgatg gaaaatttca tcccattgca tttttctcaa aaaccgcatc gcccgcggaa 3000 gcgaaattgc acagctatga attggaaacg ctttcagtta tttatgctct gaaacgattt 3060 cacacgtatg tacatggcgt accaatacga atagtaaccg attgcaactc tttagttgaa 3120 acattaaaaa acaaaaattg ttcagcaaaa atagcacgat ggtcactttt tttagaaaat 3180 tacgactata ccatgcaata ccgtcccggg tgttcgatgg gtcatgtaga tgctttaagc 3240 cgccaacaaa tagcagtagc tacacaagat cttgatattg actctcagct gcgaattgca 3300 caatctcgag atcaagaaat tgaaagactc aaaagtgtca tcgaaagagg gatggttaat 3360 ggttacacct tgcatgacaa cttagtattt cgtaagcttg 
aaaatggaag attgcagttt 3420 cgcgttccca gagacatgat taataatgta ataagaagta ctcatgaaag tataggtcat 3480 cttggggtgg ataaatgctg tactcagata agtaaaaact attggtttcc attgatgaaa 3540 actcgagtgg aaaatttcat aaaaaattgt ctaaaatgta taatttactc tgcaccagcc 3600 cgaaaaaata atcgaaatct acatagtata cctaaaattc cactaccatt tgatacgatc 3660 cacatcgatc acctaggacc tctaccatct ttgcagtcaa aaaagaaata catccttgtt 3720 gtcattgatg catttacaaa atttacaaaa ttatatccta caactaacac tagcagtcgt 3780 gaggtttgta atgctcttaa ccagtacttt tcatatttca gtcgccctag gcggctaata 3840 agcgatcgtg cctcttcttt tacatcgacc gaattcaaac aatttattct taaacataat 3900 atcacgcatg ttctaacagc cgtttgctcc cctcaagcga atggtcaagt agagcgagta 3960 aaccgtgtaa ttgttcccat tttgagtaag ctttctgatc caattgacca cgcagattgg 4020 agttcgaaat tgattgcagc tgagtatgct ttaaataata cggtacattc atctacacgc 4080 atttctcctt ctgttttact gtatggaatt gaacaaaaag ggccgaacat aaatgaatta 4140 acagaatacc tcaaagacaa tattccaacc actccccgag atttagaaag catccgtgtt 4200 caagcatctg aaaatattgt caattcgcaa ttcactaatg aaacacaatt tatgaaaacc 4260 caccggccag ctatagaatt tgaagtcgga gaatatgtcg ttatacgcaa tgttgacaat 4320 agcgttaaca cgaataagaa attaatcgct aagtttcggg ggccttatgt tatccataaa 4380 cgcctcccta acgatcgtta tgtggttcga gacatagaag gatttcaaca aactcagata 4440 ccttacgacg gagttttaga atctgataag cttaggagat ggattgcatc tgaaaccgat 4500 ctggcgtccg aatcattaga agtaagtccc tcagatagtt tggcaggacc agatttaggt 4560 tagttaggtt aattataatg taaattgagg tcaattcata cgtcaggata ggccgagc 4618 // ID RT2 repbase; DNA; ANG; 6733 BP. XX AC M93691; XX DT 14-SEP-2005 (Rel. 10.09, Created) DT 24-SEP-2010 (Rel. 15.1, Last updated, Version 2) XX DE Anopheles gambiae RT2 retroposon. XX KW R1; Non-LTR Retrotransposon; Transposable Element; RT2. XX NM RT2. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-6733 RA Besansky N.J., Paskewitz S.M., Hamm D.M. 
and Collins F.H.; RT "Distinct families of site-specific retrotransposons occupy RT identical positions in the rRNA genes of Anopheles gambiae."; RL Mol Cell Biol 12(11), 5102-5110 (1992). XX RN [2] RP 1-6733 RA Butera S.T., Perez V.L., Besansky N.J., Chan W.C., Wu B.Y., RA Nabel G.J. and Folks T.M.; RT "Extrachromosomal human immunodeficiency virus type-1 DNA can RT initiate a spreading infection of HL-60 cells."; RL J Cell Biochem 45(4), 366-373 (1991). XX DR EMBL/GenBank/DDBJ; M93691; Positions 1 6733. XX FH Key Location/Qualifiers FT CDS 1533..2855 FT /product="RT2_1p" FT /translation="MKEQNAKLLEQITGMCQLLQEEKEEAKRREEKLEAQM FT EKLAAAHQRDRDVLNSLLAAKVGGGQPSASPRQPPTPLPRRSSAQPQQQQQ FT QQQRNQHEQEQPRASTSRAVMPPRSEALTAVRGDVVPELTYSEVVRRRYRG FT KATGKPRSQQQPQQQQQQRQLQRQAVGIAQHQQQQQQRQPQRQAVAGSQQQ FT QQERMQQQQQLQRKRKPRPDIIEVSPSEGETWDGIYDKVRKAIRLDAAHSE FT MKGHIKQGRRTHARLLRMELSKTANAPLMLEGVRKIIGDAGVSRLVTEMGE FT LLVVDIDPLATEEDIIAALDAKIGASAGVVSASIWELPDGSKRARIRLPVK FT SARQLEGLKLFLCDCVSKVRAAPPTPPERQRCFRCLEMGHIASNCRSTADR FT QNLCIRCGLTGHKARSCQNEAKCALCGGAHHIGHSECARSAQRCSRP" FT CDS 3280..6477 FT /product="RT2_2p" FT /translation="MVEQLGLIVINQGREYTFVGNGVALPSIVDVAFASPS FT IARPDTWVVSTSYTASDHRYVLYTVGGTPPSPEQLQNHQQTAQQSSQQQQQ FT QQQQQSLQQQQLSQQQQQQRQRQPSSQQGDSSSQRRVRHAGRRWKASQFSP FT SSFLEALFAADFVQRASTQEGMIAAMLKACDETMQRVTRLHQDPHRNIFWW FT SPLLARLRNNCEVARDRMLQTADLEERSIAAAEHRTARAELSRAIRASKRN FT LFQELIEIAEENAFGAGYRVVMSRLRGSRTPSEADRVVLERIISDLFPEHP FT PCDWSQLSNVGSVEGATTTAGIAPVTDDELLLIASQMANHKAPGLDGIPNA FT AVKTAIMLFPESSGFLYQDCLNRASFPAQWKRQRLVLLPKQGKPPGESSSY FT RPLCMLDALGKVLERLILNRLNEHLEEPSSPRLSDRQFGFRRGRSTVSAIQ FT RVVEAGRTAMSFRRTNGRDNRFLLVVALDVKNAFNTANWQSIASRLQAKGV FT PVGLQRMLRSYFEDRVLYFDTSEGPVVRHVTAGVPQGSILGPTLWNIMYDG FT VLDVPLPPDVEVIGYADDLALLVPATTTDEVRARAEEAVDQVQRWMQQHGL FT ELAPAKTEAVLISSKKTPPQVTFRVGDVEVQSSRSIRYLGVQLQDHLKWRD FT HVTKVSEKASRVVAAVTRLMQNHSGPRTAKSRLLAYVAESVLRYAAPVWAE FT ATQVRECRRMLQRVQRKAAIRVARAFRTVRYETATLLAGLVPICHLINEDA FT RVHQQLLAPDRAATREDIRATERQNTIDCWQEEWDADALQQDASRHTRWTH FT 
RVIPSVGDWQSRKHGDMTFHLAQVLSGHGFFRDYLCHNGFTSSPDCQLCVG FT VPETADDAFFECPRFAAVRQELLGEGGPDPVCPDTLQRHLLRDADSWSRIC FT EGAKRITAQLQRAWDEERAALAVNVIERQDEDAAELEAQRAEVRRARNERR FT NANRRAATARRREERRAGLPPTPPASPRTAQRRAALRERQARFRERRRNRR FT LLGMSGQAINNDDDGRERADLAAGPSGMRNRAIDEENEATDGGLNAAEEAA FT VVEAEVASR" XX SQ Sequence 6733 BP; 1602 A; 1879 C; 2003 G; 1249 T; 0 other; gagtgttcat cgagtagccg tcgagtgaag acggatgtaa acacgctccg acagagtttc 60 ggaaaaaagt gtgtaaaacg gcctcggatc gctcggaaag tgttaaaaac gagtttagag 120 cacccgtttt acctttagac ggtgttattt acccgaaaat cggtgatttt ttgtaaaagt 180 gaaaaatcgt gtttttgtgt atacatttgc ccccccccgg tagcaccaaa tttccgggtg 240 tgggaattaa gtgtaaaaat ctgcgatttt agtgccggaa acacttgtat ttaactaaat 300 ggacgcttct gaagcgattc caagtgtttt ggaaacgatc cgtgcagcaa aaagtgaaaa 360 aaacgcgaaa accaaaagtg tcgggggtgc taccgctggg gcatgtgtcc gcggtaaata 420 acgaaaaaaa cagtaaatcg tgaataactc ggtgagtttt ggtcggattc gagtgcggtt 480 ttcaccattg tgcgtgtctt tgtgcaaaca ataagaatca aacaaaaaaa cagtgaaaat 540 cggtgaaaaa atttttgaca tttgtgagtg acagttcagt aatacccacg ggacaaattt 600 ggccggggcg caattcccaa gggcaaaatt tgaattttcg gcccggtggc ggttggcggc 660 ggaaccgagg gtgccgaaaa attgtcaaat ttgcaggggt ggtagagcaa gtaacgccga 720 ctacgggaaa attattttcc cccctccccc ccctcccccc tcccccccag aggtcagaat 780 agagtagggc gcccgataag agtcggataa ggtcgaggtc gcccgaaaat tctcattttt 840 ggtagggtgg tagagcgtga ccccagcagt ccgggaaaat tattttcccc gctcccaccc 900 ccccccccct cccagtgaaa aattttataa gtgtcaaaaa gtcggaaaag tggtccaaaa 960 agggtggaat tccgggtcga gccgtttgga cgagtgaacg tgccgcgaaa atttgggagt 1020 ggtgcggaat aattccccac gtgtgtgacg caccctcctg aattttttcc ccaccccctt 1080 cccccctccc ccccgcccac cagggggcac aaattggacc tttttcgccc tgaacaccgg 1140 agttgaagcg gaaaccgaca tcatcgtgca gcagatcacc agctaggcgg cgcagaatcc 1200 agggtcattc cgaccccctg ggtcattcga cccccgggtc attcgacccc cagggttatt 1260 ttaccccacc catcgcgaac gtcggttatc gcgatggaag cacccgggag atcgacccgc 1320 tcggtgctag ggcgacgtcg gtggactgcc gcaccagctt ggctccttgc tccaagctct 1380 ttgcggccga gccacgtgtt gcgcttccta 
agctaagcgc cacaggcgct agtaagccca 1440 ttgccgagcc aaaagctgca tcagcgacac cggctccgga gctcgagttg ctcagagcta 1500 caatacaacg gctcgaggag cagaactgtg cgatgaagga gcaaaacgca aaactcctgg 1560 agcagataac cggcatgtgc caactgctgc aggaggaaaa ggaggaggca aagcgccgtg 1620 aggagaagct cgaggcgcag atggagaagt tagccgccgc acatcaacgc gatcgagatg 1680 tgctcaactc tctgctggcg gcaaaggttg gcggtggaca accgtcagct agtccacgtc 1740 aacctccaac tccgttgccg cgccgatcct ctgcgcagcc gcagcagcag caacagcagc 1800 agcagcggaa ccagcacgag caggagcagc cccgcgcgtc gacgtcgcgc gctgtcatgc 1860 cgccgcgtag cgaggcattg acagccgtcc gcggagacgt cgtgccggag ttgacctaca 1920 gcgaggtcgt gcggcgtaga tatcgcggca aagctacggg taagccacgc tcccagcagc 1980 agccgcaaca gcagcagcag cagcgtcagc tacagcgaca ggcagtcggt atcgcgcagc 2040 atcaacagca gcagcagcaa cgtcagccac agcgacaggc ggtcgctggc tcgcagcagc 2100 aacagcagga gcgtatgcag cagcagcagc agctacagcg taagcgaaag ccgaggcctg 2160 acatcatcga ggtgtctccc agcgaaggcg aaacctggga tggcatttac gacaaggtgc 2220 gcaaagccat tcgtctggac gcggctcaca gcgaaatgaa agggcatatt aagcagggcc 2280 gccgaactca tgctaggctg ctacgtatgg agctgagcaa gacagcaaac gctccgctta 2340 tgctggaagg cgtccgcaaa atcatcggcg acgcaggcgt cagtcggctt gtcacagaaa 2400 tgggtgagct gctggtagtc gatattgatc cccttgctac ggaggaagat atcattgctg 2460 ccctcgatgc taagattggc gcaagtgctg gagttgtttc tgccagcatt tgggaactac 2520 cggatggttc gaagcgagca cgcatccggc tacctgtgaa gtcggctcgg cagttggaag 2580 gacttaaact gttcctgtgc gactgtgtga gcaaggttcg agcagcccca ccaacgcctc 2640 cagagcgaca gcgctgtttt cgctgtctgg agatgggcca catcgcctcg aactgccgtt 2700 ccaccgcaga tcggcagaat ctgtgcatcc gctgtgggct taccggacac aaagcacgat 2760 cctgccagaa tgaggcaaag tgcgcactgt gcggtggcgc tcaccacata ggccacagcg 2820 aatgtgctcg ttcggcccaa cgatgttccc ggccctgaaa gttctgcagg cgaacctggc 2880 catggccgtg atgcccagaa cctggtgctg caagctgcca gagaggagaa agcagacgtg 2940 ctcattctct ctgatgttct gcgcccacct gaaaacaacg gccggtgggc attcagcagc 3000 tgcaaggcgg tagcggtggt agctgtcggt gagctaccaa tacagcgggt gtggtgcagt 3060 gaagctcagg ggttggttgc agcgcagatc ggcggagtgg 
ttttcatcag ctgctatgct 3120 ccaccaagcc ttaacctcgc agagttcgag cgcttcttgg aagcaataga actcgaaggc 3180 ttctcccacc ctcaagtcgt cgtcgccggc gattttaacg ccagacatga ggagtggggc 3240 agcccgagga cttgcgaccg cggggaagag ctgcacggaa tggtggagca gcttggccta 3300 atcgtgatta atcaaggtcg ggaatacacg tttgttggca acggggtggc tctcccgagt 3360 atagtggatg tggcattcgc gagcccgtcg atcgctcgcc ctgacacctg ggttgttagc 3420 acaagctaca ccgcgtcaga ccaccgctat gttctctaca ccgtgggagg aacaccacca 3480 tcccccgaac aactgcagaa ccatcagcag acagcgcagc agtcgtctca gcagcaacag 3540 cagcagcagc aacagcagtc cttacaacag cagcagttat cacaacagca gcagcagcaa 3600 cgacagcggc aaccatcgtc gcagcagggt gattcctcaa gccagaggcg tgtgcgtcac 3660 gctggccgcc gatggaaagc ctcgcagttt tctccttcct cattcctcga ggcgctgttc 3720 gctgcggact ttgtccaacg cgcatcgacc caggagggta tgatcgcagc catgctgaag 3780 gcctgcgacg agacgatgca acgggttacc cgactgcacc aagacccgca tcggaacatc 3840 ttttggtggt ctcccctcct ggctcgcctg aggaacaact gcgaggttgc ccgtgaccgg 3900 atgctgcaga cggctgactt ggaggagcgc agcatagcag cagctgagca caggacagca 3960 agggcggagc tcagccgagc aattcgagct agcaagcgga acttgttcca ggagctgatc 4020 gaaattgcag aagagaatgc gttcggggca ggataccgtg tcgtcatgtc ccggctccgg 4080 ggcagtcgga cgccttcaga agcggaccgg gtcgtcctcg aacgcattat atccgacctc 4140 ttccctgagc atccgccctg cgactggtcg cagctgagca acgttggaag cgtcgaggga 4200 gcaacaacaa cggcgggaat tgcaccggta acggacgacg agctgctgct catcgcgagc 4260 cagatggcaa atcacaaagc accagggctc gatggcatcc cgaatgcggc agtgaagacg 4320 gccatcatgt tgttcccgga gtcttccggg ttcctgtacc aggattgcct caaccgcgct 4380 tcgtttccgg cgcagtggaa gaggcaacga ctggtgttgc tgccgaagca ggggaagcct 4440 ccgggagaga gcagctcgta ccgacccctc tgcatgctcg acgcactggg gaaggtactt 4500 gagcgcctca tcctgaatcg cctcaatgaa catctcgagg agccgtcttc accccgactg 4560 tcggaccggc agttcggatt tcgtcgaggg cggtcgacag tgagcgccat ccagcgggtt 4620 gttgaggccg gccgtaccgc catgtcgttc cggcgaacga acggccggga taaccgtttc 4680 ttgcttgtgg tggcgttgga cgtgaagaac gccttcaaca cggccaactg gcaatccatt 4740 gccagccgcc ttcaggcaaa aggtgttccc gtcggcctcc aacggatgct 
tcgaagctac 4800 ttcgaggatc gtgtgctgta ctttgacacg agcgaaggcc ccgtcgtacg gcatgtaacc 4860 gccggtgttc cacaggggtc catcctgggc ccaactctgt ggaacatcat gtatgacggg 4920 gtgctggatg tgccgctacc tcccgacgtc gaagtcatcg gatacgcgga cgatctggcg 4980 ctgttggtac ctgctaccac cacggacgag gttcgcgcga gagcagagga ggccgttgac 5040 caggtccaac gttggatgca acagcacggt ctggagctgg ccccagccaa gactgaagcc 5100 gtcctgatct caagcaagaa gactccgccg caggtgacat ttcgcgtcgg tgacgtggaa 5160 gtccagtctt ctaggagcat caggtacttg ggcgtgcagc tccaagatca cctgaaatgg 5220 cgagatcacg tcacgaaagt ctccgaaaag gcgtcgcgcg tggtggcagc tgtaacgcgc 5280 ctcatgcaaa accacagcgg ccccaggacg gccaagtcaa ggctgctggc gtatgtggca 5340 gaatcggtgt tgcggtatgc tgctcccgtc tgggctgagg caactcaggt gcgagagtgc 5400 cgacggatgc tgcaacgagt tcagcgaaaa gcagccatca gggtggcacg ggcattccgt 5460 accgtcaggt atgagacggc caccctactc gctggactcg taccgatatg ccacctcatc 5520 aacgaggatg ctcgggttca ccaacaactc cttgctccag accgtgctgc aacgagggag 5580 gacatccggg cgacggagcg gcagaacacc atcgactgct ggcaggagga atgggatgcc 5640 gacgcactgc aacaggatgc tagccggcac acgcggtgga cgcaccgtgt aattccatcg 5700 gtcggcgact ggcagtctag gaaacatgga gatatgactt tccatctggc acaggttttg 5760 tccggacacg gatttttccg tgactacttg tgccacaatg gattcacgtc gtccccagac 5820 tgccagttgt gcgtcggcgt cccagagacg gcggacgacg ccttcttcga gtgcccacgc 5880 tttgcggcag ttcgacagga gctactcggc gagggaggtc cggaccctgt ctgtccggac 5940 accctccagc ggcacttgtt gcgcgacgcc gatagctgga gtcgtatttg cgaaggcgcg 6000 aagcgaataa cggcccaact gcagcgagcg tgggacgagg agcgggcagc attggctgtt 6060 aacgtcatcg agcgtcagga cgaagacgca gcagagctgg aggcccaacg cgcggaagtg 6120 cggcgggccc gtaacgagcg acgcaacgcc aaccggcgag cagcaacagc acgccggcgg 6180 gaagaacgtc gagctggact tcccccaacc ccaccagctt caccacggac cgctcagcga 6240 cgtgcagcgc ttcgtgagcg tcaagcgcgg ttcagggaga ggagacgaaa ccgtcgtctt 6300 ctcgggatgt ccggccaggc gataaacaat gacgatgacg gccgagaacg cgcggattta 6360 gcagccggac catctggtat gcgcaaccgc gcaatcgacg aggaaaacga ggccacggac 6420 ggtggcctga acgccgcgga agaagccgcc gtggtcgagg cagaagttgc ctcccgctag 
6480 gcgtggtatg cacgtaaaaa cagcgacgca tcgagcgtga aaaagcgccc tatttagggg 6540 aatgagtgaa tttttcggtt gaatttagaa atagaaataa aacatggaag gtgcttttcg 6600 cacggacaaa aggttggcca ttaagccatg aaaccccctg cagggtaagc cctcgcgggt 6660 aaaatgtagg ggagcgggag ggtttaattt tgaaacaaaa taaaaaaccc gtttataaaa 6720 aaaaaaaaaa aaa 6733 // ID Mariner-N20_AG repbase; DNA; ANG; 479 BP. XX AC . XX DT 04-SEP-2010 (Rel. 15.09, Created) DT 04-SEP-2010 (Rel. 15.09, Last updated, Version 2) XX DE Nonautonomous Mariner DNA transposon - a consensus sequence. XX KW Mariner/Tc1; DNA transposon; Transposable Element; Nonautonomous; KW Mariner-N20_AG. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-479 RA Jurka J.; RT "DNA transposons from African malaria mosquito."; RL Repbase Reports 10(9), 1188-1188 (2010). XX DR [1] (Consensus) XX CC TA TSD. ~96% identical to consensus. XX SQ Sequence 479 BP; 156 A; 94 C; 81 G; 147 T; 1 other; tacagggttt accagcataa taaacaactg tccacttgtt tataacaata gtcgttttga 60 atagacaccg ctgtctacca cttgctcaca cgtcagatgg tgtaagctac tctgtccatt 120 tcaataacac aattttcgcc ttcgatagca actgtcagat tcggcacagt ttgttttgtt 180 ttgacaaatt ttgcaaaaac aaaacatttt caaacacgtt tcttttattc aagmaatatg 240 cagtataaac catacatttc aataaatcaa aacattaatt ttgttaatta tacaacaacc 300 acaagaaacc gtgccgaatc cgaccgttgt gcaatcgaag gcgaaaattg tgttattgaa 360 ttggacagag tagcttacac catctgacgt gtgagcaagt ggtagacagc ggtgtctatt 420 caaaacgact attgttataa acaagtggac agttgtttat tatgctggta aaccctgta 479 // ID Clu-12_AG repbase; DNA; ANG; 409 BP. XX AC . XX DT 03-SEP-2010 (Rel. 15.09, Created) DT 03-SEP-2010 (Rel. 15.09, Last updated, Version -1) XX DE Putative non-autonomous DNA transposon: consensus. XX KW DNA transposon; Transposable Element; nonautonomous; Clu-12_AG. 
XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-409 RA Fernandez Medina R.D., Struchiner C.J. and Ribeiro J.M.C.; RT "Transposable elements from Anopheles gambiae."; RL Repbase Reports 10(9), 1436-1436 (2010). XX DR [1] (Consensus) XX CC TAA TSD. XX SQ Sequence 409 BP; 123 A; 66 C; 86 G; 134 T; 0 other; gcgaaaggcc acgggagtga caaaatgctg acaaattttt caaaatgtga gcttttgtca 60 cttttttgag cttttgtcaa aattttgtga cctggagcag ccaggaaaaa aagttgacaa 120 aagttgacaa aatttgacgt tcgtgaaaaa gcgtaaacaa acagctattt ggcgcagttg 180 tgacccattg ttatgataat aatgctaaaa tgcatgattc taattgattt atgtacctag 240 ttagtgcttc ttaaacaaaa tatcgatgat attcagatgt tggagctttt tgtcgctctt 300 gtgtatgttt ttggtgcggc cacgagaaaa gctctttttt gcagttgaca agcaattttg 360 acaaagttga aaaatttttc aacatttttt cagctccgtg gccacgcgc 409 // ID GYPSY21-I_AG repbase; DNA; ANG; 4383 BP. XX AC . XX DT 05-FEB-2004 (Rel. 9.01, Created) DT 05-FEB-2004 (Rel. 9.01, Last updated, Version 1) XX DE GYPSY21-I_AG is an internal portion of retrotransposon GYPSY21_AG DE - a consensus sequence. XX KW Gypsy; LTR Retrotransposon; Transposable Element; 5-bp TSD; KW AP protease; GYPSY21-I_AG; GYPSY21-LTR_AG; GYPSY21_AG; KW Gypsy clade; RNase-H; gag; integrase; mag lineage; KW reverse transcriptase. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-4383 RA Tubio M.J., Costas C.J. and Naveira F.H.; RT "GYPSY21_AG, a member of the mag lineage of the Ty3/gypsy group RT of LTR retrotransposons in Anopheles gambiae."; RL Repbase Reports 4(1), 7-7 (2004). 
XX DR [1] (Consensus) XX CC GYPSY21_AG is a family of Gypsy-like LTR retrotransposons CC that, according to the amino acid sequence of its reverse CC transcriptase, is CC phylogenetically grouped with representatives of the mag CC lineage of other organisms. CC GYPSY18_AG, GYPSY19_AG, GYPSY20_AG, GYPSY22_AG, GYPSY23_AG, CC GYPSY24_AG, CC GYPSY25_AG, GYPSY26_AG, GYPSY27_AG and GYPSY28_AG are other CC members of this same lineage in Anopheles gambiae. CC The GYPSY21-I_AG consensus was reconstructed after multiple CC alignment of 4 copies. CC The consensus encodes the 1434-aa GYPSY21_AGp gag-pol like CC protein CC (pos. 68-4369). CC The sequence of the LTRs flanking GYPSY21-I is deposited as CC GYPSY21-LTR_AG. XX FH Key Location/Qualifiers FT CDS 68..4369 FT /product="GYPSY21_AGp" FT /translation="MSQEDLRNAIVQLTNLVAKQQQQIENLANRTFSTAAS FT GSEKTIESLANGIQDFLYDPDAGVFFDAWYARYEDVFIQDGHSLDDAARVR FT LLLRKLSTPLHDKYVNTILPKHPRDFSLDETVTKLKKLFGRQKSVFHSRYQ FT CLQYAKSDADDFTSYAAMVNKHCEAFQLSKLTSDQFKALRFVCGLQSPRDA FT DIRARLISKLEADETAVVEQGEAASKVTLENLVEECHRVANLKHDSQMVEN FT KEACSVNAISRNQNHSSSKKTNKVPKTPCWKCGELHYVRECPFASHMCTRC FT KQQGHKEGYCSSSKPAASKPFKQWKPKESMKTNGIYTVRNVGRKRKFVSVE FT LNGVAVKLQHDSASDITIISNETWASIGRPPTQPTDESAVTASGSDLNLLA FT EFQADITINNVTKTGRIFISDSADLNVLGIDTMDLFDLWSVPINSLVNVVH FT QNSDQYVDRLKHQFPEVFRSTLGRCTKAQVKLYLKPDARPCYCPKRPVAYA FT ALPKVDAELERLETNGIISPVQFSDWAAPIVVVRKSDNVSVRVCGDYSTGL FT NNALECDRHPLPHPDDLFAELAGARYFTHLDLSDAYLQVEVEVESRKLLTV FT NTHRGLFQYNRLPPGVKSAPGAFQRIIDSMVAGISGVKPYLDDIFIAGRTK FT EEHDRILYAVLERVREYGFHLRLEKCRFALPQIGFLGLIVDKDGVRPDPSK FT TEAIAKMPPPKDVKQLRSYLGAINYYGRFVPQMKHLRAPLDDLLKKDARWN FT WTKECQKSFEQFKTILLSDLLLTHYDPSKEIIVAADASKYGLGAVVMHRFP FT TGEVKAIAHASRSLTAAEMNYGQVEKEALALIFAVTRFHKMLYGRHFTLET FT DHQPLLKVFGSKKGIPVYTANRLQRWALTLLLYDFEIKHISTMNFGYADFL FT SRLMSSQRRPDEDYVIAAVYVESEAKAILEDSINNLPVTHQMIVAETRKDA FT VLQQVISYINEGWPASVKLITDPDVKKFFVRREGLQVVDNCVMFGDRIVVP FT SKFRKRIVRQLHRGHPGMERMKSLARSYIYWPNVDDDVAQFVRQCDACAEA FT AKAPTKATLESWPLPDRPWQRVHVDFAGPIDGHHYFVIVDAYSKWPEIFRT 
FT RSITTTTTLDLLRETFSRYGNPDTLVSDNGTQFTSGQFQQFCSENGINHIR FT TAPYHPQSNGQAERFVDSLKRGLKKLGKGESPTLQHLQTFLSVYRSTPNRN FT TPKGTSPAEAFLKRTMRTTLDLLRKPYPPTAAVNHKQNEQFNKRHGAVKRS FT FVENDLVYVEQHVHNKKSWVPGRVIEPKGSVNYVVSLDLHGRQKLVRSHVN FT QMRSRYGSETPGQEQQQLPWEVLLEEVGTTAAVKPAGNHVAINLNSTVPAD FT VPVFENPPTDFNPPTTNEQPSASASESIASESVIAESIASESVASESIGCL FT LRRSVRTPRIPRWLSSYDLY" XX SQ Sequence 4383 BP; 1176 A; 1130 C; 1039 G; 1038 T; 0 other; tttggcgacg aggaaaagtt cagaagcatc tcgaaggtct ttttagcgaa atcatcaaat 60 aacgaaaatg tcccaagagg atctgcgaaa cgccatcgtc caactcacga atctcgtcgc 120 gaagcagcag cagcagatcg aaaatttagc aaatcgaact ttttcgacgg cggctagcgg 180 aagcgagaag acaatcgagt cgttggcaaa tggaatccaa gatttcctgt acgatccgga 240 cgcaggagtt ttctttgatg catggtacgc tagatacgaa gacgtcttca ttcaggatgg 300 tcattctctc gacgatgccg cccgtgtgag attgcttcta cgcaaattaa gcacccctct 360 gcacgataag tacgtgaaca ctattctgcc aaagcatcct cgcgatttct ctctggatga 420 gaccgttaca aaactgaaga agttgtttgg tcgccagaag tcggtttttc actcccgtta 480 ccagtgcttg cagtacgcca agagtgacgc ggatgatttt acttcgtatg ctgccatggt 540 caacaaacac tgcgaagcgt ttcaactttc caagctcacg tcggatcaat tcaaagctct 600 ccgatttgtc tgtggactcc aatcgccacg ggatgcggac atccgagcta gattgatttc 660 gaaactggaa gctgatgaaa ctgcagttgt tgaacaagga gaagctgcaa gtaaggtgac 720 tttggaaaac ctggtggaag aatgtcatcg tgtggccaac cttaagcatg actcgcagat 780 ggtggaaaac aaggaggctt gctccgtcaa cgccatttcg cgcaaccaga atcattcttc 840 ttcgaagaag accaacaaag tgcccaagac accctgctgg aaatgtgggg aactgcacta 900 cgttcgtgaa tgtccgtttg catcgcacat gtgtactaga tgcaagcagc aaggacataa 960 agaaggctac tgttcaagca gtaagcccgc tgcatccaag cctttcaagc aatggaagcc 1020 caaagagagc atgaagacga atggcattta cactgttcgc aacgtcggaa gaaaacgcaa 1080 attcgtctcg gtcgagctca acggggtagc agtcaagctt cagcacgact cggcgtccga 1140 catcaccatc atttcgaacg aaacatgggc tagcatcgga cgaccaccca ctcaaccgac 1200 cgatgaatct gctgtcacag cgtctggtag tgatttgaat ctcctcgcag agtttcaagc 1260 cgacatcact attaacaacg tgaccaagac agggcgcatt ttcatctctg atagcgccga 1320 tctcaacgtt ttgggaatcg atactatgga 
tctgtttgat ctgtggtccg taccgattaa 1380 cagcttggtc aacgtcgtac atcaaaactc tgaccaatat gttgatcgcc tcaagcacca 1440 gtttccggag gtttttcgaa gcacgctggg tagatgcaca aaagcgcaag tgaagttata 1500 cttgaagcct gatgcccgtc catgctactg tccgaagcga ccagtggcgt atgcggctct 1560 tcccaaagta gatgcggaac tcgaaaggct cgaaaccaac ggtataattt ctccagttca 1620 attctcggac tgggcagcac caatagtcgt cgtacggaag tcggacaatg tttcggtccg 1680 tgtgtgcggt gattattcta cggggctgaa caacgcgctg gaatgtgacc gtcatcctct 1740 acctcatcct gacgatcttt tcgcggagct ggctggggca cgctatttca cacaccttga 1800 cttatcagac gcctatctac aagttgaggt cgaggtggaa tcgcgcaagc tacttaccgt 1860 gaacacacat cgcggccttt tccagtacaa ccgacttcct cccggagtca agtcggcacc 1920 tggtgctttc caacgcatta ttgacagcat ggtggctggc atctctggag tgaaacctta 1980 ccttgatgat atcttcattg ctggccgcac caaagaggag cacgaccgta tcctctatgc 2040 tgttctcgaa cgcgtccgtg agtatggttt ccatttacgc cttgaaaaat gtcgtttcgc 2100 gcttccccag atcggtttcc ttggattgat cgtcgacaag gacggtgttc ggcctgaccc 2160 gtccaaaaca gaagccattg ccaagatgcc acccccgaag gacgtgaagc aacttcgctc 2220 ctacctcgga gctatcaact attatggtcg attcgttcca cagatgaagc acctcagggc 2280 tcccctggat gacctgctga aaaaggatgc tcgctggaac tggacaaaag agtgtcagaa 2340 atcttttgag cagttcaaga ccattttact ctccgacctg ctgcttactc actatgaccc 2400 atctaaggag atcatcgtcg cggcagacgc atcgaagtat ggtctaggcg ctgtcgtcat 2460 gcatcgtttc cccaccggtg aggtgaaggc aatcgcacat gcttctcgct cactgacggc 2520 agcggaaatg aactacggcc aagtggaaaa ggaagcattg gcgttgatat tcgctgtcac 2580 ccgtttccat aagatgctgt acggacggca tttcactctc gaaaccgatc atcaaccgct 2640 actcaaagtt ttcggctcga aaaaaggaat accagtttac acagccaatc gtctgcaacg 2700 atgggctttg acacttctcc tgtatgactt cgagatcaag cacatctcga cgatgaattt 2760 cggctacgca gacttcctgt cccggctgat gtcatcacag cgcagaccag atgaggatta 2820 cgtcatagct gccgtctatg tcgaatccga agcaaaggcg attctcgaag actccatcaa 2880 caatctgcca gtcacacatc agatgattgt ggctgagaca cgtaaggatg ctgttctgca 2940 gcaagtgatt agttacatca atgaaggatg gccagcaagc gtgaagctaa tcaccgatcc 3000 tgatgtgaag aagttcttcg tcagacgtga gggacttcaa 
gtcgtcgata actgcgtcat 3060 gttcggcgat cgaatcgtcg ttccatcaaa attccggaag cgaatcgtcc gccagctaca 3120 tcgtgggcac ccaggaatgg agcggatgaa gtctctggct cgcagctaca tttactggcc 3180 gaatgttgac gacgatgtgg cgcaattcgt tcgtcagtgc gatgcatgtg ctgaagcagc 3240 gaaggctccg acgaaagcaa ccctggaatc atggcctctt ccggaccggc cgtggcaacg 3300 agtacacgtc gatttcgctg gcccaatcga cggacatcac tatttcgtga ttgtagatgc 3360 ctactctaag tggcccgaaa tttttcgcac tagatccatc accacgacaa caactttgga 3420 cctgcttcgt gaaacatttt cccgttacgg caatccagac acgctagtct cagataacgg 3480 aacgcagttt acgagcggac agtttcaaca gttttgcagt gagaatggca tcaaccatat 3540 tcgtactgct ccataccacc cgcagtcgaa tggccaagct gaacgtttcg tggattccct 3600 caaacgcggc cttaagaagt taggtaaggg ggaatcacca acattacagc atctacagac 3660 gtttctttca gtgtaccgat caacacccaa ccggaataca cctaagggaa cgtctccagc 3720 cgaagcgttt ttaaaaagaa cgatgcgcac tacgttggat ctgttgagga aaccatatcc 3780 tccgactgca gccgttaacc acaaacaaaa tgaacaattc aacaaacggc atggagcggt 3840 caagcgatca tttgtagaga atgacttggt gtatgttgaa caacacgtac acaacaaaaa 3900 gtcgtgggtt cctggtcgag tcattgaacc aaaaggatcc gttaactatg tcgtgtcgct 3960 tgacttgcat ggaagacaga agctggttag atcgcatgtc aaccaaatgc gttctcgcta 4020 cggttccgaa acccctgggc aagaacaaca acaacttcca tgggaagttt tattagaaga 4080 agttggtact actgctgcag tgaaaccagc tggcaatcat gttgctatta atttgaactc 4140 tacggtgcca gctgatgttc ctgtttttga aaatccgcca accgatttca acccgccaac 4200 aaccaatgaa caaccatctg catcagcatc ggaatccatc gcatccgaat ccgtcatagc 4260 cgaatctatc gcatctgaat ccgtcgcatc cgaatccatc ggatgcttgc tacgtcgttc 4320 cgtacgaaca ccgagaatcc ctcgatggct ctcatcatac gacctttatt aaaaaggggg 4380 aga 4383 // ID MSAT1_AG repbase; DNA; ANG; 258 BP. XX AC . XX DT 28-FEB-2009 (Rel. 14.02, Created) DT 01-MAR-2009 (Rel. 14.02, Last updated, Version 1) XX DE Mini-satellite type DNA - a consensus sequence. XX KW MSAT; Satellite; Simple Repeat; Nonautonomous; MSAT1_AG. 
XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-258 RA Jurka J.; RT "Minisatellites from African malaria mosquito."; RL Repbase Reports 9(2), 637-637 (2009). XX DR [1] (Consensus) XX SQ Sequence 258 BP; 58 A; 52 C; 52 G; 96 T; 0 other; caaaaacttt tgcgactttt gcgacttttg cgattgtcgc aaaaactttt gcgacttttg 60 cgattgtcgc aaaaactttt gcgacttttg cgattgtcgc aaaaactttt gcgacttttg 120 cgattgtcgc aaaaactttt gcgacttttg cgattgtcgc aaaaactttt gcgacttttg 180 cgacttttgc gattgtcgca aaaacttttg cgacttttgc gattgtcgca aaaacttttg 240 cgacttttgc gattgtcg 258 // ID CR1-6_AG repbase; DNA; ANG; 4293 BP. XX AC . XX DT 03-APR-2003 (Rel. 8.03, Created) DT 19-MAY-2005 (Rel. 10.06, Last updated, Version 2) XX DE CR1-6_AG is a CR1-like non-LTR retrotransposon - a consensus DE sequence. XX KW CR1; Non-LTR Retrotransposon; Transposable Element; KW AP endonuclease; CR1 clade; CR1-6_AG; DNA/RNA-binding; PHD finger; KW reverse transcriptase. XX NM CR1-6_AG. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-4293 RA Kapitonov V.V. and Jurka J.; RT "CR1-6_AG, a family of CR1-like non-LTR retrotransposons from RT African malaria mosquito."; RL Repbase Reports 3(3), 57-57 (2003). XX DR [1] (Consensus) XX CC CR1-6_AG is a young family of CR1-like non-LTR retrotransposons. CC The CR1-6_AG consensus sequence was reconstructed based on CC multiple alignment of ~10 copies identified in the CC sequenced portion of the genome. Given the ~2% divergence CC of these copies from the consensus sequence, transposition of CC CR1-6_AG occurred less than 1 million years ago. CC The 3' terminus of CR1-6_AG is composed of the ATAAAC CC microsatellite. 
CC CR1-6_AG encodes two protein sequences: a 358-aa CR1-6_AG-ORF1p CC (positions 225-1299) and 998-aa CR1-6_AG-ORF2p (positions CC 1300-4293). CR1-6_AG_ORF1p is a DNA/RNA-binding protein composed CC of the PHD domain (aa positions 4-57). CR1-6_AG-ORF2p is composed CC of CC the AP endonuclease (aa positions 50-230) and reverse CC transcriptase CC (aa positions 550-780) domains. XX FH Key Location/Qualifiers FT CDS 1300..4293 FT /product="CR1-6_AG-ORF2p" FT /translation="FAALPTPTYNRPHLTVSFTNRRTSVKSAQTPTCYVPA FT ANALRTARSTASVYYQNVRGLRTKVDEFRLSVLESNFDVIVLTETWLDPSL FT PSALLFDDSFRVYRCDRSVDNSTCSRGGGVLIACSQSLTSREHTTVHPSLE FT LVCVVIQLGNSRLFIIAAYLPPRLAANAATLREIENCIRSLCSTMHPGDGL FT LLLGDFNQPLVSWSAAQHDPDLPFLHYEPRTRSALSALFMDEMHHSGLFQI FT NGHLNTSGRVLDLVFANNAVASVCLPLELCLTPLLAIDTYHPALELAIPLP FT REESAVPALTSRLDYARTDFNRLLPMIASFANVFDCSHYATLDLAVKDFER FT FMLQALNECTPVKRPKRGPPWGDRTLRRLKTAKAAAYRDYLLRRCPAALRN FT YNTAHSLYRRYNRFRYLGHVRRTVLRCRGNSRVLWNFANNRRKSSGFPSSV FT SYNGRNGNNPSAVCDIFASRFAATFLPAVTDERQIADALSNVPVDAMAPNL FT PIIDEYSVSKAIDRLKSSFAPGPDGIPASTLKRCGTTIAPILASIFRDSLR FT SGIYPACWKTSWLVPVHKKGDKSNACNYRGITSLCACAKVFELLVYEPLLA FT AASNYISTAQHGFVPQRSTTTNLVEFVSLCHRTIDAGSQIDAVYTDIKAAF FT DSVPHALLLAKLETLGLPVQLLAWMRSYLTGRTYCVKMGPHTSRRIDASSG FT VPQGSNLGPLLFVIFLNDVTRLLPPDGHLLYADDAKLFLPIRDRSDQLRLQ FT ATLSAFQSWCSLNGLELCVEKCVVVTFARKRCPLVYDYALNGSTIGRKSCV FT TDLGVLLDEKLSFHDQLEHVVTKGNQLIGLLKQIARDITDPICIKTLYCAL FT VRPVLEYASVVWWPTAARPLARLESIQRKFTRFALRSWSVQLDYEGRCALL FT GIETLKQRNCNAQRLFVAGLLDNRIDSPALLSRLNMYVPPRSLRARSLLDV FT EERRTRFGSSDPFIRMCREFNVICDRHQPDMSRTALLNSIRVV" FT CDS 225..1358 FT /product="CR1-6_AG-ORF1p" FT /translation="MAAICFACAVSLDAADCIVGCAYCEATFHRGCCRLPS FT ELIDAVLTHIDLHWSCTGCTNILKNPRCRSVKEIGAQVGFQAALTSTVAAV FT GKLIEPIIAEVRSGFTLLQNAPIPQLCNTDPRPVAGRKRRRIIEDSMSPDV FT NKNVNIRQNNMFAASSPSAYTNTTVGIPPSSTLPEELMGTDSLSSPLRAAF FT PQPATDRIWIRLSRLSTAVTVEQVVASVKRRLATDDVLAYCLLRKGVSVDS FT VNWLSFKVRVPAALRDAALAPSSWPVGIGVREFVQSRQREHGHSSSPITIK FT HRSLTRTPVVIDRRSMPRTPTSTVYHAPAHASTSQAQTLTSPQLGEHTLND FT 
TTHGPNSTLIDGPLLIRRTSNTNLQQTTLDRFFHE" XX SQ Sequence 4293 BP; 915 A; 1227 C; 1026 G; 1125 T; 0 other; tctgacgtct aactgctcgg tcgcttatcg ctgtgctccg cgtttttcca aaaatattta 60 agtgtatttt ctaacgtacc cagttaatac tcgttcaact gactaccaaa ctgcttcctg 120 ttactgttgc gcgttagtgt gttatttcat tgacccacga ctgtgtatta acgtttttaa 180 tccgggttaa cggtggtgtg atcaaaaccg cacataattt cgcgatggcg gctatctgtt 240 tcgcgtgtgc tgtgtcactg gatgctgccg actgtatcgt cggctgtgcg tactgtgagg 300 ctacttttca ccgcggttgt tgcaggctgc cttccgagct gattgatgcg gtcctgactc 360 acatcgatct gcactggagc tgcactgggt gcaccaatat cctgaaaaat ccgcgctgcc 420 gatcagtcaa agagatcggg gctcaggtcg gttttcaagc tgctctcacc tcaactgttg 480 cggctgtggg gaagcttatt gaaccgatta tcgccgaggt gcgcagtgga tttactctac 540 tgcaaaatgc acccatacct cagctttgca ataccgatcc tcggcccgtt gcgggtagaa 600 agcggcggcg tatcatcgag gattctatgt cccctgatgt caacaaaaac gtaaacattc 660 gtcaaaataa catgtttgca gcgtcatcgc caagcgcgta cactaacact acagtcggca 720 tcccaccctc gtctacgcta ccggaagaac tcatgggaac cgattcgcta tcgtcaccgc 780 ttcgagcagc gtttccccag ccggccacag acagaatatg gatccgacta tctcgcttgt 840 ccactgccgt caccgtggag caagtggtcg cttctgtgaa acgccgttta gccaccgatg 900 acgtcctagc atactgcttg ctgaggaagg gagtcagtgt tgacagcgta aactggcttt 960 ctttcaaagt gagagtaccg gccgcccttc gtgacgcagc actcgcccca tcgtcctggc 1020 ctgtcggtat tggtgtacgt gaatttgtac aatcccgtca acgagagcat ggacactcat 1080 cgtcaccaat caccatcaaa caccgttctc tcacacgcac acctgttgtc atcgatcgcc 1140 gatcgatgcc tcgcacacca acatccactg tctatcacgc accggcacac gcatcaactt 1200 cacaggctca aacactaaca tcaccacaat tgggagaaca cacgctgaac gacactactc 1260 atggtcctaa ttcaacactc attgacggcc cgcttttaat tcgccgcact tccaacacca 1320 acttacaaca gaccacactt gaccgtttct ttcacgaata gacgaacctc tgttaagtct 1380 gcacaaacac ctacctgtta tgttcctgct gctaatgcgc ttcgcactgc ccgttccact 1440 gcctctgttt actatcaaaa cgtgcgggga ctgcgcacga aggtcgatga gtttcgcctg 1500 tcggtgttgg aatccaattt cgatgtaata gtgcttacgg aaacctggct cgatcctagt 1560 ctaccttcgg ctttgctgtt tgacgatagc ttccgagtct accgatgcga tcgtagtgtt 
1620 gacaacagta catgttcccg cggtggaggt gtgttgattg cgtgctctca gtctctgacg 1680 tcacgggagc acacaacggt gcatccatcg cttgagctag tgtgcgttgt aatacaacta 1740 ggcaattccc gactattcat cattgctgca tacctcccgc ctagacttgc cgcaaatgct 1800 gccacgctcc gtgaaatcga aaattgcatt cgctcattat gctcaactat gcatcccgga 1860 gacggtttac tcctgttagg agacttcaac caacctctcg tctcctggtc agccgctcag 1920 catgatccgg atttgccatt tctgcattat gagccacgta cgcgatcggc tctctccgct 1980 ctgtttatgg acgagatgca tcatagcgga ctcttccaga tcaatggtca tcttaacacc 2040 agcggacgcg tcttagacct ggtgtttgca aataatgctg ttgcttctgt ttgccttccg 2100 cttgaacttt gcctcacccc gctgttagca attgacacct atcacccggc gcttgagttg 2160 gccatcccgc taccacgaga ggaatcagcc gttcctgcgc taacatcacg ccttgactac 2220 gcgcggacgg actttaacag gctcctgccg atgatcgcct cgttcgcgaa tgtcttcgat 2280 tgttcccatt acgccactct tgacctcgct gtgaaagatt ttgagcggtt catgttgcaa 2340 gccctgaatg aatgcacgcc ggtgaaacga cctaaacgag gccccccttg gggtgacaga 2400 acgctgcgca ggctcaaaac ggctaaagct gccgcgtatc gcgattattt gttgcgtagg 2460 tgtcccgctg cattgcgcaa ctataacact gcgcactctc tctatcggcg ctataacagg 2520 ttccgctacc tggggcacgt acggcgtacg gttttgcgct gccgcggcaa ttcacgtgta 2580 ctgtggaact tcgcaaacaa tcgtcggaaa tcctctggtt ttcctagctc cgtcagttac 2640 aacgggcgca acggcaacaa tccgtctgct gtatgtgata tttttgcctc gcgctttgcc 2700 gccaccttcc ttccagctgt aactgatgag aggcagatag ccgacgcatt atcaaacgta 2760 ccggtggatg ctatggcccc aaacttgccg atcatcgatg agtattcagt ctcgaaagcg 2820 attgacaggc tgaaatcttc gtttgctcca gggcctgacg ggatcccggc ttccacgctt 2880 aagcgttgtg gcaccaccat cgcgcctatc ctggcctcga tttttcgcga ctcgctgcgt 2940 tcgggcatct atcctgcctg ctggaaaact tcgtggcttg ttccagtgca caaaaagggg 3000 gacaaatcaa atgcatgtaa ttaccgtggc attacatcgc tttgtgcctg cgctaaggtc 3060 ttcgagctcc tggtgtacga acctctcctc gcagctgcct cgaactacat tagcacagct 3120 caacatggat ttgtccctca gcgttcaacc accaccaacc tggttgagtt cgttagcctc 3180 tgccacagga ccatcgatgc cggctcgcag atcgacgcag tctacacgga catcaaggct 3240 gccttcgata gcgttccgca cgccttgctg ctcgcaaagc tcgaaacgct tggtctgcct 3300 
gtgcagctgc tggcctggat gcgctcctat cttactggtc gcacatactg cgtgaagatg 3360 ggaccccata cgtcgcgccg tatcgatgct tcttctgggg tgccgcaggg gagtaatctt 3420 ggaccgctgc tgtttgtcat tttcctgaat gacgtaacac ggttgctccc tcctgacggc 3480 cacctactgt atgcagacga cgcgaagctg ttcctgccta tccgcgaccg gtcagaccaa 3540 ctccgccttc aagccactct aagtgccttc cagtcatggt gctctttgaa tggtcttgaa 3600 ttgtgcgtcg agaagtgtgt tgtcgttacg tttgcgagaa agcggtgccc cttagtgtat 3660 gactatgcgt taaatggatc taccattggt cgcaaaagct gtgtcacgga tctaggagtg 3720 ctccttgacg aaaagttgag cttccacgac cagctagagc acgttgtcac taagggtaac 3780 caattgatcg gcttactaaa acaaatagca cgagacatca ctgacccgat ttgcatcaag 3840 acgctatact gtgctttggt gcgaccagtg ttagaatacg cttctgtagt atggtggcct 3900 acagctgctc gtcccctagc tcgtttagag tcgatccagc gcaaattcac gcggttcgct 3960 ttgcgctcct ggagtgtcca acttgactat gagggacgct gtgcgttgct tggcatcgag 4020 acgctgaagc agcggaactg caacgctcag aggctgtttg tcgcgggact tcttgacaat 4080 cggatcgact cgcccgcgct tctttcgagg ctcaacatgt atgtcccgcc gagatcgctc 4140 cgagctagat cgctacttga cgtggaggaa cgccgcactc gctttggctc ctctgatccg 4200 tttattcgta tgtgccgtga gtttaatgtc atttgtgatc gtcatcaacc tgacatgtcg 4260 cgcaccgcat tgttgaatag tattcgtgtc gtg 4293 // ID Clu-137_AG repbase; DNA; ANG; 1175 BP. XX AC . XX DT 04-SEP-2010 (Rel. 15.09, Created) DT 04-SEP-2010 (Rel. 15.09, Last updated, Version 1) XX DE Putative non-autonomous DNA transposon: consensus. XX KW DNA transposon; Transposable Element; nonautonomous; Clu-137_AG. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-1175 RA Fernandez Medina R.D., Struchiner C.J. and Ribeiro J.M.C.; RT "Transposable elements from Anopheles gambiae."; RL Repbase Reports 10(9), 1451-1451 (2010). XX DR [1] (Consensus) XX CC 2bp TSD. >96% identical to consensus. 
XX SQ Sequence 1175 BP; 379 A; 187 C; 191 G; 414 T; 4 other; cccaagtagc cttaagaaac tctatttgat tattaacagt tttttaaagc cgttaaaaat 60 caaatagagt ttcttaagag aatatgacat ttgccagcgt tacagcataa cgacgtattc 120 cgttaccctc gttttttttt tattcgggtc aatataacac acagtgtttt tcaaatttta 180 tctacaccat ttccaattta ttcaatattc gaaaaatcat tttccattat tatttatata 240 ataatgcgta tttattttca caccaaaata tttgttttgc ttaactgtac gaaaactatg 300 tgttttcttt tattcctcta cttgaatctg cggtnataat aaaaaaagaa tttaaaaata 360 aaaaaaaatt catagttttt aagtaataca tgtgaagagt ttgataaagg gtatgcaata 420 ttngcaaatg agaaaaatct tcataaaaat atttattttc ttgggcaaaa taatatgctc 480 ttgatgacga taatttttga gacatactgt acatggtcac gtacatggcg cctccatttc 540 ttatgtcaaa aagttcgtca agcttcgtgt gtgtagcttg tgtgtagatg gcaacaaaac 600 caagcgtctg cgacaatntc tatcagctat tgtttaattt gtgtatcatc gtgcttttac 660 aacaggtcag tgtatcgata ttaccattct cttcacatga gctacataag ttttatattt 720 cgcagcattt tcgattcgga tagcgtttgg acatcatgtt ttccgtacac ggcagcagct 780 gacctccacg atgaagattg acgaaatccg cgtaatgaaa gaaaattgtg tgatgcggtt 840 accnagtgaa attgtgtttt tattcaataa acgaataagt gttttattta ttaatttatt 900 tcacgaaaat ataagcacat taagcttaac gaagaccttg catgtcccta acgaaagaga 960 aagaaaaata tagccatctc tgtcgatttt ttcgttaatt ttgatctttt tcgttaaaaa 1020 taccggttaa attttaagat ttaacgaaac attgacgaag caaggaaagc gttacgcagt 1080 gttttaagcc ctggaattac tgctgtatgg aaagctaaaa gtaatacttt taggcatact 1140 tttaacggtt tcttaacagg attgtgctac ttggg 1175 // ID R6Ag1 repbase; DNA; ANG; 5075 BP. XX AC AB090817; XX DT 09-JUL-2004 (Rel. 9.06, Created) DT 24-SEP-2010 (Rel. 15.1, Last updated, Version 2) XX DE Anopheles gambiae retrotransposon R6Ag1 DNA, complete sequence. XX KW R1; Non-LTR Retrotransposon; Transposable Element; KW gag-like domain; reverse transciptase; R6Ag1. XX NM R6Ag1. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RA Kojima K.K. 
and Fujiwara H.; RT "Evolution of target specificity in R1 clade non-LTR RT retrotransposons."; RL Mol. Biol. Evol 20(3), 351-361 (2003). XX DR Genbank; AB090817; Positions 1 5075. XX FH Key Location/Qualifiers FT CDS 741..1772 FT /product="R6Ag1_1p" FT /translation="MMDFLDEVMSEFRRYRAVNRMQQVLDRQTQTSFDGND FT GFGPQTRKGRRPVADDQQPGPSGLQRLQQQQQQPSRLTPVREAVENIPSPR FT NGPNINEGSINKRKKKKSKKKQNKPRKRPEALLISDCTSEELAKLLKEMKQ FT SDALKSVGETISKVRRAQNGGMLLELKQGSSASAIAPKVKEAVKGKASVRT FT LAPSKMIEIMHLDEITTPEEVAESVKAQLNIEIEIDRIKMKKGRAAGTQWA FT RINVSLPDFQSFLNLGKLKVGWSICHIREVMEEQKCYKCWKVGHTSYHCRE FT PDRSNLCWKCGLSGHKKQACTNSVKCLDCGTRSQNLHATGSYMCPRRRTIR FT S" FT CDS 1775..4801 FT /product="R6Ag1_2p" FT /note="endonuclease and reverse transcriptase." FT /translation="MVRLLQHNQNHSYAAFQLMWQTIREESADIVLIADPY FT LATTNVKVLRNDDNTAAVVVNADLPVKVVSKALKGLMVVDIGDMRVVSVYA FT PPRFSMEEFQIMLDNTVMAVTGIHKFVIGGDFNAWSASWNNQLGERGETQK FT RRGELVLSTFAQIDAILLNDGSTPTYVGPGRTSVVDLTFASRTVARSFKWE FT VLSSYMNSDHRAIRIDLETQSVRNLSRPITGWSIKYFSKDIFEVMMQAAVD FT TEVTTSEDLMRILVTACNATMTKRKRYTPNKSAFWWTLEIEALRKECKHRD FT RLAQRAFHTDLYSTFRDEFKVARNALKRLIKHTRQRKWKEFLGTANNASFG FT IVYHTFKKVAEGSIGPRTMTLDEFREVVSELFPTHPNTVWPDYRIDQPREF FT ERITNDEILAVARRLPNKKAPGPDGIPNEALKVGMLTATDAFCRVYQGCLE FT NAKFPDEWKRQRLVLIPKPNKPPGEPGSVRPICLLDGAGKGLERIIVQRLN FT AHIEEVNGLSDDQFGFRSRRSTVDAIQRVVDIVSVARSRNRYSGRYCAVVT FT LDVTNAFNSASWLAIANALQRINTPKYLYDIIGDYFRNRVLMYDTTDGPAE FT IAVTSGVPQGSVLGPTLWNLMYDGVLRVAMVEGARIIGYADDIVLLVEGNC FT VDDIEILVSSQIRIIDRWMTDNGLKIAPTKTEFIMVSSHQRIQHGAIRVGD FT HVVHSSRTLKYLGMVLDDRLEYTSHIRYAVERATKLWTTLVRMMPNKAGPS FT SNVRRVIALTVVAKVRYASPIWCHTLRFANRRQWLRRFYRPVVQRVISSFR FT TTSHDAVCVLAGMIPLHLLLDEDSRTFHRRRAENIAGSVARNMERVTTMER FT WQREWDESVHGRWTYRLIPDVNRWISRRFGGVDFFLSQFLSSHGFYAYQLH FT RMQLTGSPLCDACEEPEDAEHTIFHCVRHRELIIRLQHQVDEELTPENIIE FT VMSANRYNWSMVHQAVRTIMIRQQHRRHVIERGERRALLANIQLALQSSDS FT DDE" XX SQ Sequence 5075 BP; 1417 A; 1080 C; 1398 G; 1179 T; 1 other; gcccggcaaa gtccgaccac cacctccacg cggtgccggg cggggtaggg gcccgtcatt 60 aagttgacga ggccccgagc 
taacggtggc ggtgaagacg gcgttgagca tgacaacgtc 120 gcaggggagg gtgtggtgtt aagtcgccac accctacatt ctgtcaagtg ggatactgaa 180 ctcgagagga taatacctct gcgcggatta gactagggta cggtttatgg aataactagg 240 atgtacacgg gctggcggat gaacccgccg aacccttagt aaagcagggt gaaacctgct 300 tgaaacgggg gagggctaag ggggattctg tcaccgctct attaaaactt agcaagtctt 360 aggtagcgct ccaagactgt cgccatacga tacgctctat ggcaaatgca ggtgtgggtg 420 ggttcagccc cactttgctc cggttcggca ctgagcccgt cgtctcataa gggcgcgggc 480 caacccttgc gtggagctcc ctgtttcata gccacccacg gaaccaaata ggagactgca 540 taaacagacg cggatagcgg gcctgagttt gagcccaata gaatgagtaa cactaaaact 600 aagaaggtca aaaaggccaa gggggttgac aagaaagatc cggaccaagc gttcatggat 660 cttcaagatc ggatggaggc tatgcgtttg ggataagtaa actcgatgat gagtctyccc 720 attttacgct agtaacggta atgatggact tcctcgatga ggtaatgtcc gaatttcgaa 780 ggtacagagc cgttaatcgc atgcagcagg tcctcgaccg tcaaacgcaa acgtcgtttg 840 acggtaacga tggattcggg ccgcaaacgc gaaaaggcag aagaccagtg gctgatgacc 900 aacagcctgg tccaagtggg ttgcaaaggt tgcaacaaca acaacaacaa ccatcgaggt 960 tgacccctgt tagggaagcg gtggaaaata ttccaagtcc gagaaacgga ccgaatatta 1020 atgagggtag cattaataag aggaagaaga agaagagcaa gaagaagcag aataagccca 1080 ggaagcggcc tgaggctctg ctgatatcgg actgcacttc cgaggagctg gcgaaattgc 1140 tcaaggaaat gaagcagtcc gatgctctta aatcggttgg agagacaatc tctaaggtcc 1200 gacgggccca gaatggtggc atgttgctag aattgaagca gggtagttct gctagtgcaa 1260 ttgccccaaa ggttaaggaa gcggtgaagg gcaaggcgtc agtgagaacg ctagctcctt 1320 ctaaaatgat tgagatcatg catctcgatg aaattaccac cccagaggag gttgcggagt 1380 ctgtcaaagc acagctcaac atcgagatcg aaatagatcg tatcaaaatg aagaaaggcc 1440 gcgcggccgg tacgcagtgg gcacggatca acgtatcgct gccagacttt caaagcttcc 1500 tgaatttggg aaagctgaaa gttggttggt cgatatgcca tatccgtgag gtaatggagg 1560 agcagaaatg ctataagtgt tggaaggtag gccatacgag ctaccattgt agggaaccag 1620 acagaagtaa tctgtgctgg aaatgcggtt tgagtggaca caagaagcaa gcttgtacca 1680 attctgttaa gtgtttggat tgcggtacga ggtcacagaa ccttcacgca acgggcagtt 1740 acatgtgtcc ccgtaggcga acgattagat cttaatggtt 
aggttgttac aacataacca 1800 gaatcatagt tatgctgcat ttcagttaat gtggcaaacg attagggaag aatctgcgga 1860 tatagtgttg attgcagatc cgtatctggc aacaacaaac gtcaaagtgt tacgcaatga 1920 cgataacaca gcagcggtag tggttaacgc ggacttacca gttaaggtag tcagtaaggc 1980 tctgaagggt ttaatggtag ttgacatagg tgatatgcga gtggttagcg tttacgcgcc 2040 acctagattt agtatggaag agttccagat catgttggat aacacggtaa tggccgtaac 2100 cggtatccac aaattcgtta tcggggggga cttcaacgct tggtcagcaa gttggaacaa 2160 ccaacttggc gagcgtggag aaacccagaa acgaagaggt gagttggtct tatcaacctt 2220 tgcgcagatt gacgcaattt tattgaacga cggcagcacc ccaacttatg ttggaccagg 2280 gcgcacgtca gtagttgatc ttacttttgc gagcagaaca gttgcaagat catttaagtg 2340 ggaggtgtta tctagctata tgaactctga tcatcgtgcg atacgcatag atcttgagac 2400 gcaaagcgtg cgtaatctgt cccgacccat aacgggatgg agcatcaagt attttagcaa 2460 agatatattt gaagttatga tgcaagccgc tgttgatacc gaggtcacaa caagcgaaga 2520 cttaatgcgt atacttgtca cggcgtgtaa tgcgacgatg actaagcgta agaggtacac 2580 tcctaacaag agtgcatttt ggtggacgct cgagattgag gcacttcgca aagagtgcaa 2640 acaccgcgat cgattagcgc aaagagcttt tcatactgat ctctattcta cttttaggga 2700 cgagttcaag gtggcaagga atgccctcaa gcgattgatc aagcataccc gacagaggaa 2760 gtggaaagag ttcctgggaa cagcgaacaa cgcatcattt ggtatagtat atcatacgtt 2820 caagaaagtg gccgagggtt cgattggacc ccgaactatg acattagacg agtttaggga 2880 agtggtgagc gagctttttc ctactcaccc aaacacggtg tggcctgatt atcgtatcga 2940 tcagccacga gagtttgaaa ggattactaa tgatgagatt cttgcggttg ccaggagact 3000 acccaacaag aaggcgccgg gaccagatgg tatcccgaat gaggcgctga aagttggtat 3060 gttgactgca accgatgcat tttgcagggt ttaccaaggc tgtttagaga acgcgaagtt 3120 ccccgatgag tggaaaaggc agaggttggt gttaataccg aagccgaaca aaccaccagg 3180 ggaaccgggt tcggttcgcc ccatttgtct attagacggg gcaggtaaag gtctagaacg 3240 catcatagtg caacggttaa atgcacacat cgaggaggtc aacggactgt ctgacgacca 3300 atttggtttc agaagtcgtc gatcaacagt tgatgcgatt caacgggtag tggacattgt 3360 ttcggtagct agaagcagaa accgatacag tggacggtat tgtgcagttg ttacattaga 3420 tgttactaat gcttttaaca gtgcttcgtg gttggcgatt gcaaatgcat 
tacagagaat 3480 taacactcct aaatatcttt atgatatcat tggtgattat tttaggaatc gtgtgctgat 3540 gtatgatacc acagatggac cggcagagat tgcagtcaca tcgggtgtac ctcaaggctc 3600 ggtactcggc ccaacgttat ggaatctcat gtacgacggt gtcctacgag ttgcaatggt 3660 ggaaggtgca cggattatcg gctatgcaga cgatatagtg ttgttggtgg aaggtaattg 3720 tgttgatgat attgaaattc tcgtttccag ccagattcgc atcatcgaca gatggatgac 3780 cgacaacgga ttaaagatag ccccgaccaa gaccgagttt attatggtca gttcccatca 3840 gaggatacag cacggggcta tcagggtagg tgatcacgta gtacattcgt cgcgcacttt 3900 aaagtatttg gggatggtct tagatgaccg cctcgaatac acttcacaca tcaggtatgc 3960 ggtggagaga gcgacgaagc tatggaccac cttggtaagg atgatgccta ataaggcagg 4020 tccaagtagt aatgttaggc gagtaattgc tcttactgtt gtggcgaagg tccggtatgc 4080 ctcgcccatt tggtgtcata cccttagatt tgctaaccgt agacaatggc tacgtcggtt 4140 ttaccggcca gtagtccagc gagttatctc ttctttcagg acaacttctc acgatgcagt 4200 ctgcgtgctt gcgggaatga tcccgctgca tctcctcctg gacgaggact ccaggacttt 4260 tcatcggaga cgagcagaga acatcgccgg atcggttgca cgtaacatgg aacgtgtcac 4320 aactatggaa cgatggcaac gagaatggga tgagagtgtt cacggtcggt ggacataccg 4380 tctcataccc gacgtcaaca gatggataag tagaagattt ggtggtgtag atttctttct 4440 ttctcagttt ctttccagcc atggcttcta cgcctaccag cttcatcgga tgcagttaac 4500 gggctcgccg ctatgcgatg cgtgcgagga acctgaggac gccgaacaca cgatattcca 4560 ttgtgtacgt catcgtgaac tgatcattag acttcagcat caagtcgacg aggagttaac 4620 gccggagaac atcatcgaag ttatgtctgc taacagatat aactggagca tggttcatca 4680 agcagtacgg acgattatga ttcgacaaca acatcggaga cacgtcatcg aacgaggcga 4740 acgacgtgct ttgctcgcca acatccagtt ggccttgcag agcagcgaca gtgacgacga 4800 gtaacgacaa ggattcatcg tagttcatcg tagcttcatc gccgagggct agacagtggc 4860 taatcaccac tgttggaagc cattcgttgc ctgggatgat ggacatccac cgcccgagtg 4920 acgtcgatac cctaacgggt gatccactcg gggccggttg aaggcacgga ggggttttag 4980 tgagtaagaa tctcacacta ccggggttga tcacccaggt gtcttatgca agatttcccc 5040 ttcgataaca aaaaaaaaaa aaaaaaaaaa aaaaa 5075 // ID BEL12-LTR_AG repbase; DNA; ANG; 204 BP. XX AC . XX DT 03-APR-2003 (Rel. 
8.03, Created) DT 03-APR-2003 (Rel. 8.03, Last updated, Version 1) XX DE BEL12-LTR_AG is a long terminal repeat of the BEL12_AG LTR DE retrotransposon - a consensus sequence. XX KW BEL; LTR Retrotransposon; Transposable Element; 5-bp TSD; KW BEL12-I_AG; BEL12-LTR_AG; BEL12_AG; Bel clade; KW reverse transcriptase. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-204 RA Kapitonov V.V., Pavlicek A. and Jurka J.; RT "BEL12_AG, a nonautonomous family of Bel/Pao-like LTR RT retrotransposons from African malaria mosquito."; RL Repbase Reports 3(3), 32-32 (2003). XX DR [1] (Consensus) XX CC BEL12-LTR_AG flank an internal portion of BEL12_AG (deposited as CC BEL12-I_AG). XX SQ Sequence 204 BP; 72 A; 29 C; 46 G; 57 T; 0 other; tgttggcgcc taaatggctg caaatttata gattgtaaga ttataggaaa agggtacata 60 ggattacgtt agtattagtg ttagcaggac tgtcagggtt atgacgaaaa tataaaagga 120 ttagtagcaa tcagacagtc gactcggcaa catacctttg gaaaccacta aaaaatatca 180 cagtagttta tgagctatcg ttca 204 // ID GYPSY59-I_AG repbase; DNA; ANG; 4552 BP. XX AC . XX DT 09-JUL-2004 (Rel. 9.06, Created) DT 27-JUL-2005 (Rel. 10.08, Last updated, Version 2) XX DE GYPSY59-I_AG is an internal portion of retrotransposon GYPSY59_AG DE - a consensus sequence. XX KW Gypsy; LTR Retrotransposon; Transposable Element; 5-bp TSD gag; KW AP protease; GYPSY59-I_AG; GYPSY59-LTR_AG; Gypsy clade; RNase-H; KW integrase GYPSY59_AG; mag lineage; reverse transcriptase. XX NM GYPSY59-I_AG. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-4552 RA Tubio M.J., Costas C.J. and Naveira F.H.; RT "GYPSY59_AG, a member of the mag lineage of the Ty3/gypsy group RT of LTR retrotransposons in Anopheles gambiae."; RL Repbase Reports 4(6), 157-157 (2004). 
XX DR [1] (Consensus) XX CC GYPSY59_AG is a family of Gypsy-like LTR retrotransposons that, CC according to the amino acid sequence of its reverse CC transcriptase, RNase and integrase domains is phylogenetically CC grouped with representatives of the mag lineage of other CC organisms. GYPSY19_AG, GYPSY20_AG, GYPSY21_AG, GYPSY22_AG, CC GYPSY23_AG, GYPSY24_AG, GYPSY25_AG, GYPSY26_AG, GYPSY27_AG, CC GYPSY28_AG, GYPSY55_AG, GYPSY56_AG, GYPSY57_AG, GYPSY58_AG, CC GYPSY60_AG, GYPSY61_AG, GYPSY62_AG, GYPSY63_AG, GYPSY64_AG, CC GYPSY65_AG, GYPSY66_AG, GYPSY67_AG, GYPSY68_AG and GYPSY69_AG, CC are other members of this same lineage in Anopheles gambiae. CC The GYPSY59-I_AG consensus was reconstructed after multiple CC alignment of 4 copies. The consensus encodes the 1482-aa CC GYPSY59_AGP gag-pol like polyprotein (pos. 93-4538). The CC sequence of the LTRs flanking GYPSY59-I is deposited as CC GYPSY59-LTR_AG. XX FH Key Location/Qualifiers FT CDS 93..4538 FT /product="GYPSY59_AGP" FT /note="gag-pol like" FT /translation="MEQADANQMDIFADVDPRRLSQNRASVPGVSMPNFAH FT REGASVVPPMIIQPPSQPTSTPSQSTSTPSLNNASPAIISSADSATMLQMM FT NLLQQQMTQQQQLLKDFLHARMPSQCAPPTTLQPEQIIDTLSHHISEFQYN FT KETGITFKSWFSRYLDLFQKDAARIDDAAKVRLLLRKLGPSEHDRYLSFIM FT PSRPPDFSLEQTVEKLTCLFDTQESLLSKRFKCLQIMKTRTEDHLSYACRI FT NKACVEFELKKLNEEEFKCLLYVCGLKDEIDADIRTRLLARIEDKACVTLQ FT QLSSECHRLINLKKDSAMIEAPVPERVLAVNTKMHRAPRQFQPKRDNPTTP FT CWSCGGLHWSRDCPYKQHRCTTCSRTGHKEGYCNTIRSRKPGKRPWKQRKT FT QLRMVTVNVQSVQQRRRFVSLTMNGTPVRMQLDTASDITVIDHTTWKLIGS FT PQLAAPSVIARTASGANLSLEGEFPCTVEVNGQAKQTVIRVSKSRLLLLGA FT DVIDAFALWSVPMDSFCCHVTGTSTTPKQWQERFPTVFQGIGLCKKAGVTL FT TLKDNCRPVFRPKRPVAYAMQEPVNLELDRLEKLGIITPVKFSEWAAPVVV FT VKKANGKIRLCGDYSTGLNEALRPHDYPLPLPEDIFSRLSNCTMFSKIDLT FT DAFLQVEIDPQYRPLLTINTHRGLYHYNRLPPGIKVAPAAFQQLMDTMLAG FT LKGVSGYLDDIIVGGSSEHEHDTNLAEVLHRLQEYGFTIRADKCAFKQQKI FT TYLGHVIDSHGLRPDPSKIELIKKLPEPKDISGVRSFLGAINYYGKFIPNM FT RKLRYPLDNLLKANNSFCWTPECKKSFATFKSLLSSDLLLTHYDPRQKIIV FT 
SADASSIGLGATISHVYPGGAMRVIQHASRALSEAERHYSQIDREGLAIIF FT AVKKFHKMIFGRRFVLQTDHRPLLHIFGSKKGIPTVTANRLQRFALTLLAY FT DFSIEYVRTDDFGNADLLSRLINTQAKPEEDTVIACIETDIKAMVVSALHN FT TPLHFADLIRETRKDPLLQKLVHYIREGWPSNATYTGELSRFFARKDALST FT VDGCILFGERVVIPRVLQQQCLQQIHHGHPGIQRMKALARSYLYWPSLDAD FT ITEWVKTCNACQAVARSPPHSSPVPWPKAAGPWQRIHVDYAGPLDGDFYLI FT VVDSFTKWPEVFRTSSTTSAATIGILRGIFARFGVPTTLVSDNGTQFTSED FT FKFFCFQNGIEHITTAPYHPQSNGQAERFVDTFKRTVKKISADGRTMQEAL FT DAFLLSYRSTPSSVLEGLKSPAEIMFGRPMRTTLDLLRPPLAGGLTDSPVA FT AKRREFRPSDLVYVKCYSRNGWSWTAGTIVSRIGNVMYTVRTVDRKTIRSH FT VNQLRERRERHHHRHRESSESDGLPLDILLDSYHLTPQPTTSTAEPSSASL FT QHTTSTHTSVSSASHPQTSAVIDPRTASNAEPLHPTNSPIPVQAPTREPRR FT FSRHRRPPSRFDPYRRF" XX SQ Sequence 4552 BP; 1264 A; 1186 C; 1071 G; 1031 T; 0 other; gtggcgacga ggacaattta ctttaacgtg tgttcagttt aacgtgtgca agtgtatcaa 60 acataggacc accgtgtcgc aagcgtcgta agatggagca agccgacgca aaccagatgg 120 atatttttgc ggacgttgac ccacgccgat tatcccaaaa tcgtgccagc gtaccaggcg 180 tgtctatgcc aaatttcgcg catcgagaag gagcatcagt agtaccgccg atgataatcc 240 agccgccatc gcagcctacg tcgacgccat cgcagtctac gtcgacgcca tcgctaaaca 300 atgcgtcacc agccatcatt agcagcgcgg atagtgcaac gatgctgcaa atgatgaatt 360 tgcttcaaca gcaaatgaca cagcagcagc agctgcttaa agattttctt cacgcgcgca 420 tgccgagcca atgcgcccca cccaccacct tgcagcctga acaaataatt gatactctgt 480 ctcaccacat ttcagagttt caatacaaca aggagactgg tataaccttc aaaagctggt 540 tttcacgcta tctcgacctg ttccagaagg acgcggcaag gatcgacgac gcagccaaag 600 tgcgattgtt gctgcgcaaa ctgggaccat ccgagcacga tcgctactta agcttcatca 660 tgccgagccg cccgccggat ttttcactcg agcagactgt ggaaaaactt acatgcctgt 720 tcgacaccca agaatcgctg ctgagcaagc ggttcaaatg cttgcagatc atgaagacgc 780 ggaccgagga tcatctcagc tatgcgtgtc gaattaacaa ggcgtgtgtc gagttcgagc 840 tgaagaagtt gaacgaggaa gagttcaagt gtttgctgta cgtgtgcggg ttgaaagacg 900 agatcgacgc agacatcagg acgcgactgc tagcccgcat tgaggataag gcctgcgtga 960 ccttgcaaca gctttcttca gagtgccatc ggttaatcaa cctgaaaaag gatagcgcaa 1020 tgatcgaggc accggtacca gagcgtgttc tagcagtaaa cacaaaaatg catcgcgcac 
1080 cacgtcagtt ccaacccaag cgcgacaatc ctactacccc gtgttggtcg tgcggaggac 1140 tgcattggag ccgcgattgc ccgtacaagc aacaccgatg cacaacttgt tcccggacgg 1200 gccacaaaga aggctattgc aacaccataa gatcccgtaa gcccggcaaa cgcccctgga 1260 agcagcgtaa aacacaatta aggatggtga cagtgaacgt ccagagtgta cagcaacggc 1320 gcagattcgt gtcacttacg atgaatggta caccggttcg tatgcagttg gacacggctt 1380 ccgacattac tgtgatagat cacacaacgt ggaagctcat tggcagtccc cagttagcag 1440 caccatccgt aatcgctaga accgcctcag gtgctaacct atccctggaa ggagagttcc 1500 cgtgcactgt cgaagtgaat ggacaggcaa agcaaaccgt gattcgcgtc tctaaatcgc 1560 gtttgttact tttaggtgct gatgttatcg acgcgtttgc tctatggtcg gttccgatgg 1620 acagtttttg ctgccacgtt acaggtacgt ctaccacgcc gaagcaatgg caagaacggt 1680 tcccaacagt ctttcaagga ataggcctat gtaaaaaggc aggggttacc ttgacgctga 1740 aagacaactg ccgtccagtc tttcgtccaa aaagacccgt tgcatacgct atgcaagagc 1800 ctgtaaatct agagcttgat aggctagaaa agttgggtat aataacacct gtaaagtttt 1860 ctgaatgggc agcaccggta gtagtggtta agaaggcaaa cgggaaaatt cggctatgtg 1920 gcgattattc cactgggttg aacgaagcgc taagaccaca tgattacccg ttaccacttc 1980 cagaggacat attttcaagg ctgtccaact gcaccatgtt cagcaaaatc gacttaacgg 2040 acgcctttct ccaagtcgaa atcgatcctc aatacagacc gctacttacc ataaatacgc 2100 atagaggctt gtaccactac aaccgcctac cgcccggcat caaagttgct cctgctgcct 2160 tccagcaact tatggataca atgctcgcag ggcttaaggg cgtatcagga tatcttgacg 2220 atatcatcgt aggaggaagt agcgaacatg agcatgacac aaatttggca gaagtattac 2280 acaggcttca agaatatgga tttacgatac gagccgataa atgtgctttt aaacagcaga 2340 aaatcacgta ccttggacac gtgatcgata gccacgggtt acgaccagat ccgtcaaaaa 2400 tcgagcttat aaagaagctc cctgaaccga aagacatatc cggtgtgaga tcatttctgg 2460 gagcaattaa ctattacggg aagttcatcc cgaatatgag gaagctgcga tatccgttag 2520 acaatttgct taaggcaaat aattcattct gctggacgcc agaatgcaaa aagtcgttcg 2580 caacgttcaa gtctctccta tcgtccgacc tacttctaac gcactatgat ccacggcaga 2640 agataatcgt ttctgcagat gcttcgtcca tcggccttgg cgcaacgatt agccacgtgt 2700 atcctggagg tgcaatgcgc gtaattcaac atgcttcccg cgccctcagt gaagcagagc 2760 
gtcattacag ccaaatcgat cgcgaaggct tggctatcat cttcgctgta aaaaagtttc 2820 ataaaatgat cttcggcagg cgttttgttc ttcaaacgga ccatcgacct cttcttcata 2880 ttttcggctc gaagaagggt atcccgaccg tcaccgcaaa ccgtttgcag cgtttcgcac 2940 tcactcttct agcgtatgat ttcagcatag agtatgtacg caccgatgac ttcggaaacg 3000 ctgacttgct ttcgcgcctg atcaacacac aagccaaacc tgaagaagac accgtgatcg 3060 catgcataga aacagacatc aaagcaatgg tggtaagtgc gcttcataac actccacttc 3120 attttgcaga ccttatcaga gaaacacgga aagaccccct actgcaaaag ctcgtgcatt 3180 acatacgtga aggatggccc agtaatgcaa cctacacagg ggaattgtct cgtttctttg 3240 cacgtaagga tgctttatca accgttgatg gttgtatttt gttcggggag agggtggtga 3300 ttcctcgagt tctacaacag caatgtctgc aacaaataca ccacggtcat cctggcatac 3360 aacggatgaa agcattggcc agaagctacc tttattggcc atccttggac gctgacatca 3420 cagaatgggt caaaacatgt aacgcgtgcc aagcagttgc acgatcaccc cctcactcaa 3480 gccccgttcc ttggccaaaa gctgcaggtc catggcagcg cattcatgtg gactatgctg 3540 gtccactaga tggagatttc tacctcatcg tagtagattc gttcacgaaa tggcctgaag 3600 tatttcgaac gagcagcact acatccgcag caaccatcgg catcttacgc ggaatattcg 3660 cccggttcgg tgttccaacc actctagtat ctgataacgg tacgcagttt acgagcgaag 3720 attttaagtt tttttgcttc cagaacggca tcgagcacat aacgacggca ccctatcatc 3780 cgcagtctaa cgggcaggct gaaaggttcg ttgatacttt caagcggacc gttaagaaaa 3840 tatcagccga tgggagaacg atgcaagagg cgctggatgc attcctactg tcctaccgaa 3900 gtactccgag ctccgtgtta gaagggctaa aatcaccagc ggagataatg tttggcagac 3960 caatgcgtac aaccttagac ctattacgtc caccactcgc aggagggtta acagacagtc 4020 cagtagcagc taaacgcagg gaatttcgcc cgtccgacct ggtatatgtt aagtgttact 4080 ctcgcaacgg ttggagttgg acagctggga ccattgttag ccgcattgga aacgtaatgt 4140 acaccgtcag gacagtcgat agaaagacca taagaagcca cgtgaaccag ctgcgcgagc 4200 gccgtgaacg tcatcatcat cgtcatcgtg aatcatcgga gtctgacggt ttaccgctag 4260 atattctttt ggattcctac caccttactc cgcaaccgac gacatcaaca gccgaaccat 4320 catcagcatc gctacagcat actacatcta cacacacgag cgtatcgtct gcatcacacc 4380 cacaaacgtc agcagtcatc gatccgcgta cggcctcaaa cgcggaaccg ctgcatccaa 4440 cgaattcccc 
aattccggta caagctccaa cccgagaacc acgtcgcttt tctaggcata 4500 gaagaccgcc cagtaggttc gacccgtaca ggcgttttta aaagagggga ga 4552 // ID GYPSY38-I_AG repbase; DNA; ANG; 4388 BP. XX AC . XX DT 16-APR-2004 (Rel. 9.03, Created) DT 16-APR-2004 (Rel. 9.03, Last updated, Version 1) XX DE GYPSY38-I_AG is an internal portion of retrotransposon GYPSY38_AG DE - a consensus sequence. XX KW Gypsy; LTR Retrotransposon; Transposable Element; 4-bp TSD gag; KW AP protease; GYPSY38-I_AG; GYPSY38-LTR_AG; Gypsy clade; KW MDG3 lineage; RNase-H; reverse transcriptase; KW integrase GYPSY38_AG. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-4388 RA Tubio M.J., Costas C.J. and Naveira F.H.; RT "GYPSY38_AG, a member of the MDG3 lineage of the Ty3/gypsy group RT of LTR retrotransposons in Anopheles gambiae."; RL Repbase Reports 4(3), 68-68 (2004). XX DR [1] (Consensus) XX CC GYPSY38_AG is a family of Gypsy-like LTR retrotransposons CC that, according to the amino acid sequence of its Reverse CC Transcriptase CC RNase and Integrase is CC phylogenetically grouped with representatives of the MDG3 CC lineage of other organisms. CC GYPSY29_AG, GYPSY30_AG, GYPSY31_AG, GYPSY32_AG, GYPSY33_AG, CC GYPSY34_AG, CC GYPSY35_AG, GYPSY36_AG and GYPSY37_AG are other CC members of this same lineage in Anopheles gambiae. CC The GYPSY38-I_AG consensus was reconstructed after multiple CC alignment of 6 copies. CC The consensus encodes the 1328-aa GYPSY38_AGp gag-pol like CC protein CC (pos. 387-4370). CC The sequence of the LTRs flanking GYPSY38-I_AG is deposited as CC GYPSY38-LTR_AG. 
XX FH Key Location/Qualifiers FT CDS 387..4370 FT /product="GYPSY38_AGp" FT /translation="MENALIAAMVEDHTRLGLVKKCEERELSTVGTKEELA FT ARVVAYDRTDAADFADAELPTVDRVQPYSFRDVEDGVDAFGAGSTQDVRAW FT IRSFEEISTMAGWNEDQKLIMCRKKLTGVARGFLNTLKGVTTYANLRRALI FT NEFAPSVRASDIHRQLSTRRKKKEETGLEFVYAMQQIAQQIDLDEASLVEY FT VADGITNDERQRSMFYEAKNINELKEKIRLFERAANKFQQERPKVVGQRIP FT DKLKQRHCFNCGSASHRLIECPTKQEGPTCFKCGTKGHSAKDCTTAQGRRP FT HCYACGEVGHVATRCPNNSNNNRVQTMTSSFPIVCVSVGNQLCEAVIDTCS FT DVTLMRWDLFQELKLNCNELKPSARVIKGYGGKQSNVCGELTIRATIDEVE FT EIIRFVVVPSNSIDTKMLIGIDVLKRVNYVITNGHAKITKQQEEEEKPTGV FT DDRWVYRIVCESHQEIDAPSNYRRVICDMVNNYEPAKISNVESEMKILVNN FT EEVVRTLPRRIAPLEKEVVRKQVREWLNDGIIQPSRSPYASAVVVVPKKDG FT SHRVCVDYREMNKRIVRDSYPMPNIEDQIDQLADARVYSVLDLKNSYFHVK FT VEKESRQYTSFVTTDGQYEFLRAPFGLCLSGNAFGRFIDAVLHELIIDGTV FT MAFVDDIIIPSRDEEHGLASLRRVLEVAQRAGLHFNWKKCSFLKRRVEYLG FT YTVYEGKIEPAPQKIEKVKYYPQPKTVKQLQCFIGLASYFRKFIEGFARIA FT RPLTTMLKKESVFEFNEEARAAFESIKEKLVEYPVLHIFKASLVTELHTDA FT SKDALAGILLQRSEEDGLLHPCYYYSRLTNKAEKNYHSFELESLAVVESIK FT KFRCYLLGKKFKVVTDCIAFKQSLNKKVPNARVSRWFVTLSEFDFEVEHRS FT GEKMKHVDALSRANVLKISAGVCEKIIQGQLRDEDLTAIRTALDQGEERNG FT FCLINGVLYKNENAHNKICVPRAMETGMIREVHERGHFGIKKVKEQLTEDY FT FIPEVEKKIKYCIDNCVRCIVSDRKRGRAEGELHPIPKGEVPLDTFHVDHI FT GPMPSTRKCYNYILTVVDAFTKHVWLFATKTTSAEEVIKKLTIISDTFGNP FT RRIICDRGSAFTSSLFTKFCEEANIELHHIVTGVPRGNGQVERVHRIIIPT FT LAKLSYENPENWFKFVNNVQKAINNSWQRSIRMTPFELMIGVKMKEKEDIR FT INDILQEEIQKMFTDDRNEIRKIACKGIARIQDENRRTYNLRRKSIQKYKI FT GDIVAIPVTQFGIGRKIKRKFFGPYKIVKTFSNERCEVLKLDEQTEGPMRT FT TTAFDQIKIWADSGES" XX SQ Sequence 4388 BP; 1418 A; 689 C; 1131 G; 1150 T; 0 other; tgggggctcg tccgggaaga ggctaggcga tgcaaagtga tgagtttaca gttatgaaaa 60 gttgcataat gtgttaaagt tgataaagtt gttcaagtcg taaagttgat aaagttgtaa 120 agttgttaaa gtcgttcatg tcgtaaagtt gataaagttg taaagttgtt aaagtcgttc 180 aagtcgtaaa gttagtaaag ttgttaagtt gcagtagtgg ttcgaattgg aaaagttgac 240 ttagtcggtg tttcacctga tgtgaaagtg tgaatttgtg acgaaaagtt gacattgtga 300 cgcataaagg ttgttaaagt tgcgattgtg acgttgtaaa gtgtgtttaa 
ataaaagcta 360 aagttgtgca aaacgttaat gagaaaatgg agaacgcgtt gattgcagcg atggtcgaag 420 accatacgcg gttgggtttg gtgaaaaaat gtgaagagcg tgaattaagt acggtcggaa 480 caaaagagga gctggccgca cgcgtcgttg cgtatgatcg aaccgatgca gctgattttg 540 cggatgctga gttgcccacg gttgacagag tgcaaccata ctccttccgc gatgtcgaag 600 atggtgttga tgcatttggc gccggatcta cgcaagacgt gcgcgcatgg atcagaagtt 660 ttgaggagat aagtacgatg gctggatgga acgaggatca gaaattgatc atgtgccgta 720 aaaagttgac cggagtcgca cgtggttttc taaatacgct gaaaggtgtg acgacatatg 780 caaatctacg tagagcgttg attaacgagt ttgctccaag tgtgcgtgca agcgacatac 840 acaggcagct atcgacgcgt cgcaagaaaa aagaggagac cgggttagaa tttgtttatg 900 ccatgcagca aattgcgcag caaatcgatt tagacgaagc cagcctcgta gaatatgtgg 960 cagatggaat aactaacgac gaaaggcaga ggtcaatgtt ctacgaggca aagaatatta 1020 acgaacttaa ggagaaaata cgtttgttcg agagagcggc gaacaagttt caacaggaac 1080 gcccaaaagt tgtcgggcaa cgaattccgg ataagttaaa acaacgccat tgtttcaatt 1140 gtggcagcgc ttctcatcga cttatcgaat gtccaacaaa acaagaaggt ccgacttgtt 1200 tcaaatgtgg aacaaaagga catagtgcaa aagattgtac gacagcgcaa ggaagaagac 1260 cgcattgtta tgcttgcggc gaagttgggc atgttgcaac acgatgcccg aataacagta 1320 ataataatag agttcaaaca atgacgagct catttccaat tgtctgtgtg tcagttggaa 1380 accagttgtg cgaagcggtg atcgatacgt gcagtgatgt gacgctcatg cgctgggatt 1440 tgttccagga attaaaactc aattgtaatg agttgaaacc gtctgcacgg gtaataaaag 1500 gctatggtgg aaaacaaagc aacgtgtgtg gtgagctgac gatccgtgct accatcgatg 1560 aagttgagga aataattcgt ttcgtggtgg taccttcgaa ttctatcgac accaaaatgc 1620 tgataggaat cgatgtgctc aaaagggtta actacgttat cacaaacggt catgcaaaga 1680 taacaaagca acaggaggaa gaggaaaaac ccacgggtgt tgatgatcgt tgggtgtatc 1740 gtatagtttg tgaaagtcat caagaaatag acgcaccgtc aaattatcgt cgcgtgattt 1800 gtgatatggt gaataattat gaaccagcaa agatttctaa cgttgaaagt gagatgaaaa 1860 tattagtgaa caacgaggag gtagtgcgta cgttgccaag aagaatagct ccgctagaga 1920 aagaggtggt aaggaaacaa gttcgcgaat ggttaaacga cggaataatt cagccatcgc 1980 gaagccctta cgctagcgca gttgtggttg taccgaagaa ggatggatct caccgtgtgt 2040 
gtgtggatta tcgtgagatg aataagagga tcgtgcggga ctcttatccg atgcccaata 2100 tcgaagacca gattgatcag ttggcagatg ctcgtgtgta cagtgtatta gatctgaaaa 2160 attcgtattt tcacgtcaag gtggagaagg aaagtcgcca gtacacatct ttcgtaacaa 2220 cggatgggca gtatgagttt ttacgtgctc cgtttggatt gtgtctcagc ggcaatgcgt 2280 ttggacgctt catcgatgca gtacttcatg agttgatcat agacggtacc gtgatggcct 2340 tcgtagatga catcatcatt ccgtctcgag acgaggagca tggtttggcg tctttgcggc 2400 gagttttgga agtggcacag cgggcaggac ttcacttcaa ttggaaaaag tgttcgttcc 2460 tcaaacggcg tgtagagtat ttggggtata ccgtctatga aggtaagatt gaaccagcac 2520 cccaaaaaat agaaaaagtt aaatattacc cgcagccaaa aacggtgaaa cagctgcaat 2580 gttttatagg gctagctagt tattttagaa agtttataga aggttttgcg cgtattgcca 2640 ggccattgac cacaatgtta aaaaaagaga gtgtttttga atttaatgaa gaggcaaggg 2700 cagcgtttga atcgattaaa gaaaaattag tggaatatcc tgtattacat attttcaagg 2760 cgagtttggt taccgaattg catactgatg cgagtaaaga tgctctggcc ggcatattgt 2820 tgcagcgttc tgaggaggat ggtttattac atccttgcta ttattatagt aggttgacta 2880 ataaggccga gaaaaattat cactcattcg aattagaatc attagccgtt gtggaatcca 2940 taaagaagtt caggtgttac ttgctaggta agaaatttaa agttgtgacc gattgtatag 3000 cgtttaaaca atccttaaac aaaaaagttc ccaatgcgcg agtgagtcgt tggtttgtaa 3060 cgttgtccga gtttgatttt gaagtagaac atcgttcagg tgaaaagatg aaacatgttg 3120 acgctctttc aagagcaaac gtattgaaaa tatcggcagg tgtatgtgag aaaataatac 3180 aaggacaatt gcgcgatgaa gatcttacgg caattaggac ggcgcttgat cagggcgaag 3240 aaagaaatgg tttttgtttg ataaatggag tactttataa aaatgagaat gctcataaca 3300 aaatatgtgt acctcgagca atggagacgg gtatgataag agaagtacac gaacgaggac 3360 attttggaat taagaaagtg aaagaacaac taacagagga ttattttatt cctgaagtag 3420 aaaagaagat aaaatattgt attgataact gtgtccgatg catagtgagt gatagaaaac 3480 gaggcagggc agagggagaa cttcatccaa taccgaaagg agaggttccg ttggatacct 3540 ttcatgttga tcatattggc cctatgccat ctactcgcaa atgttacaat tacattctta 3600 ccgttgtcga tgcgtttacg aagcatgttt ggctatttgc tacaaaaact actagtgcgg 3660 aagaggttat aaaaaaattg acgattatat ctgatacgtt tggaaatccg cgaaggataa 3720 tttgcgatag 
aggatctgct tttacctcta gtttatttac aaaattttgc gaagaagcca 3780 atatagaatt gcatcacatt gttacaggag tcccacgagg gaatggccaa gtagagcgtg 3840 tacaccgcat aattatccct acattagcga agttgtcgta tgaaaatccc gaaaattggt 3900 ttaagtttgt aaataatgtt caaaaggcga taaataatag ttggcagcga tcgattagga 3960 tgacaccttt tgaattgatg atcggagtga aaatgaaaga gaaggaagat attcgaatta 4020 acgatatttt gcaagaagaa atccaaaaga tgtttacaga tgatcgtaat gaaataagga 4080 aaatagcatg caaagggatt gccagaatcc aagatgaaaa tcgtaggacg tacaatttgc 4140 ggcgtaaatc gattcaaaag tataagatag gagatatcgt agcgattcca gtaactcagt 4200 tcggaatagg acgaaaaatt aagaggaaat tctttgggcc gtataagatt gtcaaaacat 4260 tttcaaatga gcgttgtgag gtactgaagc tagatgagca aactgaggga ccaatgagaa 4320 cgacaactgc attcgatcaa ataaaaatat gggcagattc aggagaatca tgatgtcagg 4380 aaaggccg 4388 // ID MinoAg1 repbase; DNA; ANG; 5660 BP. XX AC AB090816; XX DT 14-SEP-2005 (Rel. 10.09, Created) DT 24-SEP-2010 (Rel. 15.1, Last updated, Version 2) XX DE Anopheles gambiae retrotransposon MinoAg1 DNA, complete sequence. XX KW R1; Non-LTR Retrotransposon; Transposable Element; MinoAg1. XX NM MinoAg1. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-5660 RA Kojima K.K. and Fujiwara H.; RT "Evolution of target specificity in R1 clade non-LTR RT retrotransposons."; RL Mol Biol Evol 20(3), 351-361 (2003). XX DR EMBL/GenBank/DDBJ; AB090816; Positions 1 5660. 
XX FH Key Location/Qualifiers FT CDS 590..1954 FT /product="MinoAg1_1p" FT /translation="MPSALRSGGLPAHRLSSSLELKQKKSATGTNLPSSPE FT MLILRQNLEETRKKNESLQEQLTQLRWLMEEKLREQREDAQRREEEARRRE FT EAAKADNEKLRVEQQETHTTLIAISAQLRDLQQKNQMKRQQQHQPPQQPGP FT STSAVSLRNVEVQAQPEEDIDHSSFVEVVRRKPRGINSGKSSSQQREQQQR FT SLQQQQQQQQQQQQQQQEQQQQQQQQRKIRRPKADLIEVVPQEGLTWDSVY FT RKVRDTVRDDPAHKNLEEHIGMGKRTRADLLRIELSRSADSTLVLQEVQEI FT IGGSGVARVVTEMTELLVTHIDPLAEEQELKAALKEELQVNAGVTAVSMWQ FT LFDGMKRARLCLPTKAAKQLAGRKLRLCGCISSIMEAMPVSVDRQRCYRCL FT ERGHLARDCQSPVDRQQACIRCGADGHYAKSCTSEIKCAACNGPHRIGHIS FT CARPAARCLH" FT CDS 1756..5358 FT /product="MinoAg1_2p" FT /translation="MLPLSGKRPSGARLSISRGSPTGVYSVRRRWSLCQKL FT HFRDQVCCVQRSPPHWPYLLCSSCSAMPALKILQLNVDHCREGQALALQAA FT REHRADILLLSDMHRPPENNGRWAYDVSKKVAVVATGSYPIQRVWGSELRG FT LVAAQVAGITFISVYAPPSLSAHEFERLLGAIELEAASHSRVVVAGDFNAW FT HEEWGSGRNGQRGIELLQTVETLGLSILNQGRKPTFIGRGFAASSVIDVSF FT ASREIVRPDTWSVIPFSRSDHELIAFEVKQPDENPAGAQQHLSHRPQRSTR FT KNPAGRQHDRCDSRRWKTTQFNRQSFRVALRANNFQERAVSHIGMIEALVD FT ACSETMQRISGLHKSPHHDMYWWTPAIKRLRDDCLAARERVQLSHDLEERS FT MAAAAHRTAKSLLEKAIRTSKRQQFQELIDIAETNDYGTGYRVVMSRLRGS FT PGPPETDRAELERIVSDLFPTHPPVSWPVSSDAPTTVPSDSRVEPQELLSI FT AAGMVTRKAPGLDGIPNAAVKVAIEEYTEEFCRLYQDCLSRGTFPPQWKRQ FT RLVLLPKPGKPPGESSSYRPLCMLDALGKVLERLILNRLNRHIEQQDSPQL FT SDAQYGFRRGRSTISAIQRVVDAGTAAKLFRRTNTRDKRCLMVVALDIRNA FT FNTANWQAIADALQSKNVPSYLMKIIGAYFEGRKLLFDTSEGPVERHISAG FT VPQGSILGPTLWNMMYDGVFGVGLPLGAEIIGYADDLVLLVPGTTPTTAAA FT AAEEAVAAVKQWLLEHRLELAHSKTEMTVISSLKQPPEEVFITFGGVNVPF FT SRSIKYLGVRVHAHLSWVPHVKEITLKATRIVHAVNRLMPNLHGPRTSMSR FT LLANVADSTMRYAAPVWHEAIGNQECCRLLRRVQRKSAIGLARTFRTVRYE FT TAVLLAGLLPICLAIKEDTRVHSRRGTGLNCAIRREERQRSMEEWQATWGA FT DAANERASRYVRWAHRVIPDVGSWQSRKHGDVTFHLSQVLSGHGFFREHLC FT GMQLTSSPDCTRCPGVAESAEHAMFECPRFDSTRTELLHGVVPETLLEHML FT QSPENWSNVCEATKRITSALQQDWDETRRELAEQGAPRVADNQHNQDNDRT FT SLYSARNTSEEQRGRRHPTPSPPPRAVGRRAEVRSLGERYRRQLLVVEERR FT QISGHSVGGVTSQEDGTLVEASRQGMNGAEKTAATEADVASR" XX SQ Sequence 5660 BP; 1507 A; 1467 C; 1639 G; 1047 T; 0 
other; ggaagctgtt ggatgctgtc aggagttgtt tacaaggacg aacgtcaaaa tacgagctcc 60 tatataatag cggaaaatcg tgtgaaagcg tcaataagtt gtgcggaata gtgcggaacg 120 tgcttagtaa gcctagggcg cttgcccgtg aagtggagat tgccagctag gtatagagaa 180 gtgcaagaga acagcatcgt acggattcgg cgccggcgac cccggctggg gcggtctcaa 240 gccggtgaca gttcaggacg gcgacagccc aatcaccaat catcaacaat tcccaagcag 300 aaaatttgaa ttggccgttg gtggtgtaag atcggagaat catcttaata cgttcctcgt 360 atcacggcct agaagttaaa tctggtggat gaggaaaatt ggcacgttca ccaaaaatac 420 tcccactatc ccccatcaca ccccccccat aatccccttt actcggtgtg ccttaacacg 480 gtcttaacag gggcggtaaa gaccttcctc gatctcggtg aggcaaacac agatagtact 540 tgagaaaaaa aaaagaaaaa cccaaaacaa agctggaagg gtaagaataa tgccgtcggc 600 actccgctcc gggggtttac cggctcaccg gctgtccagt tcgttggaat taaaacaaaa 660 aaagagtgcc acgggaacca acctcccttc atcacccgag atgttgatat tgcggcagaa 720 tctggaagag accaggaaga aaaatgagtc tcttcaggaa cagctaactc agttgagatg 780 gctcatggag gagaagctcc gcgagcagcg agaagatgca cagcgtagag aggaagaagc 840 gcgtcgcagg gaagaggccg caaaagccga caatgagaag ctgcgggtgg aacagcagga 900 gactcacact acattaattg caatatcggc acagttgaga gacctgcaac agaagaacca 960 gatgaaaagg cagcagcaac atcagcctcc ccagcaacca gggccatcga cgtcagccgt 1020 ctcattgcgg aacgtagagg tgcaggctca accagaggaa gacattgacc acagctcgtt 1080 tgttgaggta gtgcgccgca agccccgcgg gataaacagc ggcaagtcct ctagtcagca 1140 acgtgagcag cagcagagat cgcttcagca gcagcaacag cagcagcaac agcagcagca 1200 acagcagcaa gaacagcagc agcaacagca acagcagcgg aagatacgtc ggccaaaggc 1260 agacctcata gaagttgttc cccaggaggg ccttacatgg gatagtgtgt accgcaaggt 1320 acgcgacaca gtgcgagatg acccagcaca caagaatctc gaagaacata ttgggatggg 1380 taagcgcacg cgagcggacc tccttcgtat agagctcagc cggtcggcag actccactct 1440 agtgctacag gaagtgcagg aaataatcgg agggtctgga gttgctcgtg tcgtaacgga 1500 gatgacggaa ctactagtta cccacattga cccacttgcc gaggagcagg aattaaaagc 1560 agctctcaaa gaagagctgc aggttaacgc tggcgtgaca gctgtaagca tgtggcaact 1620 ctttgatgga atgaagcggg caagactttg tttgccgacc aaagcagcca aacagcttgc 1680 cggacgaaag ttgagactgt 
gcggttgcat cagcagtata atggaagcca tgcccgtctc 1740 ggtagatcga cagcgatgct accgctgtct ggaaagaggc catctggcgc gcgattgtca 1800 atctcccgtg gatcgccaac aggcgtgtat tcggtgcggc gcagatggtc actatgccaa 1860 aagctgcact tccgagatca agtgtgctgc gtgcaacggt ccccaccgca ttggccatat 1920 ctcttgtgct cgtcctgcag cgcgatgcct gcactaaaaa ttctacaact caacgtagac 1980 cactgtcggg agggccaggc cttagcacta caagcagcgc gagagcaccg tgctgacata 2040 ttgcttctgt ctgatatgca caggccgcct gagaacaacg ggagatgggc gtatgatgta 2100 tccaagaaag tagcggtagt agccaccggc tcgtacccta tacagcgagt gtggggcagc 2160 gagctacgcg gactagtcgc tgctcaggta gccggtatca cattcatcag tgtatacgcc 2220 cctccaagcc tatcagcaca tgagtttgag cgactattgg gggccattga gttggaagcc 2280 gcgtctcatt cccgcgtcgt agttgcgggg gacttcaatg cctggcacga agagtggggc 2340 agcgggagaa acgggcagcg tgggatcgag ctactgcaaa ctgtggaaac actgggactg 2400 tcgatcctca atcaaggtcg caaaccaacc ttcatcggac gaggtttcgc ggctagtagt 2460 gtcattgatg tctcgtttgc gagtcgggag attgttcgcc ccgacacctg gtcagtgatc 2520 cccttctcga ggtcggatca tgaattaata gcgttcgagg tcaaacaacc agatgagaac 2580 ccggccgggg cgcaacagca cctgtcgcac cgaccacaaa ggtcgacacg caagaatcca 2640 gcaggccgac agcatgaccg ttgcgatagt aggcgatgga aaacgaccca attcaaccga 2700 cagtcatttc gagtagcact acgcgccaac aacttccagg agagagcggt gagccatatt 2760 ggcatgatcg aggcacttgt cgacgcctgc agtgaaacta tgcaacggat ctctgggctg 2820 cacaagagcc cacatcacga catgtattgg tggaccccag cgatcaagcg cctacgtgat 2880 gattgccttg ccgcgcgaga gagagtacaa ctgtctcacg atttggagga gaggagcatg 2940 gcggcagcag ctcaccgtac agcgaaaagt ctgctcgaga aggccatccg taccagcaag 3000 cgccaacagt tccaagagct gatagacatt gccgagacca acgattacgg aaccggttat 3060 agggtggtga tgtcccgact gcggggtagc cctgggccgc ctgaaacgga tcgcgccgaa 3120 ctggagagga ttgtctctga cctgttcccc acgcacccac ccgtatcatg gcctgtctct 3180 tcagatgctc caacgaccgt cccatcagac agcagagtag aaccgcaaga actactatct 3240 attgctgccg ggatggtaac gaggaaggcc ccaggtttgg acgggattcc gaacgctgcg 3300 gtgaaagtgg cgatagagga atacacggag gagttttgtc gcctgtacca ggactgtctc 3360 tctcgcggca ccttcccgcc gcagtggaaa 
agacagcgac tggtactact cccgaagcct 3420 ggcaaacccc caggagagag cagctcgtat cgaccgctgt gcatgcttga cgcacttgga 3480 aaggtactgg agcggctcat tctcaatcgc ctgaatcgtc acatcgagca acaagactca 3540 ccgcagctgt ctgatgccca gtatggattc cgccgaggac gttctaccat cagcgcgatc 3600 cagcgtgttg ttgacgcggg cacagcggcc aagttgttcc gccgcacgaa cacccgcgat 3660 aaacgctgcc tgatggtggt ggcactggac atccgcaatg cattcaacac tgccaactgg 3720 caggcaatcg ccgacgcgct gcaaagtaag aatgtcccgt catacctgat gaagatcata 3780 ggagcctact ttgaaggacg caagctgctg tttgacacta gcgaaggccc tgtcgaacgt 3840 cacatcagcg caggagttcc acaggggtcc atactgggcc ctacactgtg gaatatgatg 3900 tatgacgggg ttttcggagt tgggctgccg ctgggggcag agatcattgg ctatgctgat 3960 gacctggtgc tattagtccc aggcacaact ccgacaacag cagcagcagc agcggaggaa 4020 gcagtggcag cagtgaaaca gtggcttctc gagcaccgct tggaactggc tcattctaag 4080 acggagatga cggtgatctc tagcctcaag cagcctccag aggaagtttt tatcactttc 4140 ggcggagtga acgtgccgtt ctcgcgctcc ataaagtact tgggggtgcg tgtacatgcc 4200 cacctatcat gggtacccca cgttaaggag ataactctga aggccacgcg gattgtgcac 4260 gccgtcaatc gactcatgcc aaacctccat gggccaagga cctcgatgtc tcgcttgctg 4320 gcaaatgtgg ccgactcgac catgcgctac gcagcacctg tatggcacga agcgattggc 4380 aatcaagagt gctgcagatt acttcgtcgg gtgcaacgta agtcggcaat tggcttggcc 4440 agaacgttcc gaacggttcg ttacgagact gcagtgttgc ttgcgggact cttgccgatc 4500 tgcctggcaa tcaaggagga cacccgagtg cacagtcgcc gtggaacagg tttaaactgt 4560 gcaatacgga gggaggagcg ccagcggtcc atggaagagt ggcaagcaac gtggggcgca 4620 gacgccgcca acgaaagagc cagcagatac gtcagatggg cacaccgcgt aattccggac 4680 gtgggatcct ggcagtcacg gaaacacgga gacgtcacgt tccacctatc ccaggttctt 4740 tccggccacg gttttttccg ggagcacctg tgcggtatgc agctcacgtc gtccccggac 4800 tgtacgcgat gccccggcgt tgcggagagc gcagaacatg ctatgtttga gtgtccgcga 4860 ttcgactcga cccgaacaga gctgctgcac ggagtcgtcc cggaaacgct gcttgaacac 4920 atgctccaga gcccagagaa ctggagtaat gtatgtgagg ccaccaagcg gataacatca 4980 gcgctacagc aggattggga cgaaacccgg cgagagctgg ccgaacaagg cgccccacgt 5040 gtggccgata atcaacataa tcaagataat gaccgcacct 
cgttgtacag tgcgaggaac 5100 accagcgaag agcaacgggg tagacgtcac ccaacaccat cacctccacc cagagcggta 5160 gggaggcggg cagaagtccg atctcttggg gagcgctatc gcaggcagct actggtagtt 5220 gaagagcgtc ggcaaatatc cggccacagt gtcggtggag ttacctcgca ggaggatgga 5280 acattggtcg aggcgtctcg acaaggcatg aacggtgcag aaaagactgc agcgactgag 5340 gcagacgtgg catcacgcta gatgatgttg gtgatctgtg gaggatcttg ggaggaccaa 5400 agtatggtca aaagggcgat ctccgcatcc tgagcggcag ggaggacaaa acgaatatga 5460 aagggaaaga aggggggaaa taaacatgtg ttaggtgcct ggcgcacggg aaagagaggc 5520 tccgaggagc agtaaaagcc ctccctcata gacccctcgc ggggcaagag ggaagggagt 5580 gggcgaggac aggggatata atgtgtaaat caataagaac aataaaacct gtccgtgatc 5640 taaaaaaaaa aaaaaaaaaa 5660 // ID GYPSY56-I_AG repbase; DNA; ANG; 4726 BP. XX AC . XX DT 09-JUL-2004 (Rel. 9.06, Created) DT 09-JUL-2004 (Rel. 9.06, Last updated, Version 1) XX DE GYPSY56-I_AG is an internal portion of retrotransposon GYPSY56_AG DE - a consensus sequence. XX KW Gypsy; LTR Retrotransposon; Transposable Element; 5-bp TSD gag; KW AP protease; GYPSY56-I_AG; GYPSY56-LTR_AG; Gypsy clade; RNase-H; KW integrase GYPSY56_AG; mag lineage; reverse transcriptase. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-4726 RA Tubio M.J., Costas C.J. and Naveira F.H.; RT "GYPSY56_AG, a member of the mag lineage of the Ty3/gypsy group RT of LTR retrotransposons in Anopheles gambiae."; RL Repbase Reports 4(6), 151-151 (2004). XX DR [1] (Consensus) XX CC GYPSY56_AG is a family of Gypsy-like LTR retrotransposons that, CC according to the amino acid sequence of its reverse CC transcriptase, RNase and integrase domains is phylogenetically CC grouped with representatives of the mag lineage of other CC organisms. 
GYPSY19_AG, GYPSY20_AG, GYPSY21_AG, GYPSY22_AG, CC GYPSY23_AG, GYPSY24_AG, GYPSY25_AG, GYPSY26_AG, GYPSY27_AG, CC GYPSY28_AG, GYPSY55_AG, GYPSY57_AG, GYPSY58_AG, GYPSY59_AG, CC GYPSY60_AG, GYPSY61_AG, GYPSY62_AG, GYPSY63_AG, GYPSY64_AG, CC GYPSY65_AG, GYPSY66_AG, GYPSY67_AG, GYPSY68_AG and GYPSY69_AG, CC are other members of this same lineage in Anopheles gambiae. CC The GYPSY56-I_AG consensus was reconstructed after multiple CC alignment of 5 copies. The consensus encodes the 1504-aa CC GYPSY56_AGP gag-pol like polyprotein (pos. 202-4713). The CC sequence of the LTRs flanking GYPSY56-I is deposited as CC GYPSY56-LTR_AG. CC GYPSY56_AGP: CC MNPSGSESSEEEQRRLSQMFWQQSAAAVHQPASGSSSAFVLGSLDQQARGQVASVQNPPSLVTA CC PAQNAGVPPSLSAAPGQNATISPSHNTGQTLSSQPQNASSPDLMQMILQLQIQMNQLMMQKQHQ CC PERQSVSVPVLNPEQILDSLSSHVREFQYDEEAKITFAAWYARYEDFFAKDASRLEDEAKVRLL CC VRKLGAAEYNRYANFILPRYTRDFSFQETINKLTALFGNRESLLSRRYKCLNTTKARGEDYLSY CC ACRVNRVCVDFEIGKISEEQLKCLVFVCGIKNEEDVEIRTRLLSRIEDRTDITLESLTADCQRI CC INLKEDSAMIEKVHEGNVYAVKKDVKQQQGFQHRQRSKQQLKFYRSCWSCGGQHLQRDCTRKRI CC VCNVCGKQGHIGKNCFHRRKPFGFRRRNDSADLRVVRVNAMSVCGRRKFVPVKLFGMDVRLQLD CC TASDITVIDRQLWKRIGSPCLRPVTIRAKTASGARLQLNGEFRAQVGIKEKSLVATIRVADANL CC RLLGTDLIDRFNLWSVPMDRFCCYISENQERPAESVMKIEQIKEQFPTVFRNTLGLCTKANIKL CC KLREGVAPIFRPKRPVAHAMESAVDDELNRLEKLGVITPTNFSEWAAPIVVVRKANGQIRICGD CC YSTGLNSSLCAHEYPLPLPDDIFAKLSGCHIFSKIDLSDAFLQVQIDEDYRHILTINTHRGLYY CC YNRLPPGIKVAPAAFQQLIDTMLAGTKRVCGYMDDLIIGGATEAEHDRHLKDVLKRIEEYGFTI CC RNEKCEFKTNEIRYLGHVIDSQGLRPDPNKIAAIRNLPEPTNLTEIRSFLGAINYYGKFVPNMR CC QLRFRLDELLKHGERFQWDHKCKEAFDQFKKILSSNLLLAHYNPNEKIVVSADASSVGLGATLS CC HRYADGSMKVVQHASRALTEAEKRYSQIDREGLAIVYAVTKFHRMLYGRHFLLQTDHRPLLQIF CC GSKKGIPIYTANRLQRFALTLQLYDFEIEYVRTEQFGNADILSRLIKNRSQPEDDYVIASIEME CC RDVKAMAIEAMSNFPISFREIERHTSADAMLRKVRHYIQDGWPTSVSYGNELACLNSRKDALTL CC IDGCILFGERVVIPQRLRERCLKQLHQGHPGMQRMKSVARSYIYWPFLDRDIMEYVRTCSSCAT CC AAKSPPHESPQAWGKTNTPWERIHVDYAGPIDGEYFLIVVDAHSKWPEIFKTPSSTSTATITML CC 
RGLFARFGMPVTLVTDNGTQFTSEAFSDFCLKNGVHHMRTAPFHPQSNGQAERFVDSFKRALRK CC IRIGGMALQEAIDIFLQTYRTTPNPQVEQNKSPAELMFGRHIRTCFELLRPPPRLENQATYNST CC RLFKQSDLVFIKEYSRNNWKWIPAVIVRRIGHVMYLVQTNDRRTRRCHMNQIRQRFYESTENRS CC LASIPLSVLLESWNLPLPPQTTINPSETETPVACTSIPQTSTTCMLQPESASQPTIASSRSTQH CC TPCHREAQQPLAPRRSSRNRTVPRRFAPYRMN. XX SQ Sequence 4726 BP; 1514 A; 870 C; 1147 G; 1195 T; 0 other; gtggcgacga gtgataaaaa agcgatgtta aaagaaaatt gtgtgtgtat gtgaacatta 60 acattattcc gtgataaaat tggtgtttaa gacaaatata gcagcgagtg aattgtacga 120 gaaaaaaaaa gaactatcag tgaaagaagt attgagtgaa gaaaagtgga caaaattccg 180 tctaaacgca agcgcaggag aatgaatcca agtggcagcg aatcatcgga agaggagcaa 240 cgacggcttt cgcaaatgtt ttggcagcag agcgcagcag ccgtacacca gcctgcaagc 300 ggatcgtcca gtgcgttcgt gttggggtca ttggaccagc aagcgcgcgg tcaagtggcg 360 tcagtgcaga atccgccatc gcttgtaaca gcgccggcac aaaacgcggg ggtgccgcca 420 tcgctgagtg ctgctcccgg gcagaatgcg acgatatcgc catcgcataa tacgggacaa 480 acgctttctt cccaaccaca aaatgcttcg tcacccgatc taatgcagat gattttacaa 540 ttgcagatac aaatgaacca gttgatgatg cagaaacaac accagccaga gcgtcaatct 600 gtgtccgttc cagtgcttaa tccggagcaa atactggatt cgttaagttc ccatgtaaga 660 gaatttcagt atgacgagga agctaaaatt acttttgcgg cttggtatgc acgatatgaa 720 gattttttcg caaaagatgc atcgaggttg gaagatgaag cgaaagtgcg tttgttagtc 780 agaaagttag gtgctgcgga gtataatcgt tacgctaatt ttatattgcc gaggtatacg 840 cgagatttca gtttccaaga aacgataaat aagctaacgg ctctgtttgg taatagggaa 900 tctctattaa gcagaaggta taagtgcctc aatacaacaa aggctcgtgg ggaggattat 960 ttgtcgtatg cgtgtcgtgt aaaccgtgtg tgtgtagatt ttgaaatcgg caagataagt 1020 gaggaacaat taaaatgcct agtgtttgta tgtggaatta aaaatgaaga agatgtggaa 1080 ataaggacac gtttgctatc gcgtatcgaa gatcgtacag atataacgct agaaagcttg 1140 accgccgatt gtcagcgtat tatcaattta aaagaggaca gtgctatgat agaaaaagtt 1200 cacgagggta atgtttatgc cgtaaaaaaa gatgttaaac agcaacaggg gttccagcac 1260 aggcaacgaa gcaagcaaca gttaaagttt taccgatcat gctggtcatg cggaggacag 1320 catttacaac gtgattgtac tagaaagcgt attgtgtgca acgtgtgtgg taaacaaggg 1380 cacataggga 
aaaattgctt tcaccgaaga aaaccgtttg gattccgtcg aaggaatgat 1440 agtgcagatc tacgtgtggt tcgagtaaat gctatgagcg tatgtggtcg gcggaagttc 1500 gtgcctgtaa aactgtttgg tatggatgtg aggttacaat tagatacagc ttcggatatt 1560 acggttatag atcgtcagtt atggaaaaga attgggagtc catgtttaag gccagtgaca 1620 ataagagcca aaacagcatc tggtgctagg ttgcagctaa acggagaatt tagagcacaa 1680 gttgggatta aagaaaaaag tctagtagca acaatacgag tggctgatgc gaatttacga 1740 ttgcttggta cagatttaat cgatcgtttt aatctgtggt ctgtaccgat ggatagattt 1800 tgctgttata tttcggaaaa tcaagagaga ccagcggaaa gtgtaatgaa gatagagcaa 1860 attaaagagc aatttccgac agtttttcgc aacacattgg gattatgtac taaggcaaat 1920 ataaagctga agttaagaga aggtgttgca cctattttcc gtcccaagcg gcctgtagct 1980 catgccatgg agagtgccgt agatgatgag ttgaatagat tagaaaaact tggagttata 2040 acacctacaa acttttctga atgggcggct cccatagtag tagtaaggaa agcaaatggg 2100 caaattagaa tatgtggcga ttattcaaca gggttgaatt catctttatg tgctcatgaa 2160 tacccgcttc cattgcccga tgacatattc gcaaagctat caggatgcca catatttagc 2220 aagatagatt tatctgatgc attccttcaa gttcagatag acgaagatta ccgacatatt 2280 cttacaatca atacgcatcg tggcctatac tactataatc gcctcccacc aggcataaag 2340 gttgctccag cagcttttca gcaactgatc gacacaatgc tagctggaac taagagagtg 2400 tgtgggtata tggatgattt aatcatcgga ggcgcaaccg aagcagaaca tgataggcat 2460 ttaaaggatg tgttaaagcg aatcgaggaa tatggattta ccatcagaaa cgaaaaatgc 2520 gaatttaaaa cgaatgaaat aagatacctg gggcacgtaa ttgatagtca aggtttaaga 2580 ccagatccaa ataagattgc agcgataaga aatttaccgg agcctacaaa cctcactgag 2640 attagatcgt ttctgggagc cataaattat tatggaaaat tcgtgccgaa tatgcgacag 2700 ctccgatttc gcttggatga attattaaag cacggagagc gtttccagtg ggatcataaa 2760 tgtaaagaag cgtttgatca gtttaagaag atactttcat caaatttgct cctcgcgcac 2820 tataacccta atgaaaagat agtggtatca gctgatgcat cctctgttgg gttaggagct 2880 acattgagcc ataggtatgc ggatggtagt atgaaagtgg ttcagcacgc ttcgagagcg 2940 ttaacagagg cagagaaacg atatagccaa attgaccgtg aagggttggc aatagtttat 3000 gcagtaacaa aattccacag aatgttatat ggacggcatt ttttactgca gacggatcat 3060 cgaccgctat tgcaaatttt 
tggctctaag aaaggcatac cgatttatac agcaaataga 3120 ttgcaaagat ttgcattaac tctgcaatta tacgattttg agatcgaata cgttagaact 3180 gagcagtttg gcaacgcgga tatattatcc agattgataa aaaaccgttc gcaaccagag 3240 gatgattatg ttatcgccag tatagagatg gaaagagatg taaaggctat ggcgatagag 3300 gccatgagta attttcctat aagcttcagg gaaatagaaa ggcatacatc ggcagacgcg 3360 atgctacgta aagtgcgaca ttatatacaa gatggttggc caacaagcgt gtcttacggt 3420 aatgagctag catgtcttaa tagcagaaag gatgctttaa cgcttattga tggttgtatt 3480 cttttcgggg agagagtggt gataccacaa agattgcgag agcgttgttt gaagcaattg 3540 caccagggtc atccaggtat gcaaagaatg aagtcagttg cacgaagtta tatatattgg 3600 ccattcttgg atagagatat aatggagtat gtcagaacat gttcatcgtg tgcaactgcc 3660 gcaaaatcac caccacatga atctccacag gcttggggta aaactaacac tccatgggag 3720 cgcatacacg tagattatgc aggacccatt gatggggaat attttctgat tgtagttgat 3780 gcacactcta agtggccgga aatatttaag acaccaagta gtacatcaac agctacgata 3840 actatgttga gaggattatt cgcccggttt ggcatgccgg tgactttagt aacagataat 3900 ggtacgcaat ttaccagcga agcattcagt gatttttgtt tgaaaaatgg tgtgcatcat 3960 atgagaacag cgccgttcca ccctcagtcc aacggccagg ccgagcgttt tgtggattcg 4020 tttaaaaggg cattgcggaa gataagaatt ggtggcatgg cgttacaaga ggcaatcgac 4080 atattcctgc aaacgtatag aacaactcca aatcctcagg tagagcagaa caaatcaccc 4140 gcagagctca tgtttggaag acatattaga acctgttttg agttactgcg gccgccacca 4200 agactagaga accaagccac atacaacagc acgagattat tcaaacaaag cgatctagta 4260 tttatcaaag agtatagccg aaacaattgg aaatggatac ctgcggttat agttagaagg 4320 attggccacg taatgtatct agtgcaaaca aatgatcggc gtactaggag gtgtcatatg 4380 aaccagataa gacaacgatt ttatgaaagc actgaaaaca ggtcattagc tagcattcct 4440 ctaagtgttt tgttggaatc gtggaattta ccgttaccgc cgcaaaccac gatcaatcct 4500 agcgagacag agacacccgt tgcatgtacg tctatacctc agacttctac gacatgcatg 4560 ttacagccag aaagcgcgtc acagccaaca atagcatcat caaggtcgac gcagcatacc 4620 ccttgtcatc gcgaagccca acagccactt gcacctcgtc gctcttctag aaatagaacg 4680 gtaccacgaa ggtttgcacc ctatcgtatg aactaaaagg gggaga 4726 // ID GYPSY2-LTR_AG repbase; DNA; ANG; 368 BP. 
XX AC . XX DT 08-MAY-2003 (Rel. 8.04, Created) DT 20-SEP-2005 (Rel. 10.1, Last updated, Version 2) XX DE GYPSY2-LTR_AG is an LTR of the GYPSY2_AG LTR retrotransposon - a DE consensus sequence. XX KW Gypsy; LTR Retrotransposon; Transposable Element; 4-bp TSD; KW GYPSY2-I_AG; GYPSY2-LTR_AG; GYPSY2_AG; Gypsy clade; GYPSY51-I_AG; KW GYPSY51-LTR_AG. XX NM GYPSY2-LTR_AG. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-368 RA Kapitonov V.V., Pavlicek A. and Jurka J.; RT "GYPSY2_AG, a family of LTR retrotransposons from African malaria RT mosquito."; RL Repbase Reports 3(4), 76-76 (2003). XX DR [1] (Consensus) XX CC GYPSY2-LTR is a long terminal repeat of GYPSY2_AG (its internal CC portion is deposited as GYPSY2-I_AG). XX SQ Sequence 368 BP; 97 A; 115 C; 81 G; 75 T; 0 other; tgtagcgacc agaccgccat ctggcgtgag aatcgtgagc gatcgtgaca tccagggaca 60 aagacacgga tctccgtgga gcgcactcac caggcacacg aatgtgtcaa agtgacgtgc 120 gccgcgtgtc cagcgctaca cttcctgctg tcagcgagca cattctctct cttgcgaccc 180 aacctcgaaa gcgaacagac ctcttccttc gctccgcgct cgaaacttcc tcaacgtgaa 240 accctctcgc acgaacgtgc gcaaccgttc ataatatagt gtaaaataaa gttccgtatt 300 acctactcac gaaacccaac gcgttcgcga cataaaaaat tagggaccac ttttgtggcg 360 cccctaaa 368 // ID hAT-2_AG repbase; DNA; ANG; 4803 BP. XX AC . XX DT 24-MAR-2005 (Rel. 10.02, Created) DT 29-MAR-2005 (Rel. 10.02, Last updated, Version 1) XX DE hAT-2_AG is a hAT-like autonomous DNA transposon - a consensus DE sequence. XX KW hAT; DNA transposon; Transposable Element; 8-bp TSD; KW Autonomous DNA transposon; hAT superfamily; hAT-2N_AG; hAT-2_AG; KW transposase. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-4803 RA Kapitonov V.V. 
and Jurka J.; RT "hAT-2_AG: a family of autonomous hAT DNA transposons from RT African malaria mosquito."; RL Repbase Reports 5(2), 46-46 (2005). XX DR [1] (Consensus) XX SQ Sequence 4803 BP; 1617 A; 890 C; 830 G; 1466 T; 0 other; taggcatggg ctaattgtgc gagacctatg ccgcacactc agttcactca agtgtgcggc 60 tgaccaccgc acgcgaaaaa gtgtacgggt aggtcttccg cacgcataag gaatgacttc 120 ggttgtgact ggttgaacga aaaaaaaaac attaatttta gtcgactgtg ttcaacagtc 180 aaataaaaaa tcgatcacac tctgagcgag aaagaatacg caaattgtta gcgtatttag 240 ttgggtcagt cgcactaaaa gaactcctgt ccaagactga catggaacga gaaatacact 300 catacccgtt gggtcagtcg cacactgccg cacgcgtgtt cagtttggtc agtcgcacac 360 acagctgcac aaaaagaaat tctgtcgcac actgccgcac gtttgctcag ttcgttcatt 420 cgcacacagc cgcacaaaaa gaacttctgg cgcacactgc cgcacgcgtg ttcagttggg 480 tcagtcgcac acagccgcac aaaaagaact tctgccgcac actgccgcac gtttgctcaa 540 ttcgttcatt cgcacacagc cgcacaaaaa gaacttctgc cgcacgctgc cgcacgtgag 600 gtcagttggg tcagtcgcac acagccgcac aaaaagaact tctgccgcac actgccgcac 660 gtttgctcaa ttcgttcatt cgcacacagc cgcacaaaaa gaacttctgc cgcacgctgc 720 cgcacgtgag gtcagttggg tcagtcgcac acagccgcac aaaaataact cacactgact 780 acgctagtaa tatgctcgtt ctctctgcgc tgtcggatca gtcgcacaca gtcatacaaa 840 gaactgctac cgcacactgt cctgatgaaa cacagggttt gatgtctttt tggttgtttc 900 ttcatttttt ggatttatct cacattttat tgcatgcttc ccaaattatt ttggactcat 960 tccacaattt attatatgtt tctcatataa ttttgtattc gttttgtgat ttattcttac 1020 catcccagtt gttttgaaac aaataattga acaaatttta tgaaacgtta aataatcgtg 1080 atccataaac cctgtaatag gagacaaaat tggtatcaat agcatcttct tccttctcat 1140 tttctgctga ttgtttttta cgtagcacca gacattctta gcgtggggaa ctgtgtggta 1200 ttgatatgta tcgaatgtaa gagcaaggag ggaaaaacgt acactaggta ttcgctagtt 1260 gaatgatttg aaatgtggat cattttttta agttgcatgg tcgatgtgag atttataaat 1320 cgatgatttt taatcttgaa atattcacca catgcaaaat caaaaaatct ccaaatcttt 1380 atattctttt gctttttatc atttaaaaaa atatgccaat tcaaaatgca atacatgtca 1440 cacattacaa tttcaattag cgagctcaaa ctgtaccatc ccacgatgtt ttattcgaac 1500 accattacta 
gtgaggcatg cttttcgcac actaagttga tgagtgcaat atgtatataa 1560 aagaacgtca agtattcgga aagtggctta cccattatga atgagtccct cgtttgttgg 1620 tgttgtatga ataatattgc agtaggtaac aaagtgaaat agtattgaag tgcgtttccg 1680 gtatcagagt tttagctttg cattgtcgtt gcgattccta ttttggttgt gtatcgaatc 1740 agtcgcatta aaggaaagta cccagtttga aaattcataa cgcgtaaata ctttttctgt 1800 ggaagaatcg gcagggataa cctgataact aattaaaggt aagttgcttt attattttta 1860 acttttaaca agatgagaaa aaatgacaaa tcgtcattta ttcttttttt tatttcataa 1920 aaaaaatcct agttcagtgg tcgagaaagt acggcaatcc gtttacgatt tttgccagca 1980 ttttcactcc ttaacagccc cttgctattg taggagtatt caaaatctcc cacagtttca 2040 cgccttcatc tgaaaatccg gtttggaaat ttactgacat ttgtatagaa tttagctgtc 2100 tatggcgtca agttaaccga cgtttttttt ctagttcagt ggtcagcaaa ctttgtggtg 2160 aaaagagcca aattctatga aaatgttggt agatttcgat atgaagagcc aaatttgaaa 2220 accaagcata caattgagtt aaccactgtc ctcatttgga ccttccttcc acaaatgaca 2280 ttcaaagcca aaactattca atgttttggt cacaccagtg acaactatga tcgactgact 2340 tttcatgatg actgagtaac ttttaatcga agtaatagat ttaatcatta tagtaaaatt 2400 tacatcatca attaaaaata atgatctaca atttatcaat aattaaagtt tcctcaaaat 2460 accaaagcac agtgatagta ttacgtttat tttatttgaa taatattgct ataattataa 2520 tttacaacaa ttgctgacgg gactggaact ataccaaaac gttctttaat taaaaaaaaa 2580 ctgttcgttt tagcagtcat actgttaaag ccatttaggc tgttagataa ttgaacaaat 2640 atttaaaatt ttatttttga ttaatgttgt aatatgtttt agttaagatg gcttctaatg 2700 tcgctacgtc cagctcgatt tgggatcatt ttacaaggaa cggaacaaaa gcgaaatgtc 2760 gttactgcta caacgaaata gcatttacga aaggatctac ctccaaccta aagcgtcata 2820 tgaaagcaaa acatccatca atatcacttg aaaggcaaca tctcgcatca actacatcac 2880 catcaacaca aaatagtgaa ccaggtccat cgtgcagttc gcgccctcct aatttaattt 2940 caaatttttt caaacaaccc atgaatagtg aaacaaaacg taaattagat atgatgcttt 3000 taaaattaat ttgcaaagat tgcctcccat tgtcaatagt agagagcgaa gcattcaaaa 3060 catttgttgg ttgtttgaat caaaattatg atttgcctac ccgtaagaat gtatcaaatg 3120 ctctattacc cagtatttac aacgaaatat tggttaaagt gcaaggagaa gtgagaaatg 3180 caacttctat agcattgact 
actgacggtt ggacaaatgt aaataataca agttttttag 3240 gtttaactgc acattttata gataatgatt acaaactacg ttcgtgcctt ttagaatgtt 3300 ccgagatatc actatcacat agtggccaaa atatagcggc ttggatcaag gaagttataa 3360 ttaaatacga aattcaggac aaaatagttg gtattgtaac cgataatgcg gccaacatga 3420 aaagcgcggc tagagaatta gaatttaatc acgttacatg ttttgcccat agtttacatt 3480 taattgttaa agatgccatt aaaaaatcaa tcatatctac cgttgatgaa gttaaaagga 3540 tagttatgta ttttaagaag agtccaaaag ccactaatga attagctgat acccagtcaa 3600 aatttaattt accaaattta aaattaaagc aggatgttcc aacgcgatgg aattcaactt 3660 atgatatgct aaacaggttt tataagaata aaatagctat tgttgcttgt gcagacaaat 3720 taaacacctt ggatcctatt aaaatagact gggcaatttt agaacattca ttgaacgctt 3780 taaaaatatt cgatgttgct actaacatgg tttcagcaga aaaaaatatt actgtttctc 3840 acgtaggcct actaagtaag atgataatac gcaaattaaa cgaaactgat tatactattc 3900 cggaattagg caatctcgtg acacatctta aagaaggcgt aaaaaaacgt ttggaaattt 3960 attccaataa tcaaatcata gcaaaatcta tgttattgga tcctaggata aaaaagcaag 4020 gtttccatga agaaccctta aaatatcggg aaacatacga acttattata caagagttga 4080 ttccgtttca aactccctca atgaatgcac aagaaccaca caatatagac aatgaggcaa 4140 acttgttact tggtgagttt ataacatggg ttaataacgc agaatgcgag attgaaagcc 4200 ctatagaatt agcaaaaaat gagctaaata gctttcttaa aataagaaac atagacatca 4260 aaaatgatcc tttggaatgg tggagaattc atagttcaaa atatccgtca atttatgcac 4320 tcgccaaaac aattatttgt attccgggaa catctgtacc atgtgaaaga ttattttcaa 4380 aagctggaca gatttattca gataagcgtt ctagattaca tcctaaaaaa tttaaagaaa 4440 ttatttttat tcagcaaaat gtagataaat tttaaaaaat tatttagttt ttggattgaa 4500 tttaggtaag cctaagaatg aaataaattg ttagtgtgta tgatatattt tttataactt 4560 ttctcttaaa ctaaaaattt ataaattaga attgtaacac catggactaa acatctcaac 4620 aaatacataa aagtaaagtt ttgttcgagc atccgcatta tttatttaaa ataaaagaca 4680 gttcaaaatt taattctttg tgtgaccgac agtgcgactg acctaccgca cgcgttcagt 4740 tcctcatact gaacgaacta caccgcacgc gttcactgaa aagaatcaaa ttgcccaagc 4800 cta 4803 // ID CR1-1a_AG repbase; DNA; ANG; 5247 BP. XX AC . XX DT 12-MAR-2003 (Rel. 
8.02, Created) DT 29-OCT-2010 (Rel. 15.11, Last updated, Version 3) XX DE CR1-1a_AG is a CR1-like non-LTR retrotransposon - a consensus DE sequence. XX KW L2B; Non-LTR Retrotransposon; Transposable Element; KW reverse transcriptase; endonuclease; CR1 clade; DNA/RNA-binding; KW CR1-1_AG; CR1-1a_AG. XX NM CR1-1a_AG. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-5247 RA Kapitonov V.V. and Jurka J.; RT "CR1-1a_AG, a subfamily of CR1-like non-LTR retrotransposons from RT African malaria mosquito."; RL Repbase Reports 3(2), 13-13 (2003). XX DR [1] (Consensus) XX CC CR1-1a_AG is a subfamily of CR1-1_AG non-LTR retrotransposons. CC The CR1-1a_AG consensus sequence was reconstructed based on CC multiple alignment of ~10 copies identified in the CC sequenced portion of the genome. Given the ~2% divergence CC of these copies from the consensus sequence, transposition of CC CR1-1a_AG occurred less than 1 million years ago. CC The CR1-1a_AG and CR1-1_AG consensus sequences are 79% identical CC to each other. The 3' terminus of CR1-1a_AG is composed of the CC CAT microsatellite. CC CR1-1a_AG encodes two proteins: a 444-aa CR1-1a_AG-ORF1p CC (positions 441-1772) and a 932-aa CR1-1a_AG-ORF2p (positions CC 1776-4571). CR1-1a_AG-ORF1p is a DNA/RNA-binding protein composed CC of the PHD domain (aa positions 3-38) and gag-like zinc knuckle CC regions (aa positions 340-444). CR1-1a_AG-ORF2p is composed of CC the AP endonuclease and reverse transcriptase domains. 
XX FH Key Location/Qualifiers FT CDS 441..1772 FT /product="CR1-1a_AG-ORF1p" FT /translation="MECLACSAVVLINDDPILCAGKCGGNFHRRCVTPSLS FT KTAAKIINENKNVLYMCDRCLEHKSGLAGMDVDVSGSYDLLTQSIKNLESN FT VSVWISSALEKGIETLKTELCAQVERKLEQTLRESLSAVECSNKAKEALRA FT TFDDTKARETVEDESWATVTKKRKRTNSGNSNVQTIINRFDTGNVNSTPKI FT SDKVTGPILANKNKNSKTLVIVPKVGQSCDKTRADLRAKLDPRKQQVSEFR FT NGKDGQVFVQCSAQVKLDELRKEVENILGDEYATDLPLSRVKIIGMSEKYT FT DSHLVDLLKSQNEGIPWKQVKVIGMFENKIYKYQKYNAVLEIDYESDLCLA FT KLGKINVGFDRCKISKSVHVMRCYGCGQFNHKSTECKNKQACSKCGDEHKT FT SECTSSSLKCVNCVLANSVRNLKLDVRHAANDYNCPMFKKQIERRMQLSQ" FT CDS 1776..4571 FT /product="CR1-1a_AG-ORF2p" FT /translation="QVGEELGRFREILYFNVAGLSSNYAMFRETVEKVQPL FT LVLISETHVIEEEAFQQFHINGYRVVSCLSHSRHTGGVAVYARSEIVLKVI FT FNESLEGNWFLGVAVSKGMTAGNYGILYHSPSASDSRFVDILEEWLDRFLN FT FSKLNIIVGDFNIDWLNVEKSAKLKSLMDSVNMKQKVNEFTRIARQSRTLI FT DQVYSSTDSIKVTTDPLLKISDHETLVLNINVKRCETIQRKFKCWNRYSKF FT ALCNHVSQGLRQEAPSFNEAADLLWNTLKKAMGTLVEEKTILSRETSRWYT FT LELSRAKRKRDEAYQHFIRSNSGYDWTEYTRLRNTYSRNLKTTRRNYFSGE FT ISKHKGNSKELWKVLKSLLRPEESRVSVVKFDGLVESEDSIICQKFNLFFV FT NSVLDINQNIADARAPDSLFCNDAPGNQFKFQKITKEKLKTICFSLSKTAG FT IGNVNCNTIQDCFHVVGESLLIVINQSLEEGIFPETWKESLVIPIPKVSGA FT ASAEEFRPINMLHVLEKVLELVVKEQLVQFLTRNDLLISEQSGYRQGHSCE FT TALNLVLARWKVLMDRKESIVAVFLDLKRAFETISRPLLLQTLRRFGIVGK FT ELNWFENYLKDRTQRTVFGNSISEPIENTLGVPQGSVLGPILFIMYINDMK FT QVLKSCEINLFADDTVLFISHKDIKQAESLINFDLNALDGWLRYKKLALNV FT KKTSYMVMTAGVLDSPPSIVINKEPIERVRQVKYLGVILDDRLKFNTHIDW FT VIAKVASKCGVISRLAKDLDFFGIVNLYKSLISPHFDFCSSILFLGNKGQI FT KRLQRLQNRIMRLALGCGRRTSSFVMLDILQWMSVEQRIVYQTMTFIFKLL FT GGLLPGYLGERIVRGSDVHRHCTRRANEPRVPNLISHGARNSLFFKGIQLY FT NRLPGEIKNASNLPDFKRRCAAYVKQTV" XX SQ Sequence 5247 BP; 1575 A; 847 C; 1255 G; 1570 T; 0 other; tcaacccttg aggtgaatgg tgacaagtgc agtcctgtga atgtgtagct cccgtttgtg 60 aactgtgaaa acttgtgaaa tagagagcta agtgttattt tgttttgttc tacttgttac 120 ttaatgttag tgctacctgc ggtatcattg tttagtattt gtattattgt tttgtgaatg 180 taaaacacgg gtgaagtatc tcggtcgaca gcgtcactgc gttgtcgaaa gtgattaatc 240 tttgctcggt 
atgtgtttgt gttcgtgcta gacatacgta tcgttgtccg tttgtcaaga 300 cgtgtacacg agtgtaagcg tttgatatcg caggctgtag tagagtagtg tttgtgtgca 360 tttgctgtta cctttttttc atgtctgtgt gataattttt ttaattctaa aacagctcgc 420 ccgaattttg tttttacggc atggagtgtt tagcatgctc cgccgtagtt ttaataaacg 480 atgacccaat tttgtgtgcg gggaaatgtg gtggcaactt tcatcgtcgc tgtgttaccc 540 cctcgctctc gaaaacagcg gccaaaataa ttaatgagaa taagaatgtg ctgtatatgt 600 gtgatagatg tttagaacac aaatcgggct tggcgggtat ggatgtagat gtgagtggat 660 cgtacgattt actcacacaa tccataaaaa atttggagtc gaatgtgagt gtatggatat 720 cgagtgcctt ggaaaaggga atcgagactc taaaaactga gctctgcgca caagtggaac 780 gcaaattgga gcaaacattg cgtgaaagct tgagtgcggt agaatgctca aataaggcaa 840 aggaagcctt gcgtgcaacc tttgacgata ctaaggccag agaaacagta gaggatgaaa 900 gttgggctac agtgactaag aaaagaaaaa ggacgaatag tgggaacagt aatgttcaaa 960 ccattattaa tcgttttgac acggggaatg ttaattcgac gcccaagatt tctgacaagg 1020 ttactggacc cattttagca aacaaaaata agaatagtaa gactctggtt attgtaccaa 1080 aggtgggtca atcttgtgat aagacaagag ctgaccttcg cgctaagctg gatccaagga 1140 agcagcaggt gtcggaattc cgtaacggca aggacggtca ggtgtttgtt caatgttctg 1200 ctcaggttaa attagatgaa ctcaggaaag aagtagaaaa cattttggga gatgagtatg 1260 caacggattt accattgtca cgtgtaaaga taattgggat gagcgaaaaa tacactgact 1320 cacatttagt agatctttta aaatctcaaa atgagggaat accctggaaa caggtcaagg 1380 tcataggaat gtttgaaaat aaaatttaca agtaccaaaa atataatgcg gtcttggaaa 1440 ttgattatga gtctgaccta tgtctggcaa aattaggaaa aataaatgtg ggatttgata 1500 ggtgtaaaat ttcgaagtcc gtgcatgtta tgaggtgtta tggttgtggt caatttaatc 1560 acaagagcac cgagtgcaag aataagcaag cttgttcaaa atgtggtgat gaacacaaaa 1620 cgtccgaatg tacttcatct tctttgaaat gtgtgaattg cgtgttagca aactctgtta 1680 gaaaccttaa actagatgta agacatgcgg ccaatgatta taattgtccg atgtttaaaa 1740 aacagataga aaggcgtatg caactttctc aatagcaggt aggggaggag ttaggacggt 1800 tcagagagat tttatatttc aatgtagccg gtctttcgtc caactacgcc atgtttcgtg 1860 agacagtaga gaaagttcaa ccgttgctgg tcttaatctc tgaaactcat gtgattgagg 1920 aagaagcatt tcagcagttt catattaatg 
gttatagggt tgtgtcgtgt ttgtcccact 1980 cacgtcatac aggaggtgta gctgtttatg ccaggagcga aattgtccta aaggtgattt 2040 ttaatgaatc attggagggt aattggtttc ttggtgttgc agtttctaag ggaatgacag 2100 caggcaatta cgggatattg tatcattcgc caagtgcaag tgactcaagg tttgttgaca 2160 ttttggaaga atggttagat aggttcttga attttagcaa acttaacatt atcgtcggtg 2220 atttcaatat tgactggtta aatgttgaaa aatctgcgaa gttgaaaagc ttaatggatt 2280 cagtaaacat gaaacaaaaa gttaacgaat tcacacgaat tgctaggcag agcagaacat 2340 tgatcgatca ggtttacagt agcacagact caatcaaggt cactactgat ccgttattaa 2400 aaatatcgga tcatgaaaca cttgttttga atataaacgt gaaacgttgt gaaacgattc 2460 aacgaaaatt taaatgctgg aataggtact cgaaatttgc tctttgcaat catgtgtcac 2520 aaggtttaag gcaagaagca ccgagcttca acgaagctgc agacttgtta tggaacacat 2580 tgaaaaaagc tatgggcacc ttggttgaag agaagacaat cttgtcaaga gagactagta 2640 ggtggtatac tttggaactt agccgtgcga aacggaaaag agacgaagca taccaacatt 2700 ttattagatc aaactcaggc tacgattgga ctgaatacac tagactaagg aatacataca 2760 gcaggaatct caaaactact cgtagaaatt actttagtgg tgagatatct aagcataaag 2820 gaaatagcaa ggagttgtgg aaagtgctta aaagtttact aaggccagag gaatcacgcg 2880 tttctgttgt aaaatttgat gggttggtag aatcggaaga ttcaataata tgccagaaat 2940 tcaatttgtt ttttgtaaac agtgttttag atataaatca aaacattgct gatgccagag 3000 cgcctgattc tttgttttgt aatgatgctc caggaaacca attcaaattt cagaaaatta 3060 caaaagagaa attaaaaact atttgtttca gcctgtccaa aacagcgggc atagggaatg 3120 ttaactgtaa tactattcag gattgcttcc atgtggtagg agagtctctt cttatagtga 3180 tcaatcagtc gctggaggag ggtatttttc cggagacttg gaaggaatca ttggtaatac 3240 ctattcctaa agtgagcgga gctgccagtg cggaagagtt tcgtcccatc aatatgttgc 3300 atgtactcga aaaggtgctg gaattggtgg ttaaggagca attggtccag tttctaactc 3360 gaaatgacct gttgattagt gaacaatcag gatatcgaca gggacactct tgtgaaactg 3420 ctttaaatct tgtactggcg aggtggaagg tgttgatgga tcgaaaggaa tcgatagttg 3480 ccgttttctt ggatctaaaa cgagcatttg agactatatc aagaccgtta ttgctgcaga 3540 ccttaaggcg ttttggtatt gtggggaaag agctcaattg gttcgaaaat tatttaaaag 3600 acagaactca gagaacagtt tttggaaact ctatatcaga 
gcctatagaa aatacccttg 3660 gagttccgca aggaagtgtt cttggaccaa ttttgtttat aatgtatatc aatgacatga 3720 aacaggtttt gaagtcttgt gagatcaatc tttttgccga tgatactgtt ttgtttatct 3780 cgcacaaaga catcaagcaa gcagagtctc tgattaattt cgatttaaac gctctggatg 3840 gttggcttag gtacaaaaag ctagcattaa acgttaagaa gactagttac atggtaatga 3900 ctgctggtgt attagacagt cccccatcca tcgtgataaa taaggaacca atcgaaagag 3960 tccgtcaggt taaatatctg ggggttattt tagacgacag attgaagttc aacactcaca 4020 tagactgggt catcgctaaa gtggcatcaa agtgtggggt tattagtagg ctggcaaaag 4080 atctcgattt ttttgggata gttaacctct ataagtcact gatttcacca cattttgatt 4140 tttgctcgtc gattctgttt cttggcaata agggacaaat taaaaggctt cagagattgc 4200 aaaatcgtat tatgcggtta gctttagggt gtggccgacg tacgtcgtct tttgttatgc 4260 tagatattct tcagtggatg tctgtagagc agaggattgt gtatcaaacc atgaccttta 4320 tatttaaact tttggggggc cttttgccag ggtatttagg ggaacgcatt gttcgaggat 4380 ctgatgttca tcggcactgt acacgcagag caaatgagcc gagggttccc aacttaattt 4440 cacatggtgc cagaaactct ttgtttttca aggggattca attatacaac agattacccg 4500 gagaaatcaa gaatgcgagt aacttgccag acttcaaacg taggtgtgcg gcatatgtta 4560 aacaaactgt gtaatgtcat atttgtgtag aagtcctatg tcactatgtt atgtacaact 4620 gcacttgtca tcatgagctt gatgatgatg ataagatttt tcttgatata tgaaaacaaa 4680 ttagaaaaaa tatataagaa agagacaaca taagtttgag acacgcgcgc gtacaagtgg 4740 acaatcggat ttagtttggg agagcttgct gtctgcatgt ctgggaggtc acagcgagat 4800 tcactcattg atgattgtaa gtggccaatt ccagatacat taggttctcc tgcatcggaa 4860 ggtagtgccc tggtgccatc gttggtacca gcggaactgg ccatatgcat gttgcagtgt 4920 gccctgataa tttccaatac attaggttct gctgcttcgg aaggtagtgc cctggagccg 4980 gtgtggcacc agtggaactg gccatatgca tgtcgcagtt atgtcctgat gtgttcgatg 5040 gagttgaccc gttttttctt gatgaaacta cgtcggtcgt cttgggtgtg tgtatgactt 5100 ggtggattct tttgaatgac gtccgatgcc accacttgct cgtaaatttt tattttattc 5160 aaaaactacc ttagagtaat attatcgtaa agatacttcc gtccttctca aacctgtgtt 5220 ggggtaagag gtgggactta tcatcat 5247 // ID GYPSY41-LTR_AG repbase; DNA; ANG; 311 BP. XX AC . XX DT 16-APR-2004 (Rel. 
9.03, Created) DT 16-APR-2004 (Rel. 9.03, Last updated, Version 1) XX DE GYPSY41-LTR_AG is an LTR of retrotransposon GYPSY41_AG - a DE consensus sequence. XX KW Gypsy; LTR Retrotransposon; Transposable Element; KW 4-bp TSD GYPSY41_AG; GYPSY lineage; GYPSY41-I_AG; GYPSY41-LTR_AG; KW Gypsy clade. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-311 RA Tubio M.J., Costas C.J. and Naveira F.H.; RT "GYPSY41_AG, a member of the Gypsy lineage of the Ty3/gypsy group RT of LTR retrotransposons in Anopheles gambiae."; RL Repbase Reports 4(3), 75-75 (2004). XX DR [1] (Consensus) XX CC GYPSY41-LTR is a long terminal repeat of GYPSY41_AG (its internal CC portion is deposited as GYPSY41-I_AG). XX SQ Sequence 311 BP; 99 A; 67 C; 79 G; 66 T; 0 other; agttacgtag accgaatagc tgagtccacg taataagaag ccaacacgtg gacacgcgca 60 ccgagcaacg accgcatcac cggaacacgc gcgcatagca acatgacgcg catagcaacg 120 ggaaccggca cgtggacacg cgcataccgg atatgcagag tcagcagcta tagggtagaa 180 tcgaaggcta ggtgcattgt agaatttaag aaataagtta gttgtaattt ggatcagctc 240 agtgtaagaa gcgcttgcgc ttaagcaata aagttttttt ttatgaaacg tgaagccttg 300 cacttataat t 311 // ID Ag-Jock-1 repbase; DNA; ANG; 4725 BP. XX AC . XX DT 29-OCT-2010 (Rel. 15.1, Created) DT 29-OCT-2010 (Rel. 15.1, Last updated, Version 2) XX DE A Jockey clade non-LTR retrotransposon family from Anopheles DE gambiae. XX KW Jockey; Non-LTR Retrotransposon; Transposable Element; Ag-Jock-1. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-4725 RA Biedler J. and Tu Z.; RT "Non-LTR Retrotransposons in the African Malaria Mosquito, RT Anopheles gambiae: Unprecedented Diversity and Evidence of Recent RT Activity."; RL Molecular Biology and Evolution 20(11), 1811-1825 (2003). 
XX RN [2] RP 1-4725 RA Kojima K.K. and Jurka J.; RT "Jockey clade non-LTR retrotransposons from Anopheles gambiae."; RL Direct Submission to Repbase Update (24-SEP-2010). XX DR [2] (Consensus) XX CC [2] Consensus update. This consensus is generated from 6 CC sequences with >98% identity. XX FH Key Location/Qualifiers FT CDS 113..1726 FT /product="Ag-Jock-1_1p" FT /translation="MSKWRTVPYHAGPAVRKKRVKRTPDLLAEMRRRKREE FT AQPESSTPASKKKGEIKITPATAGDVAVNAAMDGIDVPNTAEEENHRKEKI FT PPIVVSGLNEEEYGSLCDATQSGAIKASFSFASGNTCRINAASRDDHDKVK FT KLLENYTKEFYSHEFKADKPYQVVLKGLRFGSAEQVTVMLRDVKLQPSLVR FT EVKIGEGRGLSSKLFVVSFPKGAISLEELQKITHINYTTVKWERFQPKHRD FT VLQCLNCLNFGHGAKHCAMSPRCAKCAGNHRSKQCTKEMGDVHTCANCMGD FT HSPYNRNCPSRKQYLQKRYSTQNAPRRLVPAPIPTRSSWAEPLPWVGQYRK FT DSVCTCHGCPHGHQQQNQQLQQQQLQQKLQQQQQQQQQQNQQLQQQQLQQK FT LQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQLQQQQQRQQQQQQQQ FT QQQQQQQQQQSLQCKYQIASASCSYSIPRKTHRQPAQDKLDQLRSRLSDEE FT DEEIKSQFGDIVRLWVEFKRLSKQCPRDQILMKMVEFITANF" FT CDS 1731..4373 FT /product="Ag-Jock-1_2p" FT /note="apurinic-like endonuclease and reverse FT transcriptase." 
FT /translation="MIKCATWNACSVLPKRLELIDFLNHEKIDLLAVTETH FT LKHNKTFFLPDHTVVRLDRPDQPKGGVLIAIRRTLAFKVLPLPNTHYIEAV FT GIEMLSAEGNFHFYAAYCPRQVSDTSGTSRLLKRDLQLLTRASRRFIVAGD FT LNARHQEWGNLRANRNGMILRELTQASACSVSFPFQPSFQNGNSYSTIDIF FT ISNITDRLDVPSTITALSSDHLPVVMNIVCPGTHATRQQRNYREADWPRFE FT RYITENISSAPPLLTIEDINTAITNVESCMKEAVDRFVPLIQVRHKVTELD FT ADTKRLIASKNSLRRTYQRTRNRTYYSLYRQVAKIVDARMVEVRNRQFSDR FT LKTLPKHSRPFWRLTKVLKDKRRPIPVLIDGGIPAFTPREKAVKLSVNFAQ FT AHRLSESMTSTYESRVADSISSLVSEPIEATNIERISSGELLGTIKYLVGY FT KAPGADGIFNIMLKHVGHSTVVLLTDVFNRCLELGYFPHLWKYAKVVPVLK FT PGKDPSLGSSYRPISLLGALSKLFERTIYTRMMAHCESNNVISEVQFGFRK FT HRSAAQQLQRVLDVVDSAKMRGKTTALALLDVEKAFDNVWHDGLIHKLRVQ FT GFPLYIVRLVQSYLSGRSSAVYIGSEKSDPYENNAGVPQGSILGPLLYNCY FT TADVPTLGANASLALYADDTAILYSAKPLRFIRAGLQRGLDSYVNFLKEWK FT IVVNNTKTQALIFPYKVGMTASNIIAKVGSLKMENNIIPWASEARYLGVIL FT DRRLTFKAHVGYIKTKTGHLFRMLYSLLKYNSGLSIENRLAIYGQIVLPSI FT TYGSIAWGRCSRTNMQTIQVIQNRFLKTIMGLPNRYPTRALHSETGFVPIK FT DKLSEIIECLKRKCRDSEVDIIRNLFSS" XX SQ Sequence 4725 BP; 1448 A; 1068 C; 1070 G; 1139 T; 0 other; ttcgctgtcg ggacgttgac gtgatcggac gtgttttttg cgctctgtgc tcgtttgtga 60 ccttgacgat taaagtgtgt ggtgtgttct gctcgcacca acacatgtgt aaatgagtaa 120 gtggcgaact gtgccatatc atgccggccc cgcagttcga aagaaaagag tgaagcgtac 180 gccggattta cttgctgaaa tgagaagaag aaaacgcgag gaggctcaac cggagtcatc 240 gaccccggcc tcaaagaaaa aaggggaaat aaaaatcacg ccggcgacag ctggcgacgt 300 agctgttaat gcagcgatgg atggtattga tgtaccaaat actgcggagg aagaaaacca 360 ccgcaaagaa aaaatcccgc caattgttgt ctcgggatta aatgaggaag aatacggttc 420 cctttgtgat gccactcagt caggtgccat taaagcctcc tttagtttcg cttctggaaa 480 cacgtgtcgc ataaatgcgg catcaagaga tgaccatgat aaggtcaaaa agttacttga 540 aaactacacg aaagaatttt attctcatga attcaaggct gataagcctt atcaggtagt 600 cctgaaagga ctacgattcg gctcggctga gcaggttacc gtaatgttac gtgacgtgaa 660 attacaacca tcgcttgtac gtgaagttaa aattggtgag ggccggggac tctcatcaaa 720 actctttgtt gtctcctttc caaaaggagc aatcagccta gaggagttgc aaaaaataac 780 ccatattaac tacaccaccg ttaaatggga acgattccaa ccgaaacaca 
gagatgtact 840 acagtgtttg aactgcctaa actttggaca tggtgctaag cactgcgcaa tgtcaccgcg 900 atgtgctaag tgtgcaggaa atcaccgctc aaagcagtgc acaaaagaga tgggagatgt 960 acatacgtgt gcaaactgta tgggggacca tagcccctat aatcgaaact gcccctcgcg 1020 taagcagtac ctgcagaagc gttactcaac acaaaatgct ccaagacgat tagtcccggc 1080 gccaatacca actagatcat cgtgggcgga accattacct tgggttggac agtaccgtaa 1140 agattctgtc tgcacttgcc acggttgtcc gcatggccat cagcaacaga accaacagct 1200 tcagcagcag cagctgcaac agaaactaca gcagcagcag cagcagcagc agcaacagaa 1260 ccaacagctt cagcagcagc agctgcaaca gaaactgcag cagcagcagc agcagcagca 1320 gcagcagcag cagcagcagc agcagcagca gcagcagcaa cagcagcagc agcagcagca 1380 gcagcagcag cagcagcagc agctgcagca gcagcagcag cggcagcagc agcagcagca 1440 gcagcagcag cagcagcagc agcagcagca gcagcaaagc ctacagtgca aatatcagat 1500 tgcatctgct agttgttcgt attcaattcc acgaaaaacc catcggcaac cagcccaaga 1560 taagctggat cagctgcgat cccgtctttc cgatgaggaa gatgaagaaa taaaatcaca 1620 gtttggtgac attgtccgct tatgggtgga attcaagcgc ttatcaaaac agtgcccgcg 1680 tgatcagatt ctaatgaaaa tggtcgaatt tatcacagcc aacttttgag atgattaagt 1740 gcgcaacgtg gaacgcttgc tccgtccttc caaaaaggct ggagttgatt gactttctga 1800 atcatgaaaa aatcgacctg cttgctgtga ctgaaacgca cctaaagcat aacaaaacat 1860 tttttctccc tgatcacact gtagttcgac tggaccgtcc tgatcagcca aaagggggag 1920 ttctaattgc tatccgtcga actttagcgt tcaaagtact cccccttcct aacacccatt 1980 acattgaagc tgtgggaatc gaaatgctca gtgctgaggg caactttcat ttttatgcgg 2040 cttactgccc gagacaagta agcgacacta gtggtaccag caggctgttg aaacgcgatc 2100 ttcaactgct cacccgggcc tcaagaagat tcattgtagc tggcgacctg aatgcccgtc 2160 accaggagtg gggaaacctt cgcgccaata ggaatggaat gattctgaga gaactcacac 2220 aagccagcgc atgctcagtg tcatttccat ttcagccttc gttccagaat ggcaattcct 2280 actctacgat tgacatattc atctcgaata taacagatcg tcttgacgtc ccttccacca 2340 tcacggcttt gtcatctgat cacttaccag tggttatgaa cattgtttgc ccggggacgc 2400 atgccactag acagcagcga aattatcgag aagcagactg gccacgattt gagcggtata 2460 taacggaaaa catcagttcg gcacctcctc ttctgaccat cgaagacatt aatacagcaa 
2520 tcactaatgt cgaaagctgt atgaaggaag cagtagatcg ttttgttccg ctgatccaag 2580 tacgacataa ggtaactgaa ctcgacgcag atacaaaacg gcttattgcg tcaaaaaact 2640 cgttgagacg gacataccaa aggacgagga acaggactta ctattcttta tacaggcagg 2700 tagctaagat tgttgacgcc agaatggttg aagtaagaaa tagacagttt tcagatcgtt 2760 taaaaacact tcctaaacac tctcggccct tctggcgtct aacgaaggtg ctcaaagata 2820 agcgtcgacc gataccagtc ctaattgatg gtggaatccc tgcctttaca cctagggaaa 2880 aagcagtgaa attgtcagtc aattttgctc aggcacatag attgagcgaa tccatgacta 2940 gcacatatga gagtagggtt gctgacagca tcagttctct tgtttcagaa cctattgaag 3000 cgacaaatat tgagcgtatc tccagcggag aactcctagg aaccattaag tatcttgttg 3060 gctacaaagc accaggagcg gatggtatct ttaatattat gttgaaacac gtcggacaca 3120 gcactgtcgt gttactgacc gacgtgttta acaggtgtct ggagctaggc tactttccac 3180 acttgtggaa atatgcaaaa gtggttcctg ttttaaaacc tggtaaggac cctagcttag 3240 gatcaagcta tcgtccgatt tcactccttg gtgccttaag taagttgttc gaaagaacaa 3300 tttacactcg tatgatggct cactgtgaat cgaacaatgt catcagtgaa gttcagttcg 3360 gttttagaaa acatcggtcg gcagcccagc agcttcagcg cgttttggac gttgtcgact 3420 ctgccaaaat gcgtggaaaa accactgcct tggctcttct tgatgtggaa aaggcgtttg 3480 ataacgtctg gcatgatggt cttatacata agctgagggt gcagggcttt cctctgtaca 3540 ttgttaggct tgtacaaagc tacctaagtg gacggtcttc ggctgtctac attggatcag 3600 agaaatctga cccatacgaa aacaatgctg gtgtgccgca ggggagtatt ctgggtcccc 3660 tgttgtacaa ctgctacact gctgacgtcc ctacattggg ggcaaacgcg agcctggcac 3720 tgtatgcaga cgacacagcc atcctgtact cggcaaaacc tctgagattc atacgagcag 3780 gtctccaacg agggcttgac agttatgtga atttcctcaa agagtggaag attgtggtta 3840 acaacaccaa aacccaagca ttaattttcc catataaggt tggaatgaca gcatccaaca 3900 taatcgctaa agtgggcagt ttaaaaatgg aaaataacat cattccatgg gcttcagaag 3960 ctcgctatct aggagtcatt ctggataggc gtctaacttt caaagctcac gtaggctata 4020 taaaaaccaa aacaggccat ctttttcgaa tgctctactc cctcttaaaa tataattcag 4080 gattatcgat agaaaacaga ttggccattt acggtcagat tgttttacca tcaataacgt 4140 acggtagcat agcttgggga aggtgctctc gaacaaatat gcagactata caagtcatac 4200 
aaaacagatt cctgaaaacc attatgggac tgccaaatag atatcccacc agggcccttc 4260 atagcgaaac aggttttgtt cctataaaag ataaattaag tgaaatcatt gagtgcctaa 4320 agcgcaaatg tcgagactcg gaagttgaca taattaggaa tttatttagc agttaggaat 4380 acgatgaaaa taccaatgaa ttagattaaa acaaaaatag attttaagtt agttaagtgc 4440 cttgtaaata atcttctcct ttcccataaa attcggtaat attaagccga tagagaattg 4500 atctcaagtc acaaggctca gaagccttgt aaattttttg cgaagtactg cttgttcttc 4560 tgctccaagg aactataaag atacatctca tgtaacatgt acctagtgta agcgccaaaa 4620 cggtaggaaa cctagcgagg attaaaaaac atcttgtctt tcaacaaatt tatatgaata 4680 aatcataaca tctttttgtg cttaaaaaaa aaaaaaaaaa aaaaa 4725 // ID BEL14-I_AG repbase; DNA; ANG; 5776 BP. XX AC . XX DT 03-APR-2003 (Rel. 8.03, Created) DT 18-JUL-2005 (Rel. 10.08, Last updated, Version 2) XX DE BEL14-I_AG is an internal portion of the BEL14_AG LTR DE retrotransposon - a consensus sequence. XX KW BEL; LTR Retrotransposon; Transposable Element; 5-bp TSD; KW BEL14-I_AG; BEL14-LTR_AG; BEL14_AG; Bel clade; PHD domain; KW integrase; protease; reverse transcriptase. XX NM BEL14-I_AG. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-5776 RA Kapitonov V.V., Pavlicek A. and Jurka J.; RT "BEL14_AG, a family of Bel/Pao-like LTR retrotransposons from RT African malaria mosquito."; RL Repbase Reports 3(3), 35-35 (2003). XX DR [1] (Consensus) XX CC BEL14_AG is a young family of Bel/Pao-like LTR retrotransposons. CC BEL14-I_AG, an internal portion of BEL14_AG, is flanked by CC BEL14-LTR_AG CC LTRs. The BEL14-I_AG consensus sequence was reconstructed based CC on CC multiple alignment of 16 copies; they are ~1.5% divergent from CC the consensus sequence. CC The consensus sequence encodes a 1898-aa BEL14_AGp Bel-like CC protein CC (pos. 57-5750). CC BEL14_AGp is composed of the PHD (pos. 9-55), protease (pos. CC 277-400), CC reverse transcriptase (pos. 
900-1060) and integrase (pos. CC 1600-1765) CC domains. XX FH Key Location/Qualifiers FT CDS 57..5750 FT /product="BEL14_AGp" FT /translation="MGPKKARGCKACGNQVDDTLYVQCDECDAWWHFSCAG FT ITASVEAVEKCAWLCEECARKTLREQSSPREGNKEPKEGTSKHVDGDLVRN FT LSLEAATDGGARPVTNPQRRPLLSLDEANDEIAPGTSTHVAGGPVHNLNQD FT AATEGGVRPVMTPKRRPLSSLDEADRGKTSVSSNIVHRGSCPNLNLDAASD FT DVARQLAVLKRRQEVEKRRMELELQLKFVQEEEALLGFGENKSFSISPQLN FT SFQTEKRTVKRSEEEKEEPDLTPRQEAARHMVSKELPVFSGDPAEWPIFIS FT HYEYTTRRCGYSNWENMLRLQKCLKGPALEAVRSRLVLPDVVPQVIEKLRS FT KYGRPVHLIKTFIEKVRKIPAPQTDKLDSLVEYGEAVQCMVDHMVAAGERA FT HITNPLLLQEVVGKLPTDQQLRWSHHIRGMTSVDLSTFSDYMEDLAEDAAR FT LTTIDSPSVRGTSKGRPTKGYVHAHVDPDGATTSSAAERQCVSCNVAGHVL FT STCTNFRGLPVKDRWRRARELSVCFSCLEKHNWRSCKNRSRCGINDCAFRH FT HALLHDPDAIESPSTADRERRHFPRTSGSQTHQVINNYHQSNPMSALFRIV FT PVTAYGPGVMIKTFAFLDEGSSMTLMDEDLAKQLGVKGDRRPLCIKWTGDT FT TRVEPASMMIDLQIGPVTSTKRFTLKAVRTVTSLSLPQQTFTMDDKRWDHL FT KQLPLPEYRDARPQLLIGLDNLRLAVPLKTREGLAGEPVAVKTRLGWCVYG FT KTAGSQIGRVLHMCECGASDENSTIQGALRKFYELEQLGTVSSDVPDPDER FT RALTILETTTVRIGNRFESGLLWKTDNVELPSSLGMARRRLECLERRMERD FT PKLKTVVHHHIADMMEKGYIHKATSAELAECNSKRIWYLPLGVVTNPKKPG FT KVRIIWDAAAKVQGTSLNDMLLKGPDELISLPGVLFRFRMYGIAVCADVKE FT MFLQIRMRDEDKHAQRFLWREDPADDIATYFVDVVTFGSACSPATAQYVKN FT RNAKEHAEKYPRAVRGILTSTYVDDYLDSFGTFEEASRVSREVRGIFSNGG FT FVLRNWVSNNPVVLERLGGESSSPGMKSLTSTADDGERVLGLRWNPSSDQL FT SFYTQACVGMAEIFETECTPTKREVLKCVMSLFDPLGLLANFTIHGRILIQ FT DLWRAGTGWDEAISPSQMRDWRRWVDVFPLIAQLRIPRCYFPEAREKVYEN FT AELHLFVDASQLAYACVLYLRVVDSEGEPHCTMLCGKAKVAPLKPLTIPKM FT ELQACLLGARLLKSTEQHHPISVKKRVLWTDSTVALSWIHADPRNYRPFVA FT NRVAEIQENTNVNEWRWVPTQDNPADEATKWKGRANFNWDGIWFQGPSFLL FT QDEESWPTRRLVSTTPEEEIRRVNLHREKLNPGLLPLKAERFSRLERMIRT FT LAWIVRYVDNLMRKVGGAPLHLGILSQDELERAETIAWKQAQGEYFQDEVR FT VLSVGEGTGRSTVPKESPIYGLLPYADERGVLRMRGRIGAAPELPYAARYP FT IVLPRDAWITHLLVDKFHRRFRHANNETVVNELRQYFQIPKMRRLVSKVVR FT QCVFCHIRRTLPQIPPMAPLPKQRLTAFVRPFTFVGLDYFGPLLVRRGRAQ FT EKRWVALFTCLTIRAIHLEVVSSLSTDSCILAVRRFVARRGAPVEVFSDNG FT TNFVGASQQLRKEIDERNDALAATFTNANTRWTFNPPGAPHMGGVWERMVR FT 
SVKAAMSTMTELQRTPDDETLLTVIVEAEGMINTRPLTYIPLESADQESLT FT PNHFLLGSSSGVKQRPVAPTSLQTGLRSNWKMVQHILDGFWRRWIKEYLPV FT LARQSKWFETVREIEVGDIVLIVDGGARNQWKRGIVERVVSGADGRIRQAW FT VRTNTGTLRRPAAKLALLEIRKGDK" XX SQ Sequence 5776 BP; 1463 A; 1325 C; 1755 G; 1233 T; 0 other; ctcttttgcc tacaaaaaag ggttcttcag tgcgttaagt gtaatagtga agaacaatgg 60 gaccgaagaa agcacgtggt tgcaaggctt gcggtaatca ggtcgacgac actttgtacg 120 tgcagtgcga tgaatgtgat gcgtggtggc atttctcgtg tgccggtata acggcatccg 180 tagaagccgt ggagaaatgt gcgtggttgt gcgaggagtg tgccaggaag acgctgagag 240 agcaatcatc gccacgcgag ggcaataagg agcccaagga aggaacctcg aaacacgtgg 300 atggggatct cgttcgtaac ctcagtttgg aagcagcgac ggatggcggg gcgcgcccgg 360 ttacaaaccc acagaggcgg ccgcttttat cgctcgatga ggccaacgac gagatagcgc 420 caggaacatc gacccacgtg gcagggggac ccgtacataa cctcaaccag gatgctgcga 480 cggaaggcgg ggtgcgcccg gtcatgacac caaagaggcg gccgctttca tcgctcgatg 540 aggccgacag aggtaaaacg tctgtctcat cgaacatcgt gcacagagga tcgtgtccta 600 acctcaacct ggatgcggca agtgatgacg tggcacgtca actcgccgtg ctgaagcggc 660 gacaggaggt ggagaaacgg cgcatggagc ttgaactgca gctgaagttc gtgcaggagg 720 aagaggcact tctcgggttt ggggaaaata agtctttttc aatttcacca caacttaact 780 cttttcagac tgaaaagaga acagtgaaac gcagcgaaga agaaaaagag gaaccagacc 840 taactccacg acaagaggct gcgcggcaca tggtttctaa agagctccca gttttctccg 900 gtgatcccgc tgagtggcca atttttatat cgcactacga gtatactacc aggcgatgtg 960 gatactcgaa ttgggagaat atgctgcgcc tgcaaaagtg cctgaaagga cctgccctcg 1020 aagctgttcg gagtcgattg gtgttaccgg acgtagttcc gcaggttatc gagaagctac 1080 gttccaaata tgggcggccg gtgcacttaa ttaaaacatt catcgagaag gtgcggaaga 1140 ttccggcacc ccaaactgac aagctggaca gtttagtcga gtatggggaa gcagtgcagt 1200 gtatggtgga ccatatggtt gcggctggtg aacgtgcgca tatcaccaac ccgctcttgc 1260 tgcaagaggt ggtcggcaag ttaccaacgg atcaacagtt acgttggtcg catcacatcc 1320 gcggaatgac ctcggtagat ctgtccacat tcagcgacta catggaggat ttggctgaag 1380 acgctgcgag gctgacgaca attgactctc cttcagtgcg cgggaccagc aagggaaggc 1440 ctacgaaggg ctacgtccac gcgcacgtgg atccagatgg agcgacaacg 
tccagcgcgg 1500 ctgagaggca gtgtgtatcc tgtaacgtcg cggggcatgt attgtcgaca tgcactaatt 1560 ttcgaggact gccggtaaag gatcgatgga ggcgagcgcg tgagctatcg gtgtgcttta 1620 gctgcctgga gaagcacaat tggcgatcgt gcaaaaatcg ctctcgttgt ggaatcaacg 1680 attgtgcatt ccgacatcac gcgcttctac acgacccgga tgcaatagag tcgccttcta 1740 ctgcagaccg agaacggcgg cacttcccga gaaccagtgg aagtcagacg caccaggtaa 1800 taaataatta tcatcagtcg aatccgatgt cggcgctttt tagaatcgtt ccagtaacag 1860 cgtatggacc cggagttatg ataaaaacct tcgcgttcct ggacgaaggt tcgtcaatga 1920 cgctgatgga cgaagacctg gcaaagcagt taggggtgaa gggagataga cgacctctat 1980 gtatcaagtg gacaggtgat acgactaggg tcgagccggc gtcgatgatg atcgatttac 2040 agatcggacc tgtgacgtcg acaaaaaggt tcaccctgaa agctgtgcgg actgtcacca 2100 gccttagcct cccacagcaa actttcacga tggatgacaa gagatgggac catcttaagc 2160 agctgccatt accggagtac cgtgatgctc ggcctcagtt gttgatcggg ctggacaacc 2220 ttcgattggc ggtgccgctg aagacgcgtg aaggccttgc aggggaaccg gttgccgtaa 2280 agactcggct tggatggtgc gtgtacggaa agacggctgg aagccaaatc ggaagggtgc 2340 tgcatatgtg cgagtgtgga gcatcggacg aaaactccac catccagggg gccttacgca 2400 agttttatga gttggagcaa ctcgggactg tctccagtga cgtgcctgat ccagatgaac 2460 gaagggcact gacgatcctg gaaacaacga cggtgcggat tggtaatcgg tttgaaagcg 2520 gtctgttgtg gaagacagac aacgtggagc ttccttcgag cttgggtatg gcgcgtcgca 2580 ggctggaatg cttggaaaga agaatggaac gtgaccctaa gctgaaaacc gtggtgcacc 2640 atcacatagc cgatatgatg gaaaagggtt atatccacaa ggcgacgtct gctgagcttg 2700 cagagtgtaa ttcgaagcga atttggtacc tgccgttggg agtggttacc aatccgaaga 2760 agccagggaa ggtgcgcatc atctgggacg ccgctgctaa ggtacaaggt acgtccctaa 2820 atgacatgtt gctgaagggg ccggacgagt taatttcttt gccaggggtg ttgttccggt 2880 ttcgaatgta cgggatagcg gtgtgcgctg atgtcaagga aatgttcctg cagatacgca 2940 tgcgcgacga agacaagcat gcgcagcggt tcctgtggcg ggaagatcct gctgacgata 3000 tcgcaacgta tttcgtggac gtcgttacct ttgggtcagc ctgctcccca gccaccgcac 3060 aatacgtgaa aaaccggaac gccaaggaac atgccgaaaa ataccctcgt gccgtacgtg 3120 gcatcttgac cagcacgtat gtcgacgact atttggatag tttcggaaca ttcgaagaag 
3180 ccagtcgagt atccagagaa gtcaggggaa tcttctcgaa cggcgggttc gtactccgga 3240 actgggtttc caacaatccg gttgttttgg aacggctggg cggcgaaagc tccagtcccg 3300 gtatgaagag tttgacatct acggcggatg atggagaacg ggtgctcgga ttgcggtgga 3360 acccgagctc ggaccaattg tccttttaca cgcaggcgtg tgtgggaatg gcggagatat 3420 ttgagacgga gtgtacccct accaagcgag aagtgctcaa atgcgtgatg tcactttttg 3480 atccgcttgg actgttggca aactttacca tccatggaag gatcttgatt caagaccttt 3540 ggcgagctgg taccggttgg gatgaggcca tcagtcccag tcaaatgcga gattggcgta 3600 gatgggtgga tgtttttcct ctgatagccc agcttaggat tccgaggtgc tacttcccgg 3660 aggcacgaga gaaagtgtac gagaatgcgg agctacactt gtttgtggat gccagccagc 3720 tagcgtacgc ttgcgtgctg tatttacggg tcgtcgattc tgaaggagaa ccgcattgta 3780 ccatgctatg cggaaaggca aaggttgctc ctctgaagcc tttgacgata ccaaagatgg 3840 agttacaagc ctgcttgtta ggtgcacggc ttctgaagtc cacggaacag catcacccga 3900 tttctgttaa aaaacgggtg ctctggacgg acagcacggt ggcgctatca tggatacatg 3960 ccgaccctag gaattacagg ccatttgtcg cgaatagagt ggcggagatt caggagaaca 4020 ccaacgtgaa tgagtggcga tgggtgccca ctcaggacaa tccagcagac gaagctacca 4080 aatggaaagg gcgtgcgaac ttcaactggg atggcatttg gttccagggt ccatcatttc 4140 tgctgcagga tgaagagtct tggccgacga gaagactcgt ttcaactact ccggaggaag 4200 agatacggcg ggtcaacctt caccgtgaga agttgaatcc tggacttctc cctctaaaag 4260 ctgaacgctt cagccgcctg gaaagaatga tcaggacgtt ggcgtggatt gtcaggtacg 4320 tggacaattt gatgagaaag gtgggaggag cccctctaca ccttgggatc ctctctcaag 4380 acgaattgga gagagcggag acgatcgcgt ggaagcaagc gcaaggggaa tattttcagg 4440 atgaagtacg agtcctgagt gtcggtgagg gaacaggaag gagtaccgtg cctaaggaaa 4500 gtcctatcta tggtctctta ccctacgcgg atgagcgtgg tgttttgcgc atgcggggac 4560 ggattggagc agctccggaa ctgccatatg ctgccaggta cccaatcgta ttgccacgtg 4620 acgcatggat aacccacctg ctggtggaca aatttcatcg ccggtttcga cacgccaata 4680 acgaaaccgt ggtgaacgag ctgaggcagt atttccaaat cccaaagatg agacggttgg 4740 tttcaaaagt ggttcggcaa tgcgtgttct gccatattcg acgaacattg ccacagatcc 4800 ccccgatggc tccattaccg aaacagcggc tcactgcatt cgtgaggccg ttcacatttg 4860 
tgggactgga ctactttgga ccgctgttgg tgaggagagg aagagcacag gagaaacgat 4920 gggtggcgct tttcacatgc ctaaccataa gagcaattca tttagaagtt gtgagtagtc 4980 tttccacaga ttcctgtatt ttggcagtga gacgctttgt ggccaggaga ggcgctcccg 5040 ttgaggtgtt cagcgacaac gggacgaatt tcgtgggagc cagccagcag ctaaggaagg 5100 aaatcgacga gcgcaacgat gccttagctg cgacctttac caacgcgaac acccgatgga 5160 cgttcaaccc ccctggcgca ccccatatgg gaggggtatg ggaacgcatg gtgcgatcgg 5220 tgaaggctgc gatgagtacg atgacggaac tacagcgtac acctgatgac gagacgctgc 5280 ttacggtgat agtggaagcg gagggaatga tcaacacacg cccactgacg tacatcccgc 5340 tggaatcggc ggatcaggag tctcttactc ctaaccactt cttgctgggc agttcatcgg 5400 gagtgaagca gagaccggtg gcaccgacta gccttcagac ggggttacgg agcaactgga 5460 aaatggtgca acatatcctg gacgggtttt ggagacggtg gataaaagag tatcttccgg 5520 tgttggcacg gcaaagcaaa tggtttgaga ctgtgagaga gattgaggtt ggagacattg 5580 ttctgatagt cgacggtggc gctaggaatc agtggaagag agggatagta gaacgagtgg 5640 tttcgggagc cgacgggcgg atacgacaag cttgggtgcg aacaaacaca gggaccctca 5700 gaaggccggc ggctaaactt gccttattag agataagaaa gggtgacaaa tagcgtattg 5760 gtcacgggct ggggga 5776 // ID INVADER1-LTR_AG repbase; DNA; ANG; 204 BP. XX AC . XX DT 13-DEC-2002 (Rel. 7.11, Created) DT 13-DEC-2002 (Rel. 7.11, Last updated, Version 1) XX DE INVADER1-LTR_AG, a long terminal repeat of the INVADER1_AG DE Gypsy-like LTR retrotransposon - a consensus sequence. XX KW Gypsy; LTR Retrotransposon; Transposable Element; KW Gypsy superfamily; INVADER group; INVADER1-I_AG; INVADER1-LTR_AG; KW INVADER1_AG; endogenous retrovirus; gag; integrase; protease; KW reverse transcriptase. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-204 RA Kapitonov V.V. and Jurka J.; RT "INVADER1_AG: a family of Gypsy-like LTR retrotransposons from RT African malaria mosquito."; RL Repbase Reports 2(11), 16-15 (2002). 
XX DR [1] (Consensus) XX CC INVADER1-LTR_AG is a long terminal repeat from INVADER1_AG. CC The internal portion of INVADER1_AG is listed in Repbase as CC INVADER1-I_AG. CC INVADER1_AG is a member of Gypsy-like retroviruses that belong CC to the INVADER group originally identified in Drosophila (see CC description of INVADER1-I_AG). XX SQ Sequence 204 BP; 61 A; 34 C; 64 G; 45 T; 0 other; tgtgaagtag ggaaaagaga tagcgatatc gtggcgagcg ccgagtgagc gaaggagggt 60 atttaagcgg cgcgcgctgc tgggtcgaga ggtcagttga acgcgaaatg tcgaaggtga 120 aggacgtgcg aattggagaa taaataaagt gaactaaacg aaagccaccc atcttcttct 180 tttgatgttt aactgattct caca 204 // ID BEL2-I_AG repbase; DNA; ANG; 5541 BP. XX AC . XX DT 12-MAR-2003 (Rel. 8.02, Created) DT 12-MAR-2003 (Rel. 8.02, Last updated, Version 1) XX DE BEL2-I_AG is an internal portion of the BEL2_AG LTR DE retrotransposon - a consensus sequence. XX KW BEL; LTR Retrotransposon; Transposable Element; 5-bp TSD; KW BEL2-I_AG; BEL2-LTR_AG; BEL2_AG; Bel clade; PHD zinc finger; KW integrase; reverse transcriptase. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-5541 RA Kapitonov V.V. and Jurka J.; RT "BEL2_AG, a family of Bel/Pao-like LTR retrotransposons from RT African malaria mosquito."; RL Repbase Reports 3(2), 10-10 (2003). XX DR [1] (Consensus) XX CC BEL2_AG is a young family of Bel/Pao-like LTR retrotransposons. CC BEL2-I_AG, an internal portion of BEL2_AG, is flanked by CC BEL2-LTR_AG CC LTRs. The BEL2-I_AG consensus sequence was reconstructed based on CC multiple alignment of 5 copies that are ~3% divergent from it. CC The consensus sequence encodes one protein: a 1750-aa BEL2-I_AGp CC (positions 239-5488). It is composed of the PHD domain, reverse CC transcriptase and integrase domains. 
CC BEL2-I_AGp: CC MASRRLYFVENEQGACRLCTKPDDIDDMVRCDECDRWFHASCVKVIRLPDEDEEFVCVKCxNDRAEYMxI CC SQSTNQDTTLKALIEALKMSGLTSSTHIKRMTLNSLPNFNGTSKDWPKFKRAFEETTEEGSFSNVENLNR CC LQHALKGEAERCVRRLFLEPDNVPIIMKKLEEQFGRPEQVYQDLLGEVLKVRVENQMKIPDLSDALEDMI CC INLKAIKKEGYLQDHxLVDELIFKFSTDKQLKWIEFKSSLEKxNKIPTLEHFSEWLYPIAENIRKLPKRN CC ERFRQPLNFHRPQFSPNQQRPAEINNNRPMNRPPQPHNSHPMNTQTRPFNTRVRNVQRFFQPCPCCQGSH CC ALYRCERFKNIPVHERLEIVVNSQQCQACLTSNNHSQNNCNAARECGIQGCRETHHQLLHTGDTVRMNYH CC QTIQNVYYQIVPVVLRNNNHTLETYAFLDAGSSLTLIEENTANKLHLNGVTDPLTLTWTQNLSVQENCSR CC RVSCMIKGVNEKKEHLLNGIRTVKNLQLPSQTLSGSILAARYPHLKGIKLSDYQKARPTVLIGLNHSHLL CC MPLGRKMGRPEEPMAIKTKLGWLIFGIDKICLSETNHLMIHKSEDLMTEMMRRYFSTEEFGVKPVKTVKS CC QALERAETIIEKTLKKTDGRYEVGLLWRDDDVILPNSYSNALRRLATQEKQLAKDPGLKNWLCKTFEEYQ CC QKGYIRKLTKEELKRHSQKIFYIPHFVVVNKNKPIPKPRLVFDAAAKVNGVSLNSLLLTGPDEMASLFGV CC LLRFREGPICVCGDIKEMFHQVKIRKEDQDAQRILWRDGDSTRTPDTYVMQVMTFGATCSPACAQVVKNR CC NAEANSEMYPLALEPIRNQHYVDDYLDSFFSMEKAIKTVTEVIQVHENGGFHIRNFISNKRELMEAIPQE CC RHQVKAIVDIKEKDSCVEKILGVQWNTQLDCFGYKVDVGRINLEKKPTKREALSFVMSAYDPLGLISHIT CC IQGRILMQSINAATNDWDTQIPDSLHGKWIEWLKMITSVKDLAIPRPIVASIMNPVEIHTFVDASQEAFA CC AAVYARSNFNGCFVVRLVAAKSRVAPTKALSIPKLELQAAVLGARLTASVIKELRLKISRTIFWSDSKTV CC LAWINSEHRKYEVFVSHRVSEILDTTSANQWRWVSTKDNPADIATKITSNSWTWYNGPQFLQEHERDWPG CC VHEVAVEQHLVTVHKETIAPEYYFSSWDKLLKHQTVMKKYVDFLKNRSNFSRTISYRDMETAKLSLLRKA CC QWEGFPEEMEALEKRKEISNKSSIRTLVPFLDEKGILRSRGRLENASCLPYSARLPIILPQRSRVVKLLV CC RDYHEKYMHQADNVVIGVLRQNYWIINLRTVLKNVKSCCQRCILNTAAPKAPLMAPLPTYRTHPYNPPFL CC HTGVDYFGPLDVTVKRSTEKRWGAIFTCMSTRAVHLELAEKLDTDSFMVCLNNFLHRRGKITHLYSDNGT CC NFVGAEKELKKIVEDIDLRMGREAALKYKIEWKFNPPAAPHFGGSWERLIQNIKKALRHMVTEWKTRHPT CC PETLRATLIQIESILNSRPLTHLPLTSEEEEVLTPFHFLIGRGVDSLPAPTETSQVDRQQFRLAQHNAKT CC FWDRWKKEYLPTLIKRNKWTCKVEPIKVNDIVIITNDNAPPGQWLKGRxVETATAPDGQVRSVSVKTVQG CC VIKRPAVKVAVIDVKQKEHLLFVKKPPHSPTKRKVLSEENLQKVPPKKRMIAPCNWAPKLVQQLKKEDDV. 
XX SQ Sequence 5541 BP; 1875 A; 1082 C; 1224 G; 1351 T; 9 other; tttggtggct ccagagagga agaagatttc gcggaaaatt acgattccga aactatcggt 60 tcttgaaaga cggaagaaga tttaacagaa aatttctttt tcgaagctat cggttcctgt 120 gagaattttc tctacgaagc attcgatcta gaagtctgtt gtgtcggaga cataattttt 180 ctgcttgtcg aaagggatct gtgtattctt tgctgcacag atatactaac tttcaacaat 240 ggcgagcaga agattatatt ttgttgaaaa tgagcaaggt gcttgtcgct tgtgcaccaa 300 gccagatgac attgatgata tggtacgctg cgatgaatgc gaccgttggt ttcacgcttc 360 atgcgtaaag gtgatacgat tgcccgatga agatgaagaa tttgtttgcg tgaaatgcar 420 aaacgatagg gcggagtata tggraatttc gcaatcaaca aatcaggata caaccctaaa 480 ggcactaatt gaagcgttga aaatgagtgg tttgacctca tcaactcata tcaagcgcat 540 gaccctcaat agtttgccaa attttaatgg tacttcaaaa gattggccga agttcaaacg 600 wgcctttgaa gaaaccaccg aggaaggaag tttcagcaat gtagaaaact taaatcgttt 660 acaacatgct ttgaaaggag aagcagaaag atgtgttcgt cgattgtttc tcgaaccaga 720 caacgttcca atcattatga agaaattaga agaacagttc gggagaccag aacaagtgta 780 tcaagattta cttggagaag tattgaaggt tagagtcgag aaccagatga agataccgga 840 tctgtcagat gctttggaag acatgatcat caacctcaaa gccattaaaa aagagggata 900 ccttcaagac catmgwttag tggatgaatt aatttttaaa ttttcgaccg ataaacaatt 960 gaaatggatc gaatttaagt ctagccttga gaaaragaac aaaataccaa cacttgaaca 1020 ttttagtgaa tggctttatc cgatagcaga aaatatcaga aaattgccga aaaggaatga 1080 aagatttcga caacctttaa actttcatcg tcctcaattt tcgccaaatc aacaacgtcc 1140 agcggaaata aacaacaatc gtccgatgaa tcgtccacca caacctcata attctcaccc 1200 gatgaacaca caaacaagac cattcaatac acgtgtgcgg aacgttcaac gatttttcca 1260 gccatgtcca tgttgtcagg gttctcacgc actatatcga tgtgaacgtt ttaagaatat 1320 tccggtacat gaaaggctag aaatagtggt aaacagccag cagtgtcaag catgtttgac 1380 atcgaataat catagccaga acaattgcaa tgcggctaga gaatgcggta ttcaaggatg 1440 cagagaaacc catcatcagt tactgcacac tggagacaca gttcggatga attaccatca 1500 aacaattcaa aatgtctact accagattgt tccagtggtg ttgcgaaaca ataatcacac 1560 gttagaaacg tacgcatttt tagatgccgg gtcatcttta acactcatcg aagagaatac 1620 ggcaaataaa ttgcacttaa acggtgtaac 
cgatccatta accctaacat ggacacaaaa 1680 cctatcggtg caggaaaact gtagcagaag agtgagctgc atgatcaaag gagtgaacga 1740 gaaaaaagaa catttactta atggtatacg gactgtgaaa aatctacaac tgccaagtca 1800 aacgytatct ggcagcatat tagcagcacg ttatccgcat cttaaaggca tcaagctatc 1860 cgactaccag aaagctcgtc caactgtgct gattggatta aaccacagtc atcttttaat 1920 gccgcttggt cgaaagatgg gacgaccgga agaaccaatg gcgattaaaa ccaaacttgg 1980 atggttgatt tttggcatcg acaaaatatg tctatcagaa acaaatcact taatgattca 2040 caagagcgag gatttgatga ctgaaatgat gcgacgatat ttttccaccg aagagtttgg 2100 cgttaaaccg gtgaaaacag taaaatctca agcattagag agagcggaga ccatcatcga 2160 gaaaacactg aagaaaacgg atggtagata cgaagtaggc ttgctatgga gagatgatga 2220 cgtcatattg ccgaacagct atagcaacgc gctccgtcgc ctagcaacgc aagagaaaca 2280 gctagcaaag gaccctggtt tgaaaaattg gctgtgcaaa acatttgaag agtaccagca 2340 gaaaggatac atacgcaagc ttacgaaaga agaacttaaa cggcattcac aaaaaatatt 2400 ttacattcct cactttgtgg tagtgaacaa aaataagccg ataccaaagc cgaggttagt 2460 atttgacgca gcagcaaaag tcaacggcgt ttcactaaat tcactattat taactggtcc 2520 agacgagatg gcatctttat ttggagtgct ycttcgattc cgtgaaggac ccatctgtgt 2580 gtgtggtgac atcaaggaga tgtttcacca agtgaaaatt cggaaggaag atcaagatgc 2640 gcagcgaatt ttatggcgtg acggcgatag taccagaaca ccagatactt acgtcatgca 2700 ggttatgacg ttcggagcaa cttgctcacc ggcatgtgcg caggtagtga aaaatcgcaa 2760 tgcggaagcg aatagtgaaa tgtacccgtt agcacttgaa ccaattcgaa accagcatta 2820 cgtggatgat tacctcgata gtttcttttc catggaaaag gcgatcaaaa cggtgactga 2880 ggtgattcag gttcacgaaa atggtggatt ccatataaga aacttcatct cgaataagcg 2940 agagttgatg gaagctattc ctcaagaacg acatcaggtg aaggctatcg tcgatataaa 3000 ggaaaaagat tcgtgtgtcg agaagatcct gggtgtccag tggaacacac aattagattg 3060 cttcgggtac aaggtagatg ttggacggat aaatctggaa aagaagccaa ccaaaagaga 3120 agctttgagc ttcgtaatga gtgcatatga tccacttggc ctcattagtc acataaccat 3180 ccagggacgt attctaatgc aatctatcaa cgccgcaaca aatgattggg atacgcagat 3240 tcctgatagt ttgcatggaa aatggatcga atggctcaaa atgatcacca gcgttaaaga 3300 tttagccatc ccaagaccaa ttgttgcttc aattatgaat 
ccagttgaaa tccacacatt 3360 tgtagacgct tcacaggaag catttgctgc agccgtatat gcaagaagca atttcaatgg 3420 ttgctttgtc gtacgactag ttgcagcaaa gtctagagtg gctccaacga aggctttgtc 3480 aataccaaaa ctcgagctac aagcagcagt tttaggagcc agattgactg ctagtgtaat 3540 taaagagctt cgattaaaga ttagtcgtac aatattctgg agtgattcga aaactgttct 3600 tgcgtggatt aacagcgaac atagaaagta tgaagttttc gtatcgcatc gtgtaagtga 3660 aatattggac acaaccagtg ccaaccaatg gagatgggta tccacaaagg ataatccagc 3720 cgatattgcc acgaaaatca ccagcaattc gtggacatgg tacaatggtc cacagttttt 3780 acaagagcat gaacgcgact ggcctggagt acatgaagta gctgtggaac agcacctcgt 3840 taccgtgcac aaagaaacaa ttgctccaga atactatttt tcatcgtggg acaaattgct 3900 gaagcaccaa acagttatga aaaaatacgt cgatttcttg aagaaccgga gcaatttttc 3960 tcgaaccata tcgtacaggg acatggaaac ggctaaactc tcattactgc gaaaagctca 4020 gtgggaaggt tttcctgaag agatggaagc tctcgaaaaa agaaaggaga tttcgaacaa 4080 aagcagcatt cgaacactag taccgtttct agatgaaaaa ggaatattaa gatcaagagg 4140 ccgtctagaa aatgcatcat gcttgccata cagcgcgcgt ttgccaatta tcttacccca 4200 aaggtcacgt gttgtgaaac ttctcgttcg agactaccac gagaaatata tgcaccaggc 4260 agataatgtt gtaataggag ttttaagaca aaattactgg attatcaatc tacgaacagt 4320 cttaaaaaat gtgaaaagtt gctgccaaag atgtattttg aacaccgctg ctcctaaagc 4380 acctctcatg gcaccgcttc caacatatag aacgcatcca tacaatcctc catttctaca 4440 tacaggagta gattactttg gtcctctaga tgtgaccgtt aagcgctcta ctgagaaacg 4500 gtggggagcg atatttacct gtatgagcac aagagctgta cacctggaat tagcagagaa 4560 actggacacc gatagtttca tggtgtgttt gaataacttc cttcatcgcc gcggaaagat 4620 aacacatctg tacagtgaca acggtacaaa ttttgttggc gcagaaaaag aattaaagaa 4680 aattgtcgaa gatatcgatc taagaatggg acgtgaggct gcactaaaat ataaaataga 4740 atggaagttc aacccgcctg cagcaccgca cttcggaggt tcctgggaac gtctgattca 4800 aaacataaag aaagcattgc gacatatggt tactgaatgg aagacgcggc acccaacacc 4860 agaaacatta agagcgactt taattcagat cgaatccatt ttaaattcgc gacctttaac 4920 acatttgccg ctaacgtcag aagaagaaga ggttcttaca ccattccatt tcttaattgg 4980 tagaggtgta gattccttac cagcacctac tgaaacatcg caagtggatc 
ggcaacaatt 5040 tagattggca cagcacaatg ccaaaacatt ctgggatcga tggaaaaagg aatacctgcc 5100 tactttgatc aaacgaaata agtggacgtg caaggtagaa cccatcaagg ttaatgatat 5160 agtgatcatc acgaatgata acgctccacc tggacaatgg cttaaaggaa gartagtaga 5220 gactgcaaca gcaccagatg gacaagttag atcagtgtcg gtcaaaacag tccaaggtgt 5280 aataaaacga ccagcagtga aggttgcagt catagacgtg aagcagaagg agcatttgct 5340 gtttgtgaag aagcctccgc attcaccaac aaaaagaaaa gtgctatctg aagaaaattt 5400 gcagaaagtc ccgccaaaga aacgtatgat agctccatgc aattgggctc cgaaacttgt 5460 acaacagttg aaaaaagaag atgatgtatg aatcgcggcc aatatagcat aatccattgg 5520 gatgaattat ggtggggaga a 5541 //