======== SOAPdenovo2 ========== ['SOAPdenovo-63mer', 'all', '-s', 'SOAPdenovo2.config', '-o', 'SOAPdenovo2', '-K', '55', '-N', '184000000', '-R', '-M', '1', '-L', '200', '-p', '32'] Start time: 2021-06-15 15:23:50.040000 Version 2.04: released on July 13th, 2012 Compile Apr 5 2019 17:00:46 ******************** Pregraph ******************** Parameters: pregraph -s SOAPdenovo2.config -K 55 -p 32 -R -o SOAPdenovo2 In SOAPdenovo2.config, 5 lib(s), maximum read length 100, maximum name length 256. 32 thread(s) initialized. Import reads from file: /local/genbank/workspace/reads.all3/2008-09allU.fq Import reads from file: /local/genbank/workspace/reads.all3/3-6.2-7.2GBallU.fq Import reads from file: /local/genbank/workspace/reads.all3/AGRFUall.fq Import reads from file: /local/genbank/workspace/reads.all3/454U.fq Import reads from file: /local/genbank/workspace/reads.all3/20090202_7_R1.fq Import reads from file: /local/genbank/workspace/reads.all3/20090202_7_R2.fq Import reads from file: /local/genbank/workspace/reads.all3/20090202_8_R1.fq Import reads from file: /local/genbank/workspace/reads.all3/20090202_8_R2.fq Import reads from file: /local/genbank/workspace/reads.all3/Algae6.2GB_8_R1.fq Import reads from file: /local/genbank/workspace/reads.all3/Algae6.2GB_8_R2.fq Import reads from file: /local/genbank/workspace/reads.all3/Algae7.2GB_6_R1.fq Import reads from file: /local/genbank/workspace/reads.all3/Algae7.2GB_6_R2.fq Import reads from file: /local/genbank/workspace/reads.all3/Algae7.2GB_7_R1.fq Import reads from file: /local/genbank/workspace/reads.all3/Algae7.2GB_7_R2.fq Import reads from file: /local/genbank/workspace/reads.all3/Algae7.2GB_8_R1.fq Import reads from file: /local/genbank/workspace/reads.all3/Algae7.2GB_8_R2.fq --- 100000000th reads. Import reads from file: /local/genbank/workspace/reads.all3/AB2kb_R1.fq Import reads from file: /local/genbank/workspace/reads.all3/AB2kb_R2.fq Import reads from file: /local/genbank/workspace/reads.all3/AB4kb_R1.fq Import reads from file: /local/genbank/workspace/reads.all3/AB4kb_R2.fq Time spent on hashing reads: 543s, 194357892 read(s) processed. LIB(s) information: [LIB] 0, avg_ins 0, reverse 0. [LIB] 1, avg_ins 250, reverse 0. [LIB] 2, avg_ins 250, reverse 0. [LIB] 3, avg_ins 1591, reverse 0. [LIB] 4, avg_ins 2228, reverse 0. 866243586 node(s) allocated, 4834341919 kmer(s) in reads, 4834341919 kmer(s) processed. done hashing nodes 808269269 linear node(s) marked. Time spent on marking linear nodes: 16s. Time spent on pre-graph construction: 560s. Start to remove frequency-one-kmer tips shorter than 110. Total 28232704 tip(s) removed. 32 thread(s) initialized. 12462367 linear node(s) marked. Start to remove tips with minority links. 6653761 tip(s) removed in cycle 1. 118596 tip(s) removed in cycle 2. 1453 tip(s) removed in cycle 3. 8 tip(s) removed in cycle 4. 0 tip(s) removed in cycle 5. Total 6773818 tip(s) removed. 32 thread(s) initialized. 0 linear node(s) marked. Time spent on removing tips: 1796s. 4513517 (2257210) edge(s) and 233832 extra node(s) constructed. Time spent on constructing edges: 767s. In file: SOAPdenovo2.config, max seq len 100, max name len 256. 32 thread(s) initialized. Import reads from file: /local/genbank/workspace/reads.all3/2008-09allU.fq Import reads from file: /local/genbank/workspace/reads.all3/3-6.2-7.2GBallU.fq Import reads from file: /local/genbank/workspace/reads.all3/AGRFUall.fq Import reads from file: /local/genbank/workspace/reads.all3/454U.fq Import reads from file: /local/genbank/workspace/reads.all3/20090202_7_R1.fq Import reads from file: /local/genbank/workspace/reads.all3/20090202_7_R2.fq Import reads from file: /local/genbank/workspace/reads.all3/20090202_8_R1.fq Import reads from file: /local/genbank/workspace/reads.all3/20090202_8_R2.fq Import reads from file: /local/genbank/workspace/reads.all3/Algae6.2GB_8_R1.fq Import reads from file: /local/genbank/workspace/reads.all3/Algae6.2GB_8_R2.fq Import reads from file: /local/genbank/workspace/reads.all3/Algae7.2GB_6_R1.fq Import reads from file: /local/genbank/workspace/reads.all3/Algae7.2GB_6_R2.fq Import reads from file: /local/genbank/workspace/reads.all3/Algae7.2GB_7_R1.fq Import reads from file: /local/genbank/workspace/reads.all3/Algae7.2GB_7_R2.fq Import reads from file: /local/genbank/workspace/reads.all3/Algae7.2GB_8_R1.fq Import reads from file: /local/genbank/workspace/reads.all3/Algae7.2GB_8_R2.fq --- 100000000th reads. Import reads from file: /local/genbank/workspace/reads.all3/AB2kb_R1.fq Import reads from file: /local/genbank/workspace/reads.all3/AB2kb_R2.fq Import reads from file: /local/genbank/workspace/reads.all3/AB4kb_R1.fq Import reads from file: /local/genbank/workspace/reads.all3/AB4kb_R2.fq 194357892 read(s) processed. Time spent on: importing reads: 155s, chopping reads to kmers: 47s, searching kmers: 341s, aligning reads to edges: 79s, searching (K+1)mers: 55s, adding pre-arcs: 39s, recording read paths: 43s. 257728047 marker(s) output. Reads alignment done, 34823928 read(s) deleted, 3014500 pre-arc(s) added. LIB(s) information: [LIB] 0, avg_ins 0, reverse 0. [LIB] 1, avg_ins 250, reverse 0. [LIB] 2, avg_ins 250, reverse 0. [LIB] 3, avg_ins 1591, reverse 0. [LIB] 4, avg_ins 2228, reverse 0. Time spent on aligning reads: 774s. 2521092 vertex(es) output. Overall time spent on constructing pre-graph: 65m. ******************** Contig ******************** Parameters: contig -g SOAPdenovo2 -M 1 -R -s SOAPdenovo2.config -p 32 There are 2521092 kmer(s) in vertex file. There are 4513517 edge(s) in edge file. Kmers sorted. 4513517 edge(s) input. 3907220 pre-arcs loaded. 38871210 markers overall. 38871210 markers loaded. 1127100 none-palindrome edge(s) swapped, 0 palindrome edge(s) processed. 4513517 edge(s) sorted. Arcs sorted. 7052 repeat(s) are solvable, 14108 more edge(s). 28216 dead arc(s) removed. Time spent on solving repeat: 1s. Start to pinch bubbles, cutoff 0.100000, MAX NODE NUM 3, MAX DIFF NUM 2. 942604 start points, 1673101 dheap nodes. 257961 pair(s) found, 23579 pair of path(s) compared, 17159 pair(s) merged. Sequence comparison failed: Path crossing deleted edge 0 Length difference of two paths greater than two 2267 Mismatch score greater than cutoff (2) 2411 Mismatch score ratio greater than cutoff (0.1) 0 Path length shorter than (Kmer-1) 1742 DFibHeap: 70026 node(s) allocated. 98796 edge(s) concatenated in cycle 1. 1316 edge(s) concatenated in cycle 2. 3 edge(s) concatenated in cycle 3. 0 edge(s) concatenated in cycle 4. Time spent on pinching bubbles: 6s. Start to destroy weak inner edges. 60854 weak inner edge(s) destroyed in cycle 1. 585 weak inner edge(s) destroyed in cycle 2. 10 weak inner edge(s) destroyed in cycle 3. 1 weak inner edge(s) destroyed in cycle 4. 0 weak inner edge(s) destroyed in cycle 5. 121540 dead arc(s) removed. 31208 inner edge(s) with coverage lower than or equal to 1 destroyed. 64794 dead arc(s) removed. 141301 edge(s) concatenated in cycle 1. 575 edge(s) concatenated in cycle 2. 0 edge(s) concatenated in cycle 3. Before compacting, 4527625 edge(s) existed. After compacting, 3821287 edge(s) left. Strict: 0, cutoff length: 110. 716424 tips cut in cycle 1. 82642 tips cut in cycle 2. 14372 tips cut in cycle 3. 4084 tips cut in cycle 4. 1768 tips cut in cycle 5. 1004 tips cut in cycle 6. 604 tips cut in cycle 7. 393 tips cut in cycle 8. 281 tips cut in cycle 9. 247 tips cut in cycle 10. 144 tips cut in cycle 11. 109 tips cut in cycle 12. 96 tips cut in cycle 13. 76 tips cut in cycle 14. 41 tips cut in cycle 15. 36 tips cut in cycle 16. 25 tips cut in cycle 17. 25 tips cut in cycle 18. 13 tips cut in cycle 19. 13 tips cut in cycle 20. 14 tips cut in cycle 21. 13 tips cut in cycle 22. 8 tips cut in cycle 23. 6 tips cut in cycle 24. 5 tips cut in cycle 25. 3 tips cut in cycle 26. 5 tips cut in cycle 27. 4 tips cut in cycle 28. 5 tips cut in cycle 29. 3 tips cut in cycle 30. 1 tips cut in cycle 31. 0 tips cut in cycle 32. 360662 dead arc(s) removed. 285380 edge(s) concatenated in cycle 1. 6303 edge(s) concatenated in cycle 2. 21 edge(s) concatenated in cycle 3. 0 edge(s) concatenated in cycle 4. Before compacting, 3821287 edge(s) existed. After compacting, 1592951 edge(s) left. There are 597154 contig(s) longer than 100, sum up 267397120 bp, with average length 447. The longest length is 110960 bp, contig N50 is 612 bp,contig N90 is 196 bp. 796920 contig(s) longer than 56 output. Time spent on constructing contig: 1m. ******************** Map ******************** Parameters: map -s SOAPdenovo2.config -g SOAPdenovo2 -p 32 -K 55 Kmer size: 55. Contig length cutoff: 57. 796920 contig(s), maximum sequence length 110960, minimum sequence length 56, maximum name length 10. Time spent on parsing contigs file: 0s. 32 thread(s) initialized. Time spent on hashing contigs: 23s. 237248380 node(s) allocated, 237697391 kmer(s) in contigs, 237697391 kmer(s) processed. Time spent on graph construction: 25s. Time spent on aligning long reads: 0s. In file: SOAPdenovo2.config, max seq len 100, max name len 256 32 thread(s) initialized. 1592951 edge(s) in the graph. Import reads from file: /local/genbank/workspace/reads.all3/20090202_7_R1.fq Import reads from file: /local/genbank/workspace/reads.all3/20090202_7_R2.fq Current insert size is 250, map_len is 32. Import reads from file: /local/genbank/workspace/reads.all3/20090202_8_R1.fq Import reads from file: /local/genbank/workspace/reads.all3/20090202_8_R2.fq Import reads from file: /local/genbank/workspace/reads.all3/Algae6.2GB_8_R1.fq Import reads from file: /local/genbank/workspace/reads.all3/Algae6.2GB_8_R2.fq Current insert size is 250, map_len is 32. Import reads from file: /local/genbank/workspace/reads.all3/Algae7.2GB_6_R1.fq Import reads from file: /local/genbank/workspace/reads.all3/Algae7.2GB_6_R2.fq Import reads from file: /local/genbank/workspace/reads.all3/Algae7.2GB_7_R1.fq Import reads from file: /local/genbank/workspace/reads.all3/Algae7.2GB_7_R2.fq Import reads from file: /local/genbank/workspace/reads.all3/Algae7.2GB_8_R1.fq Import reads from file: /local/genbank/workspace/reads.all3/Algae7.2GB_8_R2.fq Import reads from file: /local/genbank/workspace/reads.all3/AB2kb_R1.fq Import reads from file: /local/genbank/workspace/reads.all3/AB2kb_R2.fq Current insert size is 1591, map_len is 35. --- 100000000th reads. Import reads from file: /local/genbank/workspace/reads.all3/AB4kb_R1.fq Import reads from file: /local/genbank/workspace/reads.all3/AB4kb_R2.fq Current insert size is 2228, map_len is 35. Total reads 145557758 Reads in gaps 35247865 Ratio 24.2% Reads on contigs 114337330 Ratio 78.6% 4 pe insert size, the largest boundary is 145557758. LIB(s) information: [LIB] 0, avg_ins 0, reverse 0. [LIB] 1, avg_ins 250, reverse 0. [LIB] 2, avg_ins 250, reverse 0. [LIB] 3, avg_ins 1591, reverse 0. [LIB] 4, avg_ins 2228, reverse 0. Time spent on aligning reads: 1553s. Overall time spent on alignment: 26m. ******************** Scaff ******************** Parameters: scaff -g SOAPdenovo2 -p 32 -L 200 -N 184000000 gzip: stdout: Broken pipe Files for scaffold construction are OK. There are 4 grad(s), 145557758 read(s), max read len 100. Kmer size: 55 There are 1592951 edge(s) in edge file. Mask contigs with coverage lower than 1.0 or higher than 20.0, and strict length 0. Average contig coverage is 10, 258925 contig(s) masked. Mask contigs shorter than 200, 744728 contig(s) masked. 865120 arc(s) loaded, average weight is 8. 796920 contig(s) loaded. Done loading updated edges. Time spent on loading updated edges: 22s. ***************************************************** Start to load paired-end reads information. For insert size: 250 Total PE links 1638363 Normal PE links on same contig 612891 Incorrect oriented PE links 37 PE links of too small insert size 222558 PE links of too large insert size 0 Correct PE links 802803 Accumulated connections 794518 Use contigs longer than 250 to estimate insert size: PE links 595547 Average insert size 253 SD 64 397259 new connections. For insert size: 250 Total PE links 17735168 Normal PE links on same contig 10078277 Incorrect oriented PE links 489 PE links of too small insert size 861270 PE links of too large insert size 0 Correct PE links 6793887 Accumulated connections 2795256 Use contigs longer than 250 to estimate insert size: PE links 9811685 Average insert size 250 SD 65 1397628 new connections. For insert size: 1591 Total PE links 19558107 Normal PE links on same contig 1077753 Incorrect oriented PE links 3075 PE links of too small insert size 10225316 PE links of too large insert size 0 Correct PE links 7291866 Accumulated connections 12947224 Use contigs longer than 1591 to estimate insert size: PE links 749246 Average insert size 314 SD 347 6473612 new connections. For insert size: 2228 Total PE links 8138638 Normal PE links on same contig 1816134 Incorrect oriented PE links 1623 PE links of too small insert size 873044 PE links of too large insert size 0 Correct PE links 4948161 Accumulated connections 3099120 Use contigs longer than 2228 to estimate insert size: PE links 600815 Average insert size 256 SD 233 1549560 new connections. All paired-end reads information loaded. Time spent on loading paired-end reads information: 159s. ***************************************************** Start to construct scaffolds. *************************** For insert size: 250 Total PE links 1794889 PE links to masked contigs 1657502 On same scaffold PE links 0 Cutoff of PE links to make a reliable connection: 3 Active connections 234056 Weak connections 67698 Weak ratio 28.9% 6 circles removed. Start to remove transitive connection. Total contigs 1592951 Masked contigs 1003677 Remained contigs 589274 None-outgoing-connection contigs 423556 (71.877602%) Single-outgoing-connection contigs 165182 Multi-outgoing-connection contigs 13 Cycle 1 Two-outgoing-connection contigs 523 Potential transitive connections 0 Transitive connections 0 Transitive ratio 0.0% Start to linearize sub-graph. Picked sub-graphs 522 Connection-conflict 0 Significant overlapping 510 Eligible 0 Bubble structures 3 Mask repeats: Puzzles 437 Masked contigs 437 Start to remove transitive connection. Total contigs 1592951 Masked contigs 1004557 Remained contigs 588394 None-outgoing-connection contigs 424258 (72.104408%) Single-outgoing-connection contigs 164136 Multi-outgoing-connection contigs 0 Cycle 1 Two-outgoing-connection contigs 0 Potential transitive connections 0 Transitive connections 0 Start to linearize sub-graph. Picked sub-graphs 0 Connection-conflict 0 Significant overlapping 0 Eligible 0 Bubble structures 0 Start to mask puzzles. Masked contigs 0 Remained puzzles 0 Freezing done. Rank 1 Scaffold number 48047 In-scaffold contig number 596985 Total scaffold length 85811694 Average scaffold length 1785 Filled gap number 3251 Longest scaffold 127533 Scaffold and singleton number 515087 Scaffold and singleton length 239028477 Average length 464 N50 1108 N90 158 Report from smallScaf: 48047 scaffolds by smallPE. *************************** For insert size: 1591 Total PE links 6473613 PE links to masked contigs 5168616 On same scaffold PE links 17024 Cutoff of PE links to make a reliable connection: 5 Report from checkScaf: 0 scaffold segments broken. Add large insert size PE links: 10095 orientation-conflict links, 84 contigs acrossed by normal links. Active connections 2350498 Weak connections 2324420 Weak ratio 98.9% 599 circles removed. Start to remove transitive connection. Total contigs 1592951 Masked contigs 1005997 Remained contigs 586954 None-outgoing-connection contigs 404420 (68.901482%) Single-outgoing-connection contigs 177184 Multi-outgoing-connection contigs 335 Cycle 1 Two-outgoing-connection contigs 5015 Potential transitive connections 1793 Transitive connections 537 Transitive ratio 10.7% Cycle 2 Two-outgoing-connection contigs 4466 Potential transitive connections 1255 Transitive connections 0 Transitive ratio 0.0% Start to linearize sub-graph. Picked sub-graphs 4265 Connection-conflict 33 Significant overlapping 2684 Eligible 5 Bubble structures 0 Mask repeats: Puzzles 1981 Masked contigs 1671 Start to remove transitive connection. Total contigs 1592951 Masked contigs 1009339 Remained contigs 583612 None-outgoing-connection contigs 403388 (69.119209%) Single-outgoing-connection contigs 180008 Multi-outgoing-connection contigs 5 Cycle 1 Two-outgoing-connection contigs 211 Potential transitive connections 121 Transitive connections 31 Transitive ratio 14.7% Cycle 2 Two-outgoing-connection contigs 180 Potential transitive connections 90 Transitive connections 0 Transitive ratio 0.0% Start to linearize sub-graph. Picked sub-graphs 179 Connection-conflict 0 Significant overlapping 134 Eligible 0 Bubble structures 0 Start to mask puzzles. Masked contigs 90 Remained puzzles 0 Freezing done. Rank 2 Scaffold number 45633 In-scaffold contig number 596985 Total scaffold length 110068527 Average scaffold length 2412 Filled gap number 2522 Longest scaffold 127533 Scaffold and singleton number 506714 Scaffold and singleton length 254309154 Average length 501 N50 1587 N90 163 *************************** For insert size: 2228 Total PE links 1549560 PE links to masked contigs 1194348 On same scaffold PE links 37503 Cutoff of PE links to make a reliable connection: 5 Report from checkScaf: 3 scaffold segments broken. Add large insert size PE links: 10282 orientation-conflict links, 2816 contigs acrossed by normal links. Active connections 432390 Weak connections 294242 Weak ratio 68.1% 1692 circles removed. Start to remove transitive connection. Total contigs 1592951 Masked contigs 1013357 Remained contigs 579594 None-outgoing-connection contigs 290028 (50.039856%) Single-outgoing-connection contigs 244733 Multi-outgoing-connection contigs 5850 Cycle 1 Two-outgoing-connection contigs 38983 Potential transitive connections 13866 Transitive connections 3208 Transitive ratio 8.2% Cycle 2 Two-outgoing-connection contigs 35642 Potential transitive connections 10650 Transitive connections 7 Transitive ratio 0.0% Cycle 3 Two-outgoing-connection contigs 35634 Potential transitive connections 10643 Transitive connections 0 Transitive ratio 0.0% Start to linearize sub-graph. Picked sub-graphs 34751 Connection-conflict 478 Significant overlapping 26813 Eligible 85 Bubble structures 0 Mask repeats: Puzzles 18690 Masked contigs 13822 Start to remove transitive connection. Total contigs 1592951 Masked contigs 1041001 Remained contigs 551950 None-outgoing-connection contigs 290606 (52.650787%) Single-outgoing-connection contigs 258198 Multi-outgoing-connection contigs 108 Cycle 1 Two-outgoing-connection contigs 3038 Potential transitive connections 2082 Transitive connections 729 Transitive ratio 24.0% Cycle 2 Two-outgoing-connection contigs 2303 Potential transitive connections 1354 Transitive connections 0 Transitive ratio 0.0% Start to linearize sub-graph. Picked sub-graphs 2059 Connection-conflict 20 Significant overlapping 1261 Eligible 0 Bubble structures 0 Non-strict linearization. Start to linearize sub-graph. Picked sub-graphs 1088 Connection-conflict 15 Significant overlapping 744 Eligible 0 Bubble structures 0 Start to mask puzzles. Masked contigs 450 Remained puzzles 0 Freezing done. Recover contigs. Total recovered contigs 201 Single-route cases 192 Multi-route cases 5 All links loaded. Time spent on constructing scaffolds: 92s. The final rank ******************************* Scaffold number 43975 In-scaffold contig number 596985 Total scaffold length 199946808 Average scaffold length 4546 Filled gap number 1299 Longest scaffold 131125 Scaffold and singleton number 460831 Scaffold and singleton length 317791674 Average length 689 N50 4205 N90 182 Weak points 4 ******************************* 1000 scaffolds processed. 2000 scaffolds processed. 3000 scaffolds processed. 4000 scaffolds processed. 5000 scaffolds processed. 6000 scaffolds processed. 7000 scaffolds processed. 8000 scaffolds processed. 9000 scaffolds processed. 10000 scaffolds processed. 11000 scaffolds processed. 12000 scaffolds processed. 13000 scaffolds processed. 14000 scaffolds processed. 15000 scaffolds processed. 16000 scaffolds processed. 17000 scaffolds processed. 18000 scaffolds processed. 19000 scaffolds processed. 20000 scaffolds processed. 21000 scaffolds processed. 22000 scaffolds processed. 23000 scaffolds processed. 24000 scaffolds processed. 25000 scaffolds processed. 26000 scaffolds processed. 27000 scaffolds processed. 28000 scaffolds processed. 29000 scaffolds processed. 30000 scaffolds processed. 31000 scaffolds processed. 32000 scaffolds processed. 33000 scaffolds processed. 34000 scaffolds processed. 35000 scaffolds processed. 36000 scaffolds processed. 37000 scaffolds processed. 38000 scaffolds processed. 39000 scaffolds processed. 40000 scaffolds processed. 41000 scaffolds processed. 42000 scaffolds processed. 43000 scaffolds processed. Done with 43975 scaffolds, 0 gaps finished, 136526 gaps overall. Overall time spent on constructing scaffolds: 122m. Time for the whole pipeline: 215m. Finish time: 2021-06-15 18:59:46.381955 Elapsed time: 3:35:56.341955