Difference between revisions of "Prediction pipeline"
Revision as of 07:37, 17 April 2014
1. Map all reads against the reference genome
2. Retrieve the read pairs in which both mates are unmapped, using the following command:
samtools view -hb -f 12 -F 256 input.bam > output.bam
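The flag values in this command can be checked by hand. The sketch below mimics, for a single FLAG value, how `-f` (all listed bits must be set) and `-F` (none of the listed bits may be set) behave. The bit values come from the SAM specification; the function name `keep_read` is only illustrative and is not part of samtools.

```python
# SAM FLAG bits, as defined in the SAM specification.
READ_UNMAPPED = 0x4    # this read is unmapped
MATE_UNMAPPED = 0x8    # the mate of this read is unmapped
SECONDARY = 0x100      # secondary alignment


def keep_read(flag, require=12, exclude=256):
    """Mimic `samtools view -f require -F exclude` for one FLAG value.

    Illustrative helper only: a read is kept when every bit in `require`
    is set and no bit in `exclude` is set.
    """
    return (flag & require) == require and (flag & exclude) == 0


# 12 is exactly "read unmapped" plus "mate unmapped".
assert READ_UNMAPPED | MATE_UNMAPPED == 12
# 256 is the "secondary alignment" bit.
assert SECONDARY == 256
```

For example, `keep_read(77)` is true (paired, read unmapped, mate unmapped, first in pair), while `keep_read(73)` is false because the read itself is mapped and only its mate is unmapped.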
3. Convert the BAM file to FASTQ using bam2fastq
4. De novo assemble the unmapped read pairs using ABySS with k-mer = 51 and q-value = 20: U-contigs
5. De novo assemble the whole set of sequencing reads using ABySS with k-mer = 51 and q-value = 20: W-contigs
6. Run BLASTN between W-contigs and U-contigs longer than 2 kb with an e-value threshold of 1e-100 (1.blastn_between.unmappedNwhole.py)
7. Retrieve W-contigs that contain a full-length U-contig in the mid-region, not the point-region, with 100% identity (2.ret_single_type.py)
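The containment test in step 7 could be sketched as follows. This is not the code of 2.ret_single_type.py, which is not shown here. It assumes 1-based BLAST coordinates with the U-contig as query and the W-contig as subject, and it reads "mid-region, not point-region" as "the match does not touch either end of the W-contig". The function name and the `min_flank` parameter are hypothetical.

```python
def contains_full_u_contig(pident, q_start, q_end, u_len,
                           s_start, s_end, w_len, min_flank=1):
    """Check one BLASTN hit (U-contig as query, W-contig as subject).

    Assumptions (not taken from the original script):
    - coordinates are 1-based and inclusive, as in BLAST tabular output;
    - "mid-region" means at least `min_flank` bases of the W-contig remain
      on both sides of the match.
    """
    # The alignment must cover the U-contig from its first to its last base.
    full_length = q_start == 1 and q_end == u_len
    # 100% identity over the alignment.
    perfect = pident == 100.0
    # Minus-strand hits report s_start > s_end, so normalise the order.
    s_lo, s_hi = min(s_start, s_end), max(s_start, s_end)
    left_flank = s_lo - 1
    right_flank = w_len - s_hi
    in_middle = left_flank >= min_flank and right_flank >= min_flank
    return full_length and perfect and in_middle
```

A 3,000 bp U-contig matching positions 1001 to 4000 of a 5,000 bp W-contig passes this check; the same U-contig matching positions 1 to 3000 does not, because no W-contig sequence is left on one side.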
8. Run BLASTN between W-contigs and the G. max reference sequence with an e-value threshold of 1e-100 (3.blastn_with_reference_genome.py)
9. Select indel candidates that satisfy the following conditions (4.ret_indel_candidates.py):
- * The indel candidate has a homology pair with G. max on the same chromosome and in the same direction
- * The homology pair must flank the U-contig region
- * The regions of the homology pair on G. max must not overlap
- * The homologous region on G. max must be smaller than 10 kb
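The four conditions of step 9 can be written as a single predicate. The sketch below is one possible reading, not the code of 4.ret_indel_candidates.py. The `Hit` record, its field names, `MAX_REF_SPAN` and `is_indel_candidate` are all hypothetical. It also assumes that "smaller than 10k" refers to the span on G. max from the start of one homologous block to the end of the other.

```python
from dataclasses import dataclass

MAX_REF_SPAN = 10_000  # "smaller than 10k", read here as 10,000 bp (assumption)


@dataclass
class Hit:
    """One BLASTN hit of a W-contig against the reference (hypothetical layout)."""
    chrom: str
    strand: str   # "+" or "-"
    q_start: int  # coordinates on the W-contig, q_start <= q_end
    q_end: int
    s_start: int  # coordinates on the reference, s_start <= s_end
    s_end: int


def is_indel_candidate(left, right, u_start, u_end):
    """Apply the four step-9 conditions to a pair of hits around a U-contig.

    `u_start` and `u_end` give the U-contig position on the W-contig.
    """
    # Condition 1: same chromosome and same direction.
    same_chrom_and_strand = (left.chrom == right.chrom
                             and left.strand == right.strand)
    # Condition 2: the two hits flank the U-contig on the W-contig.
    flanks_u_contig = left.q_end <= u_start and right.q_start >= u_end
    # Condition 3: the two reference regions do not overlap.
    ref_lo, ref_hi = sorted([left, right], key=lambda h: h.s_start)
    no_ref_overlap = ref_lo.s_end < ref_hi.s_start
    # Condition 4: the whole reference span is below the size limit.
    ref_span = ref_hi.s_end - ref_lo.s_start + 1
    return (same_chrom_and_strand and flanks_u_contig
            and no_ref_overlap and ref_span < MAX_REF_SPAN)
```

With this reading, a W-contig whose two flanks map next to each other on the same chromosome, with the U-contig sequence in between, is kept as an insertion relative to the reference.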
10. Retrieve indel candidates using read evidence (5.read_evidence.py)
- * If the reads aligned in the gap between a homology pair appear to have blunt ends, the region could be an indel candidate
- * (This feature can be detected using samtools tview or the CIGAR strings in SAM format)
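One way to inspect CIGAR strings for this kind of evidence is sketched below. It is not the code of 5.read_evidence.py. It assumes that a "blunt end" shows up as soft clipping (the `S` operation) at one end of a read, so that many reads clipped at the same reference position would suggest a breakpoint. The function names are hypothetical.

```python
import re

# One CIGAR element: a length followed by an operation code from the SAM spec.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def parse_cigar(cigar):
    """Split a CIGAR string into (length, operation) tuples."""
    return [(int(length), op) for length, op in CIGAR_OP.findall(cigar)]


def clipped_ends(cigar):
    """Return (left_clip, right_clip) soft-clip lengths for one read.

    Hard clips (H) are ignored because those bases are absent from the read.
    An unavailable CIGAR ("*") yields (0, 0).
    """
    ops = [(length, op) for length, op in parse_cigar(cigar) if op != "H"]
    left = ops[0][0] if ops and ops[0][1] == "S" else 0
    right = ops[-1][0] if len(ops) > 1 and ops[-1][1] == "S" else 0
    return left, right
```

A read with CIGAR `60M40S` aligns for 60 bases and then stops, which is the pattern to look for at the edge of the gap; a read with `100M` spans its position without any clipping.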
11. Design primer sets for the indel candidates