Tuxedo pipeline

From Crop Genomics Lab.

Latest revision as of 00:30, 2 January 2018

Step 1. fastq dump

  • This step is required only when the reads come from SRA and must be converted to FASTQ
  • Example
  fastq-dump -A SRR203363 SRR203363.sra
  • General form
  fastq-dump -A $sradata_accession $sra_data.sra
  • For paired-end data, add the --split-3 option and check that *_1.fastq and *_2.fastq are created
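
The choice between the plain and the --split-3 form can be sketched as a small helper. This is only a dry-run illustration: the function names and the "paired"/"single" keyword are assumptions made for this example, and the command is printed rather than executed.

```shell
# Print the fastq-dump command for one accession (dry run, nothing is executed).
# $1: SRA accession, $2: "paired" to add --split-3, anything else for single-end.
fastq_dump_cmd() {
    accession=$1
    layout=$2
    if [ "$layout" = "paired" ]; then
        echo "fastq-dump --split-3 -A $accession $accession.sra"
    else
        echo "fastq-dump -A $accession $accession.sra"
    fi
}

# Succeed only if both paired-end output files exist and are non-empty.
check_paired_output() {
    accession=$1
    [ -s "${accession}_1.fastq" ] && [ -s "${accession}_2.fastq" ]
}

fastq_dump_cmd SRR203363 single
```

After a real paired-end run, check_paired_output followed by the accession can be used to confirm that both files were written.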

Step 2. bowtie build (Build index file)

  /location_of_bowtie/bowtie-build reference.fa reference.fa

Step 3. tophat (calculate splice junction)

  /location_of_tophat/tophat -p number_of_threads \
  -G gff3_file_of_genome \
  -o tophat_out_dir \
  reference.fa \
  fastq_file
  • reference.fa here is the index name given to bowtie-build in Step 2

Step 4. cufflinks - make gtf

  /location_of_cufflinks/cufflinks -p number_of_threads \
  -g gff3_file_of_genome \
  -o cufflinks_out_dir \
  tophat_out_dir/accepted_hits.bam
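
Steps 3 and 4 are repeated once per sample, so they are usually wrapped in a loop. The sketch below only prints the commands. The sample names, the thread count, the annotation file name and the _thout/_clout directory suffixes are hypothetical placeholders chosen for illustration.

```shell
# Hypothetical settings; replace with real values before running anything.
THREADS=4
GFF=genes.gff3
INDEX=reference.fa

# Print the tophat (Step 3) and cufflinks (Step 4) commands for one sample.
sample_commands() {
    sample=$1
    echo "tophat -p $THREADS -G $GFF -o ${sample}_thout $INDEX ${sample}.fastq"
    echo "cufflinks -p $THREADS -g $GFF -o ${sample}_clout ${sample}_thout/accepted_hits.bam"
}

# Dry run over two made-up sample names.
for name in sampleA sampleB; do
    sample_commands "$name"
done
```

Giving each sample its own output directories keeps one run from overwriting another and makes the paths needed in Steps 5 and 6 predictable.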

Step 5. cuffmerge - merging gtf

  • List the location of each transcripts.gtf file produced by cufflinks in assemblies.txt, one path per line
  • Example contents of assemblies.txt
  cufflinks_out_dir_1/transcripts.gtf
  cufflinks_out_dir_2/transcripts.gtf
  • Run cuffmerge
  /location_of_cuffmerge/cuffmerge -g gff3_file_used_in_tophat \
  -s reference.fa \
  -p number_of_threads \
  assemblies.txt
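
Writing the list by hand is error-prone when there are many samples, so it can be generated instead. The following is a minimal sketch. The function name write_assemblies and the sample*_clout directory names are assumptions for this example, and the demo works on empty placeholder files in a temporary directory.

```shell
# write_assemblies OUTFILE DIR...
# Write one transcripts.gtf path per line; skip directories that lack the file.
write_assemblies() {
    out=$1
    shift
    : > "$out"
    for dir in "$@"; do
        gtf="$dir/transcripts.gtf"
        if [ -f "$gtf" ]; then
            printf '%s\n' "$gtf" >> "$out"
        else
            echo "warning: $gtf not found, skipped" >&2
        fi
    done
}

# Demo with three made-up cufflinks output directories, one of them incomplete.
demo_dir=$(mktemp -d)
mkdir -p "$demo_dir/sample1_clout" "$demo_dir/sample2_clout" "$demo_dir/sample3_clout"
touch "$demo_dir/sample1_clout/transcripts.gtf" "$demo_dir/sample2_clout/transcripts.gtf"
write_assemblies "$demo_dir/assemblies.txt" "$demo_dir"/sample*_clout
cat "$demo_dir/assemblies.txt"
```

The warning on standard error makes it visible when a cufflinks run did not finish, which would otherwise leave a sample silently out of the merge.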

Step 6. cuffdiff - deg

  /cuffdiff_location/cuffdiff \
  -o cuffdiff_out_dir \
  -b reference.fa \
  -p number_of_threads \
  -L label_of_bam_1,label_of_bam_2 \
  -u \
  merged_asm/merged.gtf \
  tophat_out_dir_1/accepted_hits.bam tophat_out_dir_2/accepted_hits.bam
  • merged_asm/merged.gtf is the GTF file produced by cuffmerge in Step 5; -u takes no argument
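
The -L option needs one label per condition, in the same order as the BAM arguments. A quick pre-flight check can be sketched as follows. The function name labels_match and the leaf/root names are hypothetical, and each positional argument is treated as one condition.

```shell
# labels_match LABELS BAM...
# Succeed when the number of comma-separated labels equals the number of
# condition arguments that follow.
labels_match() {
    labels=$1
    shift
    n_labels=$(printf '%s\n' "$labels" | tr ',' '\n' | wc -l | tr -d ' ')
    [ "$n_labels" -eq "$#" ]
}

# Demo with two made-up conditions.
if labels_match "leaf,root" leaf_thout/accepted_hits.bam root_thout/accepted_hits.bam; then
    echo "labels OK"
else
    echo "label count does not match the number of conditions" >&2
fi
```

The check only compares counts. It does not verify that the BAM files exist or that the labels are in the intended order.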

Step 7. cummeRbund - analysis

  • Start R in the cuffdiff output directory
  cuffdiff_out_dir/$ R
  • Load the cummeRbund package
  library(cummeRbund)