Difference between revisions of "Tips kang"

From Crop Genomics Lab.

Revision as of 05:05, 23 March 2014

Python
Fisher's exact test
from scipy import stats
oddsratio, pvalue = stats.fisher_exact([[A,B], [C, D]]) [1]
63:/home/k821209/py/NGS/vcfq2fa.py : converts an fq file produced by vcfutil into fa
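
As a rough illustration of what the Fisher's exact test call above computes, here is a minimal standard-library sketch of the two-sided test on a 2x2 table. The function name fisher_exact_two_sided is hypothetical and this is not scipy's implementation; it only assumes the usual definition (sum the hypergeometric probabilities of all tables with the same margins that are no more likely than the observed one).

```python
from math import comb


def fisher_exact_two_sided(table):
    """Two-sided Fisher's exact test for a 2x2 table [[a, b], [c, d]].

    Hypothetical helper for illustration; returns (oddsratio, pvalue).
    """
    (a, b), (c, d) = table
    row1, row2 = a + b, c + d
    col1 = a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of seeing x in the top-left cell
        # when the row and column totals are held fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo = max(0, col1 - row2)
    hi = min(row1, col1)
    # Sum every table that is at most as likely as the observed one;
    # the small relative tolerance guards against float round-off.
    pvalue = sum(
        p for p in (prob(x) for x in range(lo, hi + 1))
        if p <= p_obs * (1 + 1e-7)
    )
    if b * c == 0:
        oddsratio = float("inf") if a * d > 0 else float("nan")
    else:
        oddsratio = (a * d) / (b * c)
    return oddsratio, min(pvalue, 1.0)


if __name__ == "__main__":
    print(fisher_exact_two_sided([[8, 2], [1, 5]]))
```

For real analyses the scipy call is the better choice; the sketch is only meant to show where the p-value comes from.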

Excel
=TEXT(2.2323,"(0.00)")
(2.23)

Software
GATK pipe

  1. bwa mem -M -t 10 Va.ref.fa ysp-2_1.fastq.gz ysp-2_2.fastq.gz | /data/program/samtools-0.1.19/samtools view -Sb - | /data/program/samtools-0.1.19/samtools sort - ysp.bwamem.Va.ref.fa.sort # the GATK pipeline requires the -M option
  2. /data/program/jdk1.7.0_25/bin/java -jar /data/program/picard-tools-1.91/MarkDuplicates.jar INPUT=ysp.bwamem.Va.ref.fa.sort.bam OUTPUT=ysp.bwamem.Va.ref.fa.sort.bam.dedup.bam METRICS_FILE=metrics.txt MAX_FILE_HANDLES_FOR_READ_ENDS_MAP=1000
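
The two GATK pipe commands above share one output prefix: samtools 0.1.19 `sort` takes a prefix and writes `<prefix>.bam`, which is why MarkDuplicates reads `...sort.bam`. A small sketch that builds both command lines for another sample might look like this. The function name build_gatk_prep_commands and its parameters are hypothetical; the program paths are the ones listed above.

```python
# Program locations copied from the commands above.
SAMTOOLS = "/data/program/samtools-0.1.19/samtools"
JAVA = "/data/program/jdk1.7.0_25/bin/java"
MARKDUP_JAR = "/data/program/picard-tools-1.91/MarkDuplicates.jar"


def build_gatk_prep_commands(ref, fq1, fq2, sample, threads=10):
    """Return [alignment command, duplicate-marking command] as strings.

    Hypothetical helper: it only formats the command lines, it does not run them.
    Assumes `ref` is a plain file name, since it is reused inside the prefix.
    """
    prefix = f"{sample}.bwamem.{ref}.sort"
    align = (
        f"bwa mem -M -t {threads} {ref} {fq1} {fq2} | "
        f"{SAMTOOLS} view -Sb - | "
        f"{SAMTOOLS} sort - {prefix}"
    )
    dedup = (
        f"{JAVA} -jar {MARKDUP_JAR} "
        f"INPUT={prefix}.bam OUTPUT={prefix}.bam.dedup.bam "
        "METRICS_FILE=metrics.txt MAX_FILE_HANDLES_FOR_READ_ENDS_MAP=1000"
    )
    return [align, dedup]


if __name__ == "__main__":
    for cmd in build_gatk_prep_commands(
        "Va.ref.fa", "ysp-2_1.fastq.gz", "ysp-2_2.fastq.gz", "ysp"
    ):
        print(cmd)
```

The printed lines can be pasted into a shell or passed to `subprocess.run(cmd, shell=True, check=True)`.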

Maker

  1. /data2/k821209/programs/maker/bin/gff3_merge -d Va.ref_master_datastore_index.log
  2. /data2/k821209/programs/maker/bin/maker_map_ids --prefix=Vang --iterate=1 --suffix=. Va.ref.all.gff > id_map.txt # tool that replaces Maker's complicated gene names with simple ones
    • python3 /data2/k821209/Redbean/maker_pseudo/Va.ref.maker.output/genename_change_nonAnchor.py # personal tool, written because I did not like the names the step above produced
  3. /data2/k821209/programs/maker/bin/map_gff_ids id_map.txt Va.ref.all.gff # tool that applies the new names to the gff
  4. python3 /data2/k821209/Redbean/maker_pseudo/Va.ref.maker.output/header_change.py Va.ref.all.maker.proteins.fasta Vang.scaffold.map # tool that applies the new names to the fasta
  5. /data2/k821209/programs/maker/bin/iprscan2gff3 Va.ref.all.maker.proteins.fasta.tsv.hc.tsv Va.ref.all.gff > Va.ref.all.gff.ipr.gff # tool that converts the InterPro result into a form that can be loaded into JBrowse
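
The renaming steps above come down to applying an old-name to new-name table to every record. The following is a minimal sketch of that idea for FASTA headers. It is not the header_change.py script listed above (whose contents are not shown here), and it assumes the map is a two-column, tab-separated file of old ID and new ID. The function names and the IDs in the usage example are made up.

```python
def load_id_map(lines):
    """Read 'old_id<TAB>new_id' lines into a dict (assumed map format)."""
    mapping = {}
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        if len(fields) >= 2:
            mapping[fields[0]] = fields[1]
    return mapping


def rename_fasta_headers(fasta_lines, mapping):
    """Return FASTA lines with the first header token replaced via mapping.

    IDs missing from the mapping are left unchanged, and any description
    after the ID is kept as-is.
    """
    out = []
    for line in fasta_lines:
        line = line.rstrip("\n")
        if line.startswith(">"):
            parts = line[1:].split(None, 1)
            if not parts:  # bare '>' header, nothing to rename
                out.append(line)
                continue
            new_id = mapping.get(parts[0], parts[0])
            rest = " " + parts[1] if len(parts) > 1 else ""
            out.append(">" + new_id + rest)
        else:
            out.append(line)
    return out


if __name__ == "__main__":
    # Made-up IDs, for illustration only.
    id_map = load_id_map(["old_gene_A-mRNA-1\tVang_example_1.1\n"])
    fasta = [">old_gene_A-mRNA-1 protein AED:0.10\n", "MKTAYIAK\n"]
    print("\n".join(rename_fasta_headers(fasta, id_map)))
```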

Deconseq [2]
Checks Illumina reads for contamination.
63:/data/program/deconseq-standalone-0.4.3
/usr/bin/perl deconseq.pl -keep_tmp_files -f 800_both.fq -dbs bact,vir,arch -dbs_retain gmax

ePCR
Re-PCR
$ famap -tN -b genome.famap org/chr_*.fa
$ fahash -b genome.hash -w 12 -f3 ${PWD}/genome.famap
Work> /data/program/e-PCR-2.3.12/re-PCR -S genome.hash -n1 -g1 SSR.sts -o SSR.sts.mapped
SSR.sts
Mungbean_SSR_ID_1 CAAAAACATGAGTTGCACACAA TCATAACGCAGAACAGCGAA
Mungbean_SSR_ID_2 ATGTGTGTGAGCACCTCGAC TTTGGCCATGCAAGATGTAA
Mungbean_SSR_ID_4 GCGGTTCACCTAGCCATAAA GGACCCTTCTGTGCGTGTAT
Mungbean_SSR_ID_5 GTTTGTGCTGCGGATTCTTT TTGGCAATTTGGACTAAGGC
Mungbean_SSR_ID_7 TTGACCCAAAACTTACCAATTT GCTAAGGACTGGGGGTCTTC
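
To make the SSR.sts layout above concrete (marker ID, forward primer, reverse primer), here is a toy exact-match in-silico PCR in Python. It is not how re-PCR works internally and it allows no mismatches or gaps; it only shows the idea that the forward primer is searched on the given strand and the reverse primer as its reverse complement further downstream. All function names are hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Reverse-complement a DNA string made of A, C, G, T."""
    return seq.translate(_COMPLEMENT)[::-1]


def parse_sts(lines):
    """Parse whitespace-separated 'ID forward reverse' lines into tuples."""
    markers = []
    for line in lines:
        fields = line.split()
        if len(fields) < 3:
            continue  # skip blank or incomplete lines
        markers.append((fields[0], fields[1].upper(), fields[2].upper()))
    return markers


def find_amplicon(template, forward, reverse):
    """Return (start, size) of the first exact amplicon, or None.

    start is the 0-based position of the forward primer on the template;
    size covers both primers and everything between them.
    """
    template = template.upper()
    start = template.find(forward)
    if start == -1:
        return None
    rc = reverse_complement(reverse)
    end = template.find(rc, start + len(forward))
    if end == -1:
        return None
    return start, end + len(rc) - start


if __name__ == "__main__":
    sts = parse_sts(
        ["Mungbean_SSR_ID_1 CAAAAACATGAGTTGCACACAA TCATAACGCAGAACAGCGAA"]
    )
    name, fwd, rev = sts[0]
    # Synthetic template built around the primers, for demonstration only.
    template = "GG" + fwd + "ACGTACGT" + reverse_complement(rev) + "TT"
    print(name, find_amplicon(template, fwd, rev))
```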

MUMmer, aligning a draft genome to a finished genome
$nucmer --prefix=ref_qry ref.fasta qry.fasta
$show-coords -rcl ref_qry.delta > ref_qry.coords
$show-aligns ref_qry.delta refname qryname > ref_qry.aligns
$show-tiling ref_qry.delta > ref_qry.tiling
Recommended paper
Paterson, A.H., M. Freeling, H. Tang and X. Wang. 2010. Insights from the comparison of plant genome sequences. Annual Review of Plant Biology 61: 349-372. [3]

  1. scipy, fisher's exact
  2. deconseq
  3. Paterson, A.H., M. Freeling, H. Tang and X. Wang. 2010. Insights from the comparison of plant genome sequences. Annual Review of Plant Biology 61: 349-372.