@@ Line 1: / Line 1: @@
-== Ongoing ==
+[[2017 Jan Taeyoung Lab note]]
-. TBLASTX using Jat Species Transcriptome
-== 2017 1.2 ==
-==== Jatropha transcriptome Trinity assembly ====
- raw data : 244:/NGS/NGS/JatrophaCurcas/RNA
- Jatropha species transcriptome assembly: Jct, Jcu, Jin, Jgo, Jci, Jpo, Jmu, Jma, Jac, Rco (listed in 244:/NGS/NGS/JatrophaCurcas/RNA/list). All done.
- Jatropha organ transcriptome assembly: Leaf, Root, Stem, Female flower, Male flower, LG, SG, Y, B. All done.
-==== Cdhit ====
-:/data2/alima90/program/cdhit/cd-hit -i Y.cds.fa -M 10000 -o Y.cds.fa.cdhit -T 5
-:/data2/alima90/program/cdhit/cd-hit -i LG.cds.fa -M 10000 -o LG.cds.fa.cdhit -T 5
-==== UV GBS mapping (w/ joinmap) ====
-:python vcf.parsing.for.mandf.py UV.vcf.SNPonly 3 0.01 except_sample.txt > UV.vcf.SNPonly.except.LowDepthSample.d3.Q30.m0.1.loc
- The loc file was manually edited in Excel
- The genetic map was constructed using JoinMap 4.1
-==== KaKs calculation using scripts provided by MCscanX ====
- '''KaKs calculation between Jatropha species'''
-:python /alima9002/63_backup/Jatropha/CDS/run.kaks.py
-==== Large Insertion Prediction ====
-===== LIP short primer preparation =====
- '''Primer info'''
- >LIP01short_F
- AACTGAACACAGACAATGAA
- >LIP01short_R
- CAATTTATACACCACCTTAC
- >LIP02short_F
- CTCTTTGTATTTGGTGACAA
- >LIP02short_R
- GTATTAGCAGCTTTTGCTTA
- >LIP03short_F
- AATTGTAAGACATATCCCTC
- >LIP03short_R
- CTGCCCCACTAATAATTAAT
- >LIP04short_F
- TAAAAACAGAACTTGTCCAC
- >LIP04short_R
- ATCACAAGACTGAACAAGTA
- >LIP05short_F
- ATTGACATAAGGTTGCATAG
- >LIP05short_R
- CCTTAGCTCTTTTCTTTTGT
- >LIP06short_F
- GAAGGAAGGAAGCAATTATT
- >LIP06short_R
- TGACTTACCCTTTTTACCTT
- >LIP07short_F
- CACATGTTTGTCACTCTAAT
- >LIP07short_R
- GAAGTGAGGCCTAAAATAAA
- >LIP08short_F
- GAATGTATTGTCTTTGATCC
- >LIP08short_R
- GTTGGATTTTGTTCTTTCCA
- >LIP09short_F
- AGAAAAACGTCGATACCAAA
- >LIP09short_R
- CGATTTAGTAACCTTAGAAC
- >LIP10short_F
- ATCTTCAAAATGTCTCTAGG
- >LIP10short_R
- TACAGATATTCTTAGGCAGT
- >LIP11short_F
- TGTAACTCTCAATTAAGCAG
- >LIP11short_R
- ATCTTTCTGTAAGCACTTAG
- >LIP12short_F
- CTAGAACCGATTTGTTCAAA
- >LIP12short_R
- GCAGTTGTTTTGGATTAACA
- >LIP13short_F
- AAAGAGAAAGCAGAGAAATC
- >LIP13short_R
- ATGTATAGATTGGAGGAAAG
- >LIP14short_F
- ATTATGGAAAGGAATTGGAG
- >LIP14short_R
- CCATGTCTAGTATTTACTCA
- >LIP15short_F
- TTAATGACTGATCGTTAGTG
- >LIP15short_R
- CGGGAGTTATGAAAAATAGT
- >LIP24short_F
- AGTATGGTTTCAACATATGG
- >LIP24short_R
- GATATGAAGTTGACATGCTA
- >LIP16short_F
- ATTTAAAAGCTCGTAACTCC
- >LIP16short_R
- GGATAAGCAATTACAACACA
- >LIP17short_F
- CCCAAATTTTTAAATGCACC
- >LIP17short_R
- CTCTTGGAACGTGAAAAATT
- >LIP18short_F
- TTTTCTAGAAGGATTTGTGC
- >LIP18short_R
- CCATGCAAACCCAATTTTAA
- >LIP19short_F
- GTAAAACTAAGGTTGAGCTA
- >LIP19short_R
- CCACAAGTCACAACAATTTA
- >LIP20short_F
- TTATTTGTATGTTGGAGACC
- >LIP20short_R
- CATGGTATATAGGTTTAGGT
- >LIP21short_F
- CATAGAGAGTTTTGGATTAC
- >LIP21short_R
- AAAGAACTGATAGTGTCATG
- >LIP22short_F
- ATATGTACATGTATGGTGTG
- >LIP22short_R
- CCTAAATCTAGCAGAAGATT
- >LIP23short_F
- ATGTATGGAGAAATGGGTTA
- >LIP23short_R
- ATATAGAAATGGAGGTTGCT
- (listed in BACKUP(J:)/박사/Indel Candidate/LIP_short_primer.fa)
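As a rough sanity check on the primer list above, the Wallace rule (Tm ≈ 2(A+T) + 4(G+C)) gives a quick melting-temperature estimate for short oligos. This is a minimal sketch and not the method used in these notes: the function names are hypothetical, and the rule is only approximate for 20-mers, so the gradient PCR remains the real test.

```python
def gc_content(seq):
    """Fraction of G and C bases in a primer sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def wallace_tm(seq):
    """Approximate Tm in degrees Celsius using the Wallace rule."""
    seq = seq.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


# Two primers copied from the list above; add the rest as needed.
primers = {
    "LIP01short_F": "AACTGAACACAGACAATGAA",
    "LIP01short_R": "CAATTTATACACCACCTTAC",
}

for name, seq in primers.items():
    print(name, wallace_tm(seq), round(gc_content(seq), 2))
```

Reading the sequences directly from LIP_short_primer.fa would avoid retyping them; the dictionary is used here only to keep the sketch self-contained.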
- Primer dilution
-== 2017. 1. 3 ==
-===AMORE work===
- python ~/py/ret_fasta_by_gene_name.py   /alima9002/ref/Athaliana/annotation/Athaliana_167_TAIR10.cds.fa gene_list.txt > gene_list.txt.fa
- blastp -db /alima9002/ref/Gmax/annotation/Gmax_275_Wm82.a2.v1.protein.fa -query gene_list.txt.pep.fa -evalue 1e-5 -num_alignments 1 -outfmt 6 -num_threads 6 -out gene_list.txt.pep.fablastp.Gm275.1e-5.out6
-'''Homolog with Ath'''
-Glyma.08G014900
-Glyma.05G208300
-Glyma.20G001900
-Glyma.03G176600
-Glyma.19G177400
-Glyma.03G262600
-Glyma.06G202300
-Glyma.05G021800
-Glyma.17G077700
-Glyma.05G022000
-Glyma.09G234900
-Glyma.19G025000
-Glyma.10G224000
-Glyma.02G081000
-Glyma.20G167800
-Glyma.14G072700
-Glyma.17G252200
-Glyma.17G050500
-Glyma.07G038000
-Glyma.13G109100
-Glyma.16G007200
-Glyma.19G105100
-Glyma.09G283800
-Glyma.20G172700
-Glyma.02G076300
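The homolog list above comes from the tabular (`-outfmt 6`) blastp output. How the list was extracted is not recorded here, so the following is only a minimal sketch, assuming the default 12-column layout (query ID in column 1, subject ID in column 2, bit score in column 12) and keeping the highest-scoring subject per query. The function name `best_hits` is hypothetical.

```python
def best_hits(lines):
    """Return {query_id: subject_id} with the highest bit score per query.

    Expects BLAST tabular lines (-outfmt 6, default 12 columns).
    """
    best = {}
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comments
        cols = line.rstrip("\n").split("\t")
        query, subject, bitscore = cols[0], cols[1], float(cols[11])
        if query not in best or bitscore > best[query][1]:
            best[query] = (subject, bitscore)
    return {query: hit[0] for query, hit in best.items()}


def best_hits_from_file(path):
    """Convenience wrapper that reads the BLAST output from a file."""
    with open(path) as handle:
        return best_hits(handle)
```

For example, `sorted(set(best_hits_from_file("gene_list.txt.pep.fablastp.Gm275.1e-5.out6").values()))` would give a unique list of subject IDs; trimming transcript or protein suffixes from those IDs is left out because the ID format is not shown in the notes.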
-'''SNP typing among IT182932, IT109098, Hwangkeum-Kong'''
-.Read mapping using bwa mem with default options (/home/hayasen/Workspace/Glycine/GlycineMax/ver275/Reads/)
-.mpileup
- samtools mpileup -f /alima9002/ref/Gmax/assembly/Gmax_275_v2.0.fa -v -t DP,AD,ADF,ADR,SP,INFO/AD,INFO/ADF,INFO/ADR -u -b bam_list | bcftools call -v -m -O v > Variant.vcf
-===LIP short primer gradient PCR===
-~8 primers were tested with CS-12
-Gradient: lower temperature 50 °C, upper temperature 65 °C
-Samples were loaded on a 1% agarose gel and run at 100 V for 1 hour.
-{| class="wikitable"
-|-
-| 52.7 || 54.1 || 55.5 || 56.8 || 58.2 || 59.5 || 60.9 || 62.3
-|}
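The eight temperatures in the table above match columns 3 to 10 of a 12-column linear gradient running from 50 to 65. The block layout of the thermal cycler is not stated in these notes, so the 12-column assumption and the function name are hypothetical; real gradient blocks are also not perfectly linear. A minimal sketch of the interpolation:

```python
def gradient_temps(low, high, columns=12):
    """Linearly interpolate per-column temperatures across a gradient block.

    columns=12 is an assumption about the block layout.
    """
    step = (high - low) / (columns - 1)
    return [round(low + i * step, 1) for i in range(columns)]


temps = gradient_temps(50, 65)
# Columns 3 to 10 (1-based) correspond to the wells used in the table.
print(temps[2:10])
```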
-<gallery>
-File:17010301.jpeg|LIPshort01-04
-</gallery>
-2
-4
-LIPshort1 -> Loading error
-LIPshort4 -> Only the samples amplified at 55.5~59.5 were loaded
-<gallery>
-File:17010302.jpg|Caption2
-</gallery>
-6
-8
-Estimated Tm:55.5~56.8
-== 2017.1.4~2017.1.6 ==
-Farm trip
-===LIP Gradient PCR===
-~65 °C
-% agarose gel, 100 V, 1 h
-<gallery>
-File:2017010401.jpg|LIP01,09,10,11
-File:2017010402.jpg|LIP12,13,14,15
-</gallery>
-All good
-<gallery>
-File:2017010601.jpg|LIP16,17,18,19
-File:2017010602.jpg|LIP20,21,22,24
-</gallery>
-The lower band of LIP16 is our target
-LIP20 showed no band
-== 2017.1.9 ==
-===AMORE(GK, IT182932, IT109098)===
-==== VCF parsing ====
- python ~/py/Reseq/filter.vcf.by.phred.hetero.depth.py Variant.vcf.SNP 5 > Variant.vcf.SNP.filtered.d5.Q30.homo
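The filtering script itself is not shown in these notes. Judging only from the output name (`d5`, `Q30`, `homo`), it appears to keep SNPs with depth of at least 5, QUAL of at least 30, and homozygous calls. The following is a minimal sketch under those assumptions; the function name, the per-sample application of the depth cutoff, and the rejection of missing genotypes are all hypothetical choices.

```python
def passes_filter(vcf_line, min_depth=5, min_qual=30.0):
    """Return True if a VCF record passes QUAL, depth and homozygosity checks.

    Assumes FORMAT contains GT and DP for every sample.
    """
    fields = vcf_line.rstrip("\n").split("\t")
    qual = fields[5]
    if qual == "." or float(qual) < min_qual:
        return False
    keys = fields[8].split(":")
    gt_index, dp_index = keys.index("GT"), keys.index("DP")
    for sample in fields[9:]:
        values = sample.split(":")
        alleles = values[gt_index].replace("|", "/").split("/")
        if "." in alleles or len(set(alleles)) != 1:
            return False  # missing or heterozygous genotype
        depth = values[dp_index]
        if not depth.isdigit() or int(depth) < min_depth:
            return False
    return True


def filter_vcf(lines, min_depth=5, min_qual=30.0):
    """Yield header lines unchanged and only the records that pass."""
    for line in lines:
        if line.startswith("#") or passes_filter(line, min_depth, min_qual):
            yield line
```

The `.diff` suffix used in later steps suggests a further filter for sites where the samples differ from each other; that step is not covered here.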
-==== Typing ====
- cat Variant.vcf.SNP.filtered.d5.Q30.homo.diff | python ~/py/Reseq/[Reseq]SNP_counter.py /alima9002/ref/Gmax/annotation/Gmax_275_Wm82.a2.v1.gene_exons.gff3 /alima9002/ref/Gmax/assembly/Gmax_275_v2.0.fa 30 > Variant.vcf.SNP.filtered.d5.Q30.homo.diff.type
-===Jatropha OrthoMcl===
-Retrieve complete pep only for OrthoMcl
-== 2017.1.10 ==
-===AMORE(GK, IT182932, IT109098)===
-==== filtering SNPs on homologous ====
- python ret_homolog_only.py Variant.vcf.SNP.filtered.d5.Q30.homo.diff.type homologs.txt > Variant.vcf.SNP.filtered.d5.Q30.homo.diff.type.homologs.only
-==== Syn or Nonsyn typing ====
- python ~/py/Reseq/\[Reseq\]det_syn.py /alima9002/ref/Gmax/annotation/Gmax_275_Wm82.a2.v1.gene_exons.gff3 /alima9002/ref/Gmax/assembly/Gmax_275_v2.0.fa Variant.vcf.SNP.filtered.d5.Q30.homo.diff.type.homologs.only.CDS > Variant.vcf.SNP.filtered.d5.Q30.homo.diff.type.homologs.only.CDS.SynNonsyn
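The `det_syn.py` script is not reproduced in these notes. The core of any synonymous/nonsynonymous call is to translate the reference codon and the codon carrying the alternate base, then compare the amino acids. The sketch below shows only that step, assuming the codon is already extracted in frame on the coding strand; the function name and the "Syn"/"Nonsyn" labels are hypothetical, and locating the codon from the GFF3 and reference FASTA is omitted.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_snp(ref_codon, pos_in_codon, alt_base):
    """Classify a coding SNP as "Syn" or "Nonsyn".

    ref_codon: reference codon on the coding strand.
    pos_in_codon: 0-based position of the SNP within the codon.
    alt_base: alternate base on the coding strand.
    """
    ref_codon = ref_codon.upper()
    alt_codon = (
        ref_codon[:pos_in_codon] + alt_base.upper() + ref_codon[pos_in_codon + 1:]
    )
    same = CODON_TABLE[ref_codon] == CODON_TABLE[alt_codon]
    return "Syn" if same else "Nonsyn"
```

For genes on the minus strand, both the codon and the alternate base would need to be reverse-complemented before calling `classify_snp`.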
-===Lactuca Indica Cdhit===
-  /alima9002/program/cd-hit-v4.6.4-2015-0603/cd-hit -i L.Trinity.fasta -o L.Trinity.fasta.cdhit -T 10 -M 10000
-== 2017.1.11==
-Farm trip
-== 2017.1.12==
-===Amore===
-====Make SNP tables====
-====INDEL analysis using snpEff====
- java -jar /alima9002/program/snpEff/snpEff.jar ann -c /alima9002/program/snpEff/snpEff.config -ud 1000 gmax275 Variant.vcf.INDEL > Variant.vcf.INDEL.snpEff
-==== homologs filtering using annotation file ====
-==== VCF filter by homologs retrieved by ann file ====
- python ret_homolog_only.py Variant.vcf.SNP.filtered.d5.Q30.homo.diff.type homologs.by.ann.txt > Variant.vcf.SNP.filtered.d5.Q30.homo.diff.type.hom.ann
-==== Determination Synonymous ====
- python ~/py/Reseq/\[Reseq\]det_syn.py /alima9002/ref/Gmax/annotation/Gmax_275_Wm82.a2.v1.gene_exons.gff3 /alima9002/ref/Gmax/assembly/Gmax_275_v2.0.fa Variant.vcf.SNP.filtered.d5.Q30.homo.diff.type.hom.ann.CDS > Variant.vcf.SNP.filtered.d5.Q30.homo.diff.type.hom.ann.CDS.SynNonsyn
-===Jatropha KaKs===
- python parsing.all.kaks.py all.kaks > all.kaks.ksonly
-====Drawing graph using R====
- require(ggplot2)
- data<-read.table("all.kaks.ksonly",header=F)
- colnames(data)<-c("Species","Ks")
- Ks <- data$Ks
- Species <- data$Species
- ggplot(data,aes(Ks,colour=Species))+geom_freqpoly(binwidth=0.01)+scale_x_continuous(limits=c(0,0.8))
-==2017.1.13==
-Farm trip (pod lwt)
-==2017.1.16==
-===Jatropha Ks value using transcriptome===
-Jatropha species transcriptomes that were not clustered with cd-hit were used for TBLASTX
- tblastx -db Jct.cds.fa.complete.fa -query Jgo.cds.fa.complete.fa -evalue 1e-10 -outfmt 6 -num_alignments 5 -out Jct.tblastx.nocdhit.Jgo.1e-10.out6 -num_threads 8
-===Amore snpEff===
-Split the result so that each effect is on its own line
- perl /alima9002/program/snpEff/scripts/vcfEffOnePerLine.pl Variant.vcf.INDEL.snpEff
-===Amore SNP typing===
-To check the IT182932 mapping depth, SNP typing was performed on the unfiltered VCF file
-  python ~/py/Reseq/\[Reseq\]SNP_counter.py Variant.vcf.SNP /alima9002/ref/Gmax/assembly/Gmax_275_v2.0.fa 30 > Variant.vcf.SNP.type
-==2017.1.17==
-===OrthoMcl for Jat Organ===
- blastp -db goodProteins.fasta -query goodProteins.fasta -outfmt 6 -out goodProteins.fasta.allvall.jat.organ -num_threads 15 -evalue 1e-5 -seg yes -soft_masking true -max_target_seqs 999999999
-===Drawing Venn diagram using JatSp orthomcl result===
- D:\Lab work\Jatropha\JatSp_Orthomcl_Venn
-===Re-SNP typing of amore study===
-Found that the previous command was wrong
- cat Variant.vcf.SNP | python ~/py/Reseq/\[Reseq\]SNP_counter.py /alima9002/ref/Gmax/annotation/Gmax_275_Wm82.a2.v1.gene_exons.gff3 /alima9002/ref/Gmax/assembly/Gmax_275_v2.0.fa 30 > Variant.vcf.SNP.type
-===snpEff parsing===
-Found that the previous command was wrong
- cat Variant.vcf.INDEL.snpEff | 'perl' /alima9002/program/snpEff/scripts/vcfEffOnePerLine.pl > Variant.vcf.INDEL.snpEff.parsed
-===snpEff typing===
- python Indel.Typing.Using.snpEff.Result.py Variant.vcf.INDEL.snpEff.parsed > Variant.vcf.INDEL.snpEff.parsed.tp
-===filtering homologs only on snpEff typing results===
- python get_INDEL_on_homolog.py Variant.vcf.INDEL.snpEff.parsed.tp homologs.by.ann.txt > Variant.vcf.INDEL.snpEff.parsed.tp.hom.ann
- sort -u Variant.vcf.INDEL.snpEff.parsed.tp.hom.ann > Variant.vcf.INDEL.snpEff.parsed.tp.hom.ann.sorted
-===count INDEL type===
- python count.indel.component.py Variant.vcf.INDEL.snpEff.parsed.tp.hom.ann.sorted
-===Discussion===
-For the organs, put the 3 replicates into OrthoMCL separately and count the groups present in all three
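The counting idea in this discussion item can be sketched as follows, assuming the standard OrthoMCL `groups.txt` layout in which each line reads `group_id: taxon|seq_id taxon|seq_id ...`. The replicate taxon codes (`Lf1`, `Lf2`, `Lf3`) and the function name are hypothetical; this is an illustration and not the procedure that was actually run.

```python
def count_groups_in_all(lines, replicates):
    """Count ortholog groups that contain at least one member from every replicate.

    lines: iterable of OrthoMCL groups.txt lines.
    replicates: taxon codes of the replicates, e.g. ["Lf1", "Lf2", "Lf3"].
    """
    required = set(replicates)
    shared = 0
    for line in lines:
        line = line.strip()
        if not line:
            continue
        _, _, members = line.partition(":")
        taxa = {member.split("|", 1)[0] for member in members.split()}
        if required <= taxa:
            shared += 1
    return shared


example = [
    "grp1000: Lf1|a Lf2|b Lf3|c",
    "grp1001: Lf1|d Lf2|e",
    "grp1002: Lf1|f Lf3|g Lf2|h Lf2|i",
]
print(count_groups_in_all(example, ["Lf1", "Lf2", "Lf3"]))
```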
-==2017.1.18==
-===ret SNP sets(1 missing GT is permitted)===
- python filter.vcf.by.phred.hetero.depth.py.only.one.missing.permitted.py
-===Amore SNP typing(1 missing GT is permitted)===
- cat Variant.vcf.SNP.filtered.d5.Q30.missing1.diff | python ~/py/Reseq/\[Reseq\]SNP_counter.py /alima9002/ref/Gmax/annotation/Gmax_275_Wm82.a2.v1.gene_exons.gff3 /alima9002/ref/Gmax/assembly/Gmax_275_v2.0.fa 30 > Variant.vcf.SNP.filtered.d5.Q30.missing1.diff.type
-==2017.1.19==
-===LIP related figure===
-Figures for each insertion type predicted by LIP
- python ~/py/GeneInfo2Figure_v1.1.py gene_list.gi Glyma.08G.123500.gi2f.config > gene_list.gi.svg
-Afterwards, manually edited in Illustrator
-==2017.1.23==
-===finishing SNP, INDEL typing of amore study===
-===Jat species with no cdhit OrthoMcl ===
- follow Orthomcl manual
-==2017.1.31==
-===Drawing size distribution of predicted insertion ===
-R with ggplot2 ver. 2.2
- ggplot(SV, aes(Size))+geom_histogram(aes(fill=Program),binwidth=200,position="dodge")+scale_y_continuous("Counts")+coord_cartesian(ylim=c(400,2000),expand=FALSE)

Difference between revisions of "2017 Taeyoung Lab note"

Revision as of 05:01, 6 February 2017
