Difference between revisions of "2017 Taeyoung Lab note"
Revision as of 05:35, 18 January 2017
Ongoing
1. TBLASTX using Jat Species Transcriptome
2017.1.2
Jatropha transcriptome Trinity assembly
raw data : 244:/NGS/NGS/JatrophaCurcas/RNA
Jatropha species transcriptome assembly : Jct, Jcu, Jin, Jgo, Jci, Jpo, Jmu, Jma, Jac, Rco (listed in 244:/NGS/NGS/JatrophaCurcas/RNA/list). All done
Jatropha organ transcriptome assembly : Leaf, Root, Stem, Female flower, Male flower, LG, SG, Y, B. All done
Cdhit
193:/data2/alima90/program/cdhit/cd-hit -i Y.cds.fa -M 10000 -o Y.cds.fa.cdhit -T 5
193:/data2/alima90/program/cdhit/cd-hit -i LG.cds.fa -M 10000 -o LG.cds.fa.cdhit -T 5
UV GBS mapping (w/ joinmap)
244:python vcf.parsing.for.mandf.py UV.vcf.SNPonly 3 0.01 except_sample.txt > UV.vcf.SNPonly.except.LowDepthSample.d3.Q30.m0.1.loc
The loc file was manually edited in Excel.
The genetic map was constructed using JoinMap 4.1.
KaKs calculation using scripts provided by MCscanX
KaKs calculation between Jatropha species
244:python /alima9002/63_backup/Jatropha/CDS/run.kaks.py
Large Insertion Prediction
LIP short primer preparation
Primer info
>LIP01short_F
AACTGAACACAGACAATGAA
>LIP01short_R
CAATTTATACACCACCTTAC
>LIP02short_F
CTCTTTGTATTTGGTGACAA
>LIP02short_R
GTATTAGCAGCTTTTGCTTA
>LIP03short_F
AATTGTAAGACATATCCCTC
>LIP03short_R
CTGCCCCACTAATAATTAAT
>LIP04short_F
TAAAAACAGAACTTGTCCAC
>LIP04short_R
ATCACAAGACTGAACAAGTA
>LIP05short_F
ATTGACATAAGGTTGCATAG
>LIP05short_R
CCTTAGCTCTTTTCTTTTGT
>LIP06short_F
GAAGGAAGGAAGCAATTATT
>LIP06short_R
TGACTTACCCTTTTTACCTT
>LIP07short_F
CACATGTTTGTCACTCTAAT
>LIP07short_R
GAAGTGAGGCCTAAAATAAA
>LIP08short_F
GAATGTATTGTCTTTGATCC
>LIP08short_R
GTTGGATTTTGTTCTTTCCA
>LIP09short_F
AGAAAAACGTCGATACCAAA
>LIP09short_R
CGATTTAGTAACCTTAGAAC
>LIP10short_F
ATCTTCAAAATGTCTCTAGG
>LIP10short_R
TACAGATATTCTTAGGCAGT
>LIP11short_F
TGTAACTCTCAATTAAGCAG
>LIP11short_R
ATCTTTCTGTAAGCACTTAG
>LIP12short_F
CTAGAACCGATTTGTTCAAA
>LIP12short_R
GCAGTTGTTTTGGATTAACA
>LIP13short_F
AAAGAGAAAGCAGAGAAATC
>LIP13short_R
ATGTATAGATTGGAGGAAAG
>LIP14short_F
ATTATGGAAAGGAATTGGAG
>LIP14short_R
CCATGTCTAGTATTTACTCA
>LIP15short_F
TTAATGACTGATCGTTAGTG
>LIP15short_R
CGGGAGTTATGAAAAATAGT
>LIP24short_F
AGTATGGTTTCAACATATGG
>LIP24short_R
GATATGAAGTTGACATGCTA
>LIP16short_F
ATTTAAAAGCTCGTAACTCC
>LIP16short_R
GGATAAGCAATTACAACACA
>LIP17short_F
CCCAAATTTTTAAATGCACC
>LIP17short_R
CTCTTGGAACGTGAAAAATT
>LIP18short_F
TTTTCTAGAAGGATTTGTGC
>LIP18short_R
CCATGCAAACCCAATTTTAA
>LIP19short_F
GTAAAACTAAGGTTGAGCTA
>LIP19short_R
CCACAAGTCACAACAATTTA
>LIP20short_F
TTATTTGTATGTTGGAGACC
>LIP20short_R
CATGGTATATAGGTTTAGGT
>LIP21short_F
CATAGAGAGTTTTGGATTAC
>LIP21short_R
AAAGAACTGATAGTGTCATG
>LIP22short_F
ATATGTACATGTATGGTGTG
>LIP22short_R
CCTAAATCTAGCAGAAGATT
>LIP23short_F
ATGTATGGAGAAATGGGTTA
>LIP23short_R
ATATAGAAATGGAGGTTGCT
(listed in BACKUP(J:)/박사/Indel Candidate/LIP_short_primer.fa)
Primer dilution
2017. 1. 3
AMORE work
python ~/py/ret_fasta_by_gene_name.py /alima9002/ref/Athaliana/annotation/Athaliana_167_TAIR10.cds.fa gene_list.txt > gene_list.txt.fa
blastp -db /alima9002/ref/Gmax/annotation/Gmax_275_Wm82.a2.v1.protein.fa -query gene_list.txt.pep.fa -evalue 1e-5 -num_alignments 1 -outfmt 6 -num_threads 6 -out gene_list.txt.pep.fablastp.Gm275.1e-5.out6
Homologs with Ath
Glyma.08G014900 Glyma.05G208300 Glyma.20G001900 Glyma.03G176600 Glyma.19G177400 Glyma.03G262600 Glyma.06G202300 Glyma.05G021800 Glyma.17G077700 Glyma.05G022000 Glyma.09G234900 Glyma.19G025000 Glyma.10G224000 Glyma.02G081000 Glyma.20G167800 Glyma.14G072700 Glyma.17G252200 Glyma.17G050500 Glyma.07G038000 Glyma.13G109100 Glyma.16G007200 Glyma.19G105100 Glyma.09G283800 Glyma.20G172700 Glyma.02G076300
SNP typing among IT182932, IT109098, Hwangkeum-Kong
1.Read mapping using bwa mem with default options (/home/hayasen/Workspace/Glycine/GlycineMax/ver275/Reads/)
2.mpileup
samtools mpileup -f /alima9002/ref/Gmax/assembly/Gmax_275_v2.0.fa -v -t DP,AD,ADF,ADR,SP,INFO/AD,INFO/ADF,INFO/ADR -u -b bam_list | bcftools call -v -m -O v > Variant.vcf
LIP short primer gradient PCR
Primers 1~8 were tested with CS-12
Gradient: lower temperature 50, upper temperature 65
Samples were loaded on a 1% agarose gel and run at 100 V for 1 hour.
52.7 | 54.1 | 55.5 | 56.8 | 59.5 | 60.9 | 62.3 |
1 2
3 4
LIPshort1 -> error during loading
LIPshort4 -> only the samples amplified at 55.5~59.5 were loaded
5 6
7 8
Estimated Tm:55.5~56.8
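As a rough cross-check on the gradient result, a theoretical primer Tm can be approximated with the Wallace rule, Tm = 2*(A+T) + 4*(G+C). This is only a minimal sketch: the Wallace rule is a coarse estimate for short oligos and is not necessarily how the range above was obtained, and the function name wallace_tm is hypothetical. The two sequences are the LIP01short primers from the primer list.

```python
def wallace_tm(primer: str) -> int:
    """Approximate Tm in degrees Celsius using the Wallace rule."""
    seq = primer.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


# LIP01short primer pair, copied from the primer list in this note.
primers = {
    "LIP01short_F": "AACTGAACACAGACAATGAA",
    "LIP01short_R": "CAATTTATACACCACCTTAC",
}

for name, seq in primers.items():
    print(name, len(seq), wallace_tm(seq))
```

The empirical gradient PCR result should take priority over this estimate when choosing the annealing temperature.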
2017.1.4~2017.1.6
Farm trip
LIP Gradient PCR
50~65 degrees Celsius
1% agarose gel, 100 V, 1 h
All good
LIP16's lower band is our target
LIP20 did not show a band
2017.1.9
AMORE(GK, IT182932, IT109098)
VCF parsing
python ~/py/Reseq/filter.vcf.by.phred.hetero.depth.py Variant.vcf.SNP 5 > Variant.vcf.SNP.filtered.d5.Q30.homo
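The filter script itself is not shown in this note. Judging only from the output name (d5.Q30.homo), it presumably keeps SNPs with per-sample depth >= 5, QUAL >= 30, and homozygous calls in every sample. The sketch below works under that assumption; the function names passes and filter_vcf are hypothetical, and it assumes GT and DP are present in the FORMAT column.

```python
def passes(line, min_depth=5, min_qual=30.0):
    """Return True if a VCF record passes the assumed QUAL/depth/homozygosity filter."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 10:          # need at least one sample column
        return False
    try:
        qual = float(fields[5])   # QUAL column
    except ValueError:            # QUAL may be "."
        return False
    if qual < min_qual:
        return False
    keys = fields[8].split(":")   # FORMAT column, e.g. GT:DP
    if "GT" not in keys or "DP" not in keys:
        return False
    gt_i, dp_i = keys.index("GT"), keys.index("DP")
    for sample in fields[9:]:
        values = sample.split(":")
        if len(values) <= max(gt_i, dp_i):
            return False
        alleles = values[gt_i].replace("|", "/").split("/")
        # Reject missing calls and heterozygous calls.
        if "." in alleles or len(set(alleles)) != 1:
            return False
        if not values[dp_i].isdigit() or int(values[dp_i]) < min_depth:
            return False
    return True


def filter_vcf(lines, min_depth=5, min_qual=30.0):
    """Yield header lines unchanged and only the records that pass the filter."""
    for line in lines:
        if line.startswith("#") or passes(line, min_depth, min_qual):
            yield line
```

Usage would be along the lines of writing every line of filter_vcf(open("Variant.vcf.SNP")) to the output file.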
Typing
cat Variant.vcf.SNP.filtered.d5.Q30.homo.diff | python ~/py/Reseq/[Reseq]SNP_counter.py /alima9002/ref/Gmax/annotation/Gmax_275_Wm82.a2.v1.gene_exons.gff3 /alima9002/ref/Gmax/assembly/Gmax_275_v2.0.fa 30 > Variant.vcf.SNP.filtered.d5.Q30.homo.diff.type
Jatropha OrthoMcl
Retrieve complete pep only for OrthoMcl
2017.1.10
AMORE(GK, IT182932, IT109098)
Filtering SNPs in homologs
python ret_homolog_only.py Variant.vcf.SNP.filtered.d5.Q30.homo.diff.type homologs.txt > Variant.vcf.SNP.filtered.d5.Q30.homo.diff.type.homologs.only
Syn or Nonsyn typing
python ~/py/Reseq/\[Reseq\]det_syn.py /alima9002/ref/Gmax/annotation/Gmax_275_Wm82.a2.v1.gene_exons.gff3 /alima9002/ref/Gmax/assembly/Gmax_275_v2.0.fa Variant.vcf.SNP.filtered.d5.Q30.homo.diff.type.homologs.only.CDS > Variant.vcf.SNP.filtered.d5.Q30.homo.diff.type.homologs.only.CDS.SynNonsyn
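The internals of det_syn.py are not recorded here. As a minimal sketch of the general idea, a SNP inside a CDS can be classified by translating the reference and alternate codons with the standard genetic code and comparing the amino acids. The function name classify_snp is hypothetical, and the sketch assumes a spliced, in-frame, plus-strand CDS with a 0-based SNP position.

```python
from itertools import product

# Standard genetic code (NCBI table 1), codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_snp(cds: str, pos0: int, alt: str) -> str:
    """Classify a SNP at 0-based CDS position pos0 as synonymous or nonsynonymous."""
    cds = cds.upper()
    start = pos0 - pos0 % 3                  # first base of the affected codon
    ref_codon = cds[start:start + 3]
    if len(ref_codon) != 3:
        raise ValueError("position falls in an incomplete codon")
    offset = pos0 - start
    alt_codon = ref_codon[:offset] + alt.upper() + ref_codon[offset + 1:]
    same = CODON_TABLE[ref_codon] == CODON_TABLE[alt_codon]
    return "synonymous" if same else "nonsynonymous"


print(classify_snp("ATGGCTTAA", 5, "C"))   # GCT -> GCC
print(classify_snp("ATGGCTTAA", 3, "A"))   # GCT -> ACT
```

For minus-strand genes the CDS and the alternate base would need to be reverse-complemented first, which this sketch does not do.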
Lactuca Indica Cdhit
/alima9002/program/cd-hit-v4.6.4-2015-0603/cd-hit -i L.Trinity.fasta -o L.Trinity.fasta.cdhit -T 10 -M 10000
2017.1.11
Farm trip
2017.1.12
Amore
Make SNP tables
INDEL analysis using snpEff
java -jar /alima9002/program/snpEff/snpEff.jar ann -c /alima9002/program/snpEff/snpEff.config -ud 1000 gmax275 Variant.vcf.INDEL > Variant.vcf.INDEL.snpEff
homologs filtering using annotation file
VCF filter by homologs retrieved by ann file
python ret_homolog_only.py Variant.vcf.SNP.filtered.d5.Q30.homo.diff.type homologs.by.ann.txt > Variant.vcf.SNP.filtered.d5.Q30.homo.diff.type.hom.ann
Synonymous/nonsynonymous determination
python ~/py/Reseq/\[Reseq\]det_syn.py /alima9002/ref/Gmax/annotation/Gmax_275_Wm82.a2.v1.gene_exons.gff3 /alima9002/ref/Gmax/assembly/Gmax_275_v2.0.fa Variant.vcf.SNP.filtered.d5.Q30.homo.diff.type.hom.ann.CDS > Variant.vcf.SNP.filtered.d5.Q30.homo.diff.type.hom.ann.CDS.SynNonsyn
Jatropha KaKs
python parsing.all.kaks.py all.kaks > all.kaks.ksonly
Drawing graph using R
require(ggplot2)
data<-read.table("all.kaks.ksonly",header=F)
colnames(data)<-c("Species","Ks")
Ks <- data$Ks
Species <- data$Species
ggplot(data,aes(Ks,colour=Species))+geom_freqpoly(binwidth=0.01)+scale_x_continuous(limits=c(0,0.8))
2017.1.13
Farm trip (pod lwt)
2017.1.16
Jatropha Ks value using transcriptome
Jatropha species CDS that were not clustered by cd-hit were used for TBLASTX
tblastx -db Jct.cds.fa.complete.fa -query Jgo.cds.fa.complete.fa -evalue 1e-10 -outfmt 6 -num_alignments 5 -out Jct.tblastx.nocdhit.Jgo.1e-10.out6 -num_threads 8
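With -num_alignments 5 the tabular output can contain several subjects (and several HSPs) per query. A common next step, assumed here and not stated in this note, is to keep the highest-bitscore hit per query before pairing sequences for Ks. The sketch relies on the default -outfmt 6 column order (qseqid, sseqid, ..., evalue, bitscore); the function name best_hits is hypothetical.

```python
def best_hits(lines):
    """Return {query: (subject, bitscore)} keeping the top-scoring hit per query."""
    best = {}
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        # Default outfmt 6: column 1 = qseqid, column 2 = sseqid, column 12 = bitscore.
        query, subject, bitscore = fields[0], fields[1], float(fields[11])
        if query not in best or bitscore > best[query][1]:
            best[query] = (subject, bitscore)
    return best


if __name__ == "__main__":
    import sys

    # Example: python best_hits.py Jct.tblastx.nocdhit.Jgo.1e-10.out6
    if len(sys.argv) > 1:
        with open(sys.argv[1]) as handle:
            for query, (subject, score) in best_hits(handle).items():
                print(query, subject, score, sep="\t")
```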
Amore snpEff
Split the result into one effect per line
perl /alima9002/program/snpEff/scripts/vcfEffOnePerLine.pl Variant.vcf.INDEL.snpEff
Amore SNP typing
To check the IT182932 mapping depth, SNP typing was performed on the unfiltered VCF file
python ~/py/Reseq/\[Reseq\]SNP_counter.py Variant.vcf.SNP /alima9002/ref/Gmax/assembly/Gmax_275_v2.0.fa 30 > Variant.vcf.SNP.type
2017.1.17
OrthoMcl for Jat Organ
blastp -db goodProteins.fasta -query goodProteins.fasta -outfmt 6 -out goodProteins.fasta.allvall.jat.organ -num_threads 15 -evalue 1e-5 -seg yes -soft_masking true -max_target_seqs 999999999
Drawing Venn diagram using JatSp orthomcl result
D:\Lab work\Jatropha\JatSp_Orthomcl_Venn
Re-SNP typing of amore study
Found that the command was wrong
cat Variant.vcf.SNP | python ~/py/Reseq/[Reseq]SNP_counter.py /alima9002/ref/Gmax/annotation/Gmax_275_Wm82.a2.v1.gene_exons.gff3 /alima9002/ref/Gmax/assembly/Gmax_275_v2.0.fa 30 > Variant.vcf.SNP.type