Difference between revisions of "2017 Taeyoung Lab note"

From Crop Genomics Lab.
(Re-SNP typing of amore study)
 
(20 intermediate revisions by one user not shown)
Line 1: Line 1:
== Ongoing ==
+
[[2017 Jan Taeyoung Lab note]]
  
1. TBLASTX using Jat Species Transcriptome
+
[[2017 Feb Taeyoung Lab note]]
  
== 2017.1.2 ==
+
[[2017 Mar Taeyoung Lab note]]
==== Jatropha transcriptome Trinity assembly ====
+
raw data : 244:/NGS/NGS/JatrophaCurcas/RNA
+
Jatropha species transcriptome assembly: Jct, Jcu, Jin, Jgo, Jci, Jpo, Jmu, Jma, Jac, Rco (listed in 244:/NGS/NGS/JatrophaCurcas/RNA/list). All done.
+
Jatropha organ transcriptome assembly: Leaf, Root, Stem, Female flower, Male flower, LG, SG, Y, B. All done.
+
  
==== Cdhit ====
+
[[2017 May Taeyoung Lab note]]
193:/data2/alima90/program/cdhit/cd-hit -i Y.cds.fa -M 10000 -o Y.cds.fa.cdhit -T 5
+
193:/data2/alima90/program/cdhit/cd-hit -i LG.cds.fa -M 10000 -o LG.cds.fa.cdhit -T 5
+
  
==== UV GBS mapping (w/ joinmap) ====
+
[[2017 Sep Taeyoung Lab note]]
244:python vcf.parsing.for.mandf.py UV.vcf.SNPonly 3 0.01 except_sample.txt > UV.vcf.SNPonly.except.LowDepthSample.d3.Q30.m0.1.loc
+
The loc file was manually edited in Excel.
+
The genetic map was constructed using JoinMap 4.1.
+
  
==== KaKs calculation using scripts provided by MCscanX ====
+
[[2017 Oct Taeyoung Lab note]]
  
'''KaKs calculation between Jatropha species'''
+
[[2017 Nov Taeyoung Lab note]]
244 :python /alima9002/63_backup/Jatropha/CDS/run.kaks.py
+
  
==== Large Insertion Prediction ====
+
[[2017 Dec Taeyoung Lab note]]
===== LIP short primer preparation =====
+
'''Primer info'''
+
>LIP01short_F
+
AACTGAACACAGACAATGAA
+
>LIP01short_R
+
CAATTTATACACCACCTTAC
+
>LIP02short_F
+
CTCTTTGTATTTGGTGACAA
+
>LIP02short_R
+
GTATTAGCAGCTTTTGCTTA
+
>LIP03short_F
+
AATTGTAAGACATATCCCTC
+
>LIP03short_R
+
CTGCCCCACTAATAATTAAT
+
>LIP04short_F
+
TAAAAACAGAACTTGTCCAC
+
>LIP04short_R
+
ATCACAAGACTGAACAAGTA
+
>LIP05short_F
+
ATTGACATAAGGTTGCATAG
+
>LIP05short_R
+
CCTTAGCTCTTTTCTTTTGT
+
>LIP06short_F
+
GAAGGAAGGAAGCAATTATT
+
>LIP06short_R
+
TGACTTACCCTTTTTACCTT
+
>LIP07short_F
+
CACATGTTTGTCACTCTAAT
+
>LIP07short_R
+
GAAGTGAGGCCTAAAATAAA
+
>LIP08short_F
+
GAATGTATTGTCTTTGATCC
+
>LIP08short_R
+
GTTGGATTTTGTTCTTTCCA
+
>LIP09short_F
+
AGAAAAACGTCGATACCAAA
+
>LIP09short_R
+
CGATTTAGTAACCTTAGAAC
+
>LIP10short_F
+
ATCTTCAAAATGTCTCTAGG
+
>LIP10short_R
+
TACAGATATTCTTAGGCAGT
+
>LIP11short_F
+
TGTAACTCTCAATTAAGCAG
+
>LIP11short_R
+
ATCTTTCTGTAAGCACTTAG
+
>LIP12short_F
+
CTAGAACCGATTTGTTCAAA
+
>LIP12short_R
+
GCAGTTGTTTTGGATTAACA
+
>LIP13short_F
+
AAAGAGAAAGCAGAGAAATC
+
>LIP13short_R
+
ATGTATAGATTGGAGGAAAG
+
>LIP14short_F
+
ATTATGGAAAGGAATTGGAG
+
>LIP14short_R
+
CCATGTCTAGTATTTACTCA
+
>LIP15short_F
+
TTAATGACTGATCGTTAGTG
+
>LIP15short_R
+
CGGGAGTTATGAAAAATAGT
+
>LIP24short_F
+
AGTATGGTTTCAACATATGG
+
>LIP24short_R
+
GATATGAAGTTGACATGCTA
+
>LIP16short_F
+
ATTTAAAAGCTCGTAACTCC
+
>LIP16short_R
+
GGATAAGCAATTACAACACA
+
>LIP17short_F
+
CCCAAATTTTTAAATGCACC
+
>LIP17short_R
+
CTCTTGGAACGTGAAAAATT
+
>LIP18short_F
+
TTTTCTAGAAGGATTTGTGC
+
>LIP18short_R
+
CCATGCAAACCCAATTTTAA
+
>LIP19short_F
+
GTAAAACTAAGGTTGAGCTA
+
>LIP19short_R
+
CCACAAGTCACAACAATTTA
+
>LIP20short_F
+
TTATTTGTATGTTGGAGACC
+
>LIP20short_R
+
CATGGTATATAGGTTTAGGT
+
>LIP21short_F
+
CATAGAGAGTTTTGGATTAC
+
>LIP21short_R
+
AAAGAACTGATAGTGTCATG
+
>LIP22short_F
+
ATATGTACATGTATGGTGTG
+
>LIP22short_R
+
CCTAAATCTAGCAGAAGATT
+
>LIP23short_F
+
ATGTATGGAGAAATGGGTTA
+
>LIP23short_R
+
ATATAGAAATGGAGGTTGCT
+
(listed in BACKUP(J:)/박사/Indel Candidate/LIP_short_primer.fa)
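
As a rough cross-check before the gradient PCR, the melting temperature of each 20-mer can be approximated from its base composition. The sketch below uses the Wallace rule, Tm = 2(A+T) + 4(G+C), which is only a crude estimate for oligos of this length. The function names are illustrative and are not part of the lab scripts.

```python
def read_fasta(lines):
    """Parse FASTA-formatted lines into a {name: sequence} dict."""
    records = {}
    name = None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:]
            records[name] = ""
        elif name is not None:
            records[name] += line.upper()
    return records


def gc_fraction(seq):
    """Fraction of G and C bases in the primer."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def wallace_tm(seq):
    """Wallace-rule estimate: 2 degrees per A/T plus 4 degrees per G/C."""
    seq = seq.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


if __name__ == "__main__":
    primers = read_fasta([">LIP01short_F", "AACTGAACACAGACAATGAA"])
    for name, seq in primers.items():
        print(name, round(gc_fraction(seq), 2), wallace_tm(seq))
```

The annealing temperature still has to be confirmed empirically, which is what the gradient PCR entries are for.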
+
 
+
Primer dilution
+
 
+
== 2017. 1. 3 ==
+
===AMORE work===
+
 
+
python ~/py/ret_fasta_by_gene_name.py  /alima9002/ref/Athaliana/annotation/Athaliana_167_TAIR10.cds.fa gene_list.txt > gene_list.txt.fa
+
blastp -db /alima9002/ref/Gmax/annotation/Gmax_275_Wm82.a2.v1.protein.fa -query gene_list.txt.pep.fa -evalue 1e-5 -num_alignments 1 -outfmt 6 -num_threads 6 -out gene_list.txt.pep.fablastp.Gm275.1e-5.out6
+
 
+
'''Homolog with Ath'''
+
Glyma.08G014900
+
Glyma.05G208300
+
Glyma.20G001900
+
Glyma.03G176600
+
Glyma.19G177400
+
Glyma.03G262600
+
Glyma.06G202300
+
Glyma.05G021800
+
Glyma.17G077700
+
Glyma.05G022000
+
Glyma.09G234900
+
Glyma.19G025000
+
Glyma.10G224000
+
Glyma.02G081000
+
Glyma.20G167800
+
Glyma.14G072700
+
Glyma.17G252200
+
Glyma.17G050500
+
Glyma.07G038000
+
Glyma.13G109100
+
Glyma.16G007200
+
Glyma.19G105100
+
Glyma.09G283800
+
Glyma.20G172700
+
Glyma.02G076300
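
A gene list like the one above could be pulled from the tabular BLAST output (`-outfmt 6`, where the second column is the subject ID). The sketch below is an assumption about how that step might look, not the script actually used. In particular, it assumes that subject IDs look like `Glyma.08G014900.1.p` and that the gene ID is the first two dot-separated fields.

```python
def subject_genes(blast_lines):
    """Return unique subject gene IDs, in first-seen order, from outfmt 6 lines."""
    seen = set()
    genes = []
    for line in blast_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        columns = line.split("\t")
        sseqid = columns[1]  # outfmt 6: qseqid, sseqid, pident, ...
        # Assumed ID layout: <prefix>.<gene>.<transcript>.p
        gene = ".".join(sseqid.split(".")[:2])
        if gene not in seen:
            seen.add(gene)
            genes.append(gene)
    return genes


if __name__ == "__main__":
    import sys

    with open(sys.argv[1]) as handle:
        for gene in subject_genes(handle):
            print(gene)
```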
+
 
+
'''SNP typing among IT182932, IT109098, Hwangkeum-Kong'''
+
 
+
1.Read mapping using bwa mem with default options (/home/hayasen/Workspace/Glycine/GlycineMax/ver275/Reads/)
+
 
+
2.mpileup
+
samtools mpileup -f /alima9002/ref/Gmax/assembly/Gmax_275_v2.0.fa -v -t DP,AD,ADF,ADR,SP,INFO/AD,INFO/ADF,INFO/ADR -u -b bam_list | bcftools call -v -m -O v > Variant.vcf
+
 
+
===LIP short primer gradient PCR===
+
Primers 1~8 were tested with CS-12.
+
 
+
Gradient: lower temp is 50, upper temp is 65.
+
 
+
Samples were loaded on a 1% agarose gel and run at 100 V for 1 hour.
+
 
+
{| class="wikitable"
+
|-
+
| 52.7 || 54.1 || 55.5 || 56.8 || 58.2 || 59.5 || 60.9 || 62.3
+
|}
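
The eight temperatures in the table are consistent with columns 3 to 10 of a 12-column block running a linear 50 to 65 gradient. That layout is an assumption: the note does not state the block format, and real thermal cyclers are not perfectly linear. Under that assumption, the per-column temperatures can be sketched as:

```python
def linear_gradient(low, high, n_columns=12):
    """Evenly spaced temperatures from low to high across the block columns."""
    step = (high - low) / (n_columns - 1)
    return [low + step * i for i in range(n_columns)]


if __name__ == "__main__":
    temps = linear_gradient(50, 65)
    # Columns are numbered from 1; columns 3 to 10 are assumed to hold the samples.
    for column, temp in enumerate(temps, start=1):
        print(column, round(temp, 1))
```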
+
 
+
<gallery>
+
File:17010301.jpeg|LIPshort01-04
+
</gallery>
+
1 2
+
 
+
3 4
+
 
+
 
+
LIPshort1 -> an error occurred during loading
+
 
+
LIPshort4 -> only the samples amplified at 55.5~59.5 were loaded
+
 
+
<gallery>
+
File:17010302.jpg|Caption2
+
</gallery>
+
 
+
5 6
+
 
+
7 8
+
 
+
Estimated Tm:55.5~56.8
+
 
+
 
+
== 2017.1.4~2017.1.6 ==
+
 
+
Trip to the farm
+
 
+
===LIP Gradient PCR===
+
50~65 degrees Celsius
+
 
+
1% agarose gel, 100 V, 1 h
+
 
+
 
+
 
+
<gallery>
+
File:2017010401.jpg|LIP01,09,10,11
+
File:2017010402.jpg|LIP12,13,14,15
+
</gallery>
+
 
+
All good
+
 
+
 
+
<gallery>
+
File:2017010601.jpg|LIP16,17,18,19
+
File:2017010602.jpg|LIP20,21,22,24
+
</gallery>
+
 
+
LIP16's lower band is our target
+
 
+
LIP20 did not show a band
+
 
+
== 2017.1.9 ==
+
 
+
===AMORE(GK, IT182932, IT109098)===
+
==== VCF parsing ====
+
python ~/py/Reseq/filter.vcf.by.phred.hetero.depth.py Variant.vcf.SNP 5 > Variant.vcf.SNP.filtered.d5.Q30.homo
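
The filtering script itself is not shown in this note. Judging only from the output suffix `d5.Q30.homo`, it appears to keep SNPs with depth of at least 5, QUAL of at least 30, and homozygous genotypes. The sketch below is one possible reading of that filter, applied to every sample column. The function names and the per-sample rule are assumptions.

```python
def passes_filter(vcf_line, min_depth=5, min_qual=30.0):
    """True if QUAL >= min_qual and every sample is homozygous with DP >= min_depth."""
    fields = vcf_line.rstrip("\n").split("\t")
    qual = fields[5]
    if qual == "." or float(qual) < min_qual:
        return False
    keys = fields[8].split(":")  # FORMAT column, e.g. GT:PL:DP
    if "GT" not in keys or "DP" not in keys:
        return False
    gt_index, dp_index = keys.index("GT"), keys.index("DP")
    for sample in fields[9:]:
        values = sample.split(":")
        if len(values) <= max(gt_index, dp_index):
            return False  # truncated or missing sample data
        alleles = values[gt_index].replace("|", "/").split("/")
        if "." in alleles or len(set(alleles)) != 1:
            return False  # missing or heterozygous call
        depth = values[dp_index]
        if not depth.isdigit() or int(depth) < min_depth:
            return False
    return True


def filter_vcf(lines, min_depth=5, min_qual=30.0):
    """Yield header lines unchanged and only the records that pass the filter."""
    for line in lines:
        if line.startswith("#") or passes_filter(line, min_depth, min_qual):
            yield line
```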
+
==== Typing ====
+
cat Variant.vcf.SNP.filtered.d5.Q30.homo.diff | python ~/py/Reseq/\[Reseq\]SNP_counter.py /alima9002/ref/Gmax/annotation/Gmax_275_Wm82.a2.v1.gene_exons.gff3 /alima9002/ref/Gmax/assembly/Gmax_275_v2.0.fa 30 > Variant.vcf.SNP.filtered.d5.Q30.homo.diff.type
+
 
+
===Jatropha OrthoMcl===
+
Retrieve complete pep only for OrthoMcl
+
 
+
== 2017.1.10 ==
+
===AMORE(GK, IT182932, IT109098)===
+
==== Filtering SNPs on homologs ====
+
python ret_homolog_only.py Variant.vcf.SNP.filtered.d5.Q30.homo.diff.type homologs.txt > Variant.vcf.SNP.filtered.d5.Q30.homo.diff.type.homologs.only
+
==== Syn or Nonsyn typing ====
+
python ~/py/Reseq/\[Reseq\]det_syn.py /alima9002/ref/Gmax/annotation/Gmax_275_Wm82.a2.v1.gene_exons.gff3 /alima9002/ref/Gmax/assembly/Gmax_275_v2.0.fa Variant.vcf.SNP.filtered.d5.Q30.homo.diff.type.homologs.only.CDS > Variant.vcf.SNP.filtered.d5.Q30.homo.diff.type.homologs.only.CDS.SynNonsyn
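
The core of a synonymous/nonsynonymous call is to compare the amino acid encoded by the reference codon with the one encoded after substituting the SNP allele. The sketch below illustrates that comparison with the standard genetic code. It is not the `det_syn.py` script, and it assumes the CDS has already been extracted in transcript orientation with a 0-based SNP position inside it.

```python
from itertools import product

BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
# Standard genetic code, codons enumerated in TCAG order.
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_snp(cds, pos, alt):
    """Classify a substitution at 0-based CDS position pos as synonymous or not."""
    cds = cds.upper()
    start = pos - pos % 3  # first base of the affected codon
    ref_codon = cds[start:start + 3]
    if len(ref_codon) < 3:
        raise ValueError("position falls in an incomplete codon")
    offset = pos - start
    alt_codon = ref_codon[:offset] + alt.upper() + ref_codon[offset + 1:]
    if CODON_TABLE[ref_codon] == CODON_TABLE[alt_codon]:
        return "synonymous"
    return "nonsynonymous"


if __name__ == "__main__":
    print(classify_snp("ATGGCTTAA", 5, "C"))  # GCT -> GCC
    print(classify_snp("ATGGCTTAA", 4, "A"))  # GCT -> GAT
```

For genes on the minus strand, both the CDS and the alternate allele would need to be reverse-complemented before calling the function.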
+
===Lactuca Indica Cdhit===
+
  /alima9002/program/cd-hit-v4.6.4-2015-0603/cd-hit -i L.Trinity.fasta -o L.Trinity.fasta.cdhit -T 10 -M 10000
+
 
+
== 2017.1.11==
+
Trip to the farm
+
 
+
== 2017.1.12==
+
===Amore===
+
====Make SNP tables====
+
====INDEL analysis using snpEff====
+
java -jar /alima9002/program/snpEff/snpEff.jar ann -c /alima9002/program/snpEff/snpEff.config -ud 1000 gmax275 Variant.vcf.INDEL > Variant.vcf.INDEL.snpEff
+
 
+
==== homologs filtering using annotation file ====
+
==== VCF filter by homologs retrieved by ann file ====
+
python ret_homolog_only.py Variant.vcf.SNP.filtered.d5.Q30.homo.diff.type homologs.by.ann.txt > Variant.vcf.SNP.filtered.d5.Q30.homo.diff.type.hom.ann
+
==== Synonymous/nonsynonymous determination ====
+
python ~/py/Reseq/\[Reseq\]det_syn.py /alima9002/ref/Gmax/annotation/Gmax_275_Wm82.a2.v1.gene_exons.gff3 /alima9002/ref/Gmax/assembly/Gmax_275_v2.0.fa Variant.vcf.SNP.filtered.d5.Q30.homo.diff.type.hom.ann.CDS > Variant.vcf.SNP.filtered.d5.Q30.homo.diff.type.hom.ann.CDS.SynNonsyn
+
 
+
===Jatropha KaKs===
+
python parsing.all.kaks.py all.kaks > all.kaks.ksonly
+
====Drawing graph using R====
+
require(ggplot2)
+
data<-read.table("all.kaks.ksonly",header=F)
+
colnames(data)<-c("Species","Ks")
+
Ks <- data$Ks
+
Species <- data$Species
+
ggplot(data,aes(Ks,colour=Species))+geom_freqpoly(binwidth=0.01)+scale_x_continuous(limits=c(0,0.8))
+
 
+
==2017.1.13==
+
Trip to the farm (pod lwt)
+
 
+
==2017.1.16==
+
===Jatropha Ks value using transcriptome===
+
Sequences of the Jatropha species that were not clustered by cd-hit were used for TBLASTX.
+
tblastx -db Jct.cds.fa.complete.fa -query Jgo.cds.fa.complete.fa -evalue 1e-10 -outfmt 6 -num_alignments 5 -out Jct.tblastx.nocdhit.Jgo.1e-10.out6 -num_threads 8
+
 
+
===Amore snpEff===
+
Split the result so that each effect is on its own line
+
perl /alima9002/program/snpEff/scripts/vcfEffOnePerLine.pl Variant.vcf.INDEL.snpEff
+
 
+
===Amore SNP typing===
+
To check the IT182932 mapping depth, SNP typing was performed on the unfiltered VCF file.
+
  python ~/py/Reseq/\[Reseq\]SNP_counter.py Variant.vcf.SNP /alima9002/ref/Gmax/assembly/Gmax_275_v2.0.fa 30 > Variant.vcf.SNP.type
+
 
+
==2017.1.17==
+
===OrthoMcl for Jat Organ===
+
blastp -db goodProteins.fasta -query goodProteins.fasta -outfmt 6 -out goodProteins.fasta.allvall.jat.organ -num_threads 15 -evalue 1e-5 -seg yes -soft_masking true -max_target_seqs 999999999
+
===Drawing Venn diagram using JatSp orthomcl result===
+
D:\Lab work\Jatropha\JatSp_Orthomcl_Venn
+
===Re-SNP typing of amore study===
+
Found that the command was wrong
+
cat Variant.vcf.SNP | python ~/py/Reseq/\[Reseq\]SNP_counter.py /alima9002/ref/Gmax/annotation/Gmax_275_Wm82.a2.v1.gene_exons.gff3 /alima9002/ref/Gmax/assembly/Gmax_275_v2.0.fa 30 > Variant.vcf.SNP.type
+

Latest revision as of 02:22, 18 December 2017

2017 Jan Taeyoung Lab note

2017 Feb Taeyoung Lab note

2017 Mar Taeyoung Lab note

2017 May Taeyoung Lab note

2017 Sep Taeyoung Lab note

2017 Oct Taeyoung Lab note

2017 Nov Taeyoung Lab note

2017 Dec Taeyoung Lab note