Difference between revisions of "2017 Taeyoung Lab note"

From Crop Genomics Lab.

Revision as of 00:55, 17 January 2017


Ongoing

1. TBLASTX using Jat Species Transcriptome

2017 1.2

Jatropha transcriptome Trinity assembly

raw data : 244:/NGS/NGS/JatrophaCurcas/RNA
Jatropha species transcriptome assembly : Jct, Jcu, Jin, Jgo, Jci, Jpo, Jmu, Jma, Jac, Rco (listed in 244:/NGS/NGS/JatrophaCurcas/RNA/list). All done
Jatropha organ transcriptome assembly : Leaf, Root, Stem, Female flower, Male flower, LG, SG, Y, B. All done

CD-HIT

193:/data2/alima90/program/cdhit/cd-hit -i Y.cds.fa -M 10000 -o Y.cds.fa.cdhit -T 5
193:/data2/alima90/program/cdhit/cd-hit -i LG.cds.fa -M 10000 -o LG.cds.fa.cdhit -T 5

UV GBS mapping (w/ joinmap)

244:python vcf.parsing.for.mandf.py UV.vcf.SNPonly 3 0.01 except_sample.txt > UV.vcf.SNPonly.except.LowDepthSample.d3.Q30.m0.1.loc
The loc file was manually edited in Excel
The genetic map was constructed using JoinMap 4.1

KaKs calculation using scripts provided with MCScanX

KaKs calculation between Jatropha species
244:python /alima9002/63_backup/Jatropha/CDS/run.kaks.py

Large Insertion Prediction

LIP short primer preparation
Primer info
>LIP01short_F
AACTGAACACAGACAATGAA
>LIP01short_R
CAATTTATACACCACCTTAC
>LIP02short_F
CTCTTTGTATTTGGTGACAA
>LIP02short_R
GTATTAGCAGCTTTTGCTTA
>LIP03short_F
AATTGTAAGACATATCCCTC
>LIP03short_R
CTGCCCCACTAATAATTAAT
>LIP04short_F
TAAAAACAGAACTTGTCCAC
>LIP04short_R
ATCACAAGACTGAACAAGTA
>LIP05short_F
ATTGACATAAGGTTGCATAG
>LIP05short_R
CCTTAGCTCTTTTCTTTTGT
>LIP06short_F
GAAGGAAGGAAGCAATTATT
>LIP06short_R
TGACTTACCCTTTTTACCTT
>LIP07short_F
CACATGTTTGTCACTCTAAT
>LIP07short_R
GAAGTGAGGCCTAAAATAAA
>LIP08short_F
GAATGTATTGTCTTTGATCC
>LIP08short_R
GTTGGATTTTGTTCTTTCCA
>LIP09short_F
AGAAAAACGTCGATACCAAA
>LIP09short_R
CGATTTAGTAACCTTAGAAC
>LIP10short_F
ATCTTCAAAATGTCTCTAGG
>LIP10short_R
TACAGATATTCTTAGGCAGT
>LIP11short_F
TGTAACTCTCAATTAAGCAG
>LIP11short_R
ATCTTTCTGTAAGCACTTAG
>LIP12short_F
CTAGAACCGATTTGTTCAAA
>LIP12short_R
GCAGTTGTTTTGGATTAACA
>LIP13short_F
AAAGAGAAAGCAGAGAAATC
>LIP13short_R
ATGTATAGATTGGAGGAAAG
>LIP14short_F
ATTATGGAAAGGAATTGGAG
>LIP14short_R
CCATGTCTAGTATTTACTCA
>LIP15short_F
TTAATGACTGATCGTTAGTG
>LIP15short_R
CGGGAGTTATGAAAAATAGT
>LIP24short_F
AGTATGGTTTCAACATATGG
>LIP24short_R
GATATGAAGTTGACATGCTA
>LIP16short_F
ATTTAAAAGCTCGTAACTCC
>LIP16short_R
GGATAAGCAATTACAACACA
>LIP17short_F
CCCAAATTTTTAAATGCACC
>LIP17short_R
CTCTTGGAACGTGAAAAATT
>LIP18short_F
TTTTCTAGAAGGATTTGTGC
>LIP18short_R
CCATGCAAACCCAATTTTAA
>LIP19short_F
GTAAAACTAAGGTTGAGCTA
>LIP19short_R
CCACAAGTCACAACAATTTA
>LIP20short_F
TTATTTGTATGTTGGAGACC
>LIP20short_R
CATGGTATATAGGTTTAGGT
>LIP21short_F
CATAGAGAGTTTTGGATTAC
>LIP21short_R
AAAGAACTGATAGTGTCATG
>LIP22short_F
ATATGTACATGTATGGTGTG
>LIP22short_R
CCTAAATCTAGCAGAAGATT
>LIP23short_F
ATGTATGGAGAAATGGGTTA
>LIP23short_R
ATATAGAAATGGAGGTTGCT
(listed in BACKUP(J:)/박사/Indel Candidate/LIP_short_primer.fa)
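As a rough cross-check before the gradient PCR, a melting temperature can be estimated for each short primer. The sketch below uses the Wallace rule (2 °C per A/T, 4 °C per G/C), which is only a coarse approximation for short oligos and is not necessarily how the primers were designed here. The function names `read_fasta` and `wallace_tm` are illustrative, not part of any existing script.

```python
def wallace_tm(primer: str) -> int:
    """Rough melting temperature in deg C by the Wallace rule: 2*(A+T) + 4*(G+C)."""
    seq = primer.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


def read_fasta(text: str) -> dict:
    """Parse FASTA-formatted text into a {name: sequence} dictionary."""
    records = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:]
            records[name] = ""
        elif name is not None:
            # Sequence lines are appended to the most recent header.
            records[name] += line
    return records


if __name__ == "__main__":
    # First primer pair from the list above, used as sample input.
    example = (
        ">LIP01short_F\nAACTGAACACAGACAATGAA\n"
        ">LIP01short_R\nCAATTTATACACCACCTTAC\n"
    )
    for name, seq in read_fasta(example).items():
        print(name, len(seq), wallace_tm(seq))
```

To run it on the full list, the contents of LIP_short_primer.fa could be read into a string and passed to `read_fasta` in the same way.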
Primer dilution

2017. 1. 3

AMORE work

python ~/py/ret_fasta_by_gene_name.py   /alima9002/ref/Athaliana/annotation/Athaliana_167_TAIR10.cds.fa gene_list.txt > gene_list.txt.fa
blastp -db /alima9002/ref/Gmax/annotation/Gmax_275_Wm82.a2.v1.protein.fa -query gene_list.txt.pep.fa -evalue 1e-5 -num_alignments 1 -outfmt 6 -num_threads 6 -out gene_list.txt.pep.fablastp.Gm275.1e-5.out6

Homologs with Ath: Glyma.08G014900 Glyma.05G208300 Glyma.20G001900 Glyma.03G176600 Glyma.19G177400 Glyma.03G262600 Glyma.06G202300 Glyma.05G021800 Glyma.17G077700 Glyma.05G022000 Glyma.09G234900 Glyma.19G025000 Glyma.10G224000 Glyma.02G081000 Glyma.20G167800 Glyma.14G072700 Glyma.17G252200 Glyma.17G050500 Glyma.07G038000 Glyma.13G109100 Glyma.16G007200 Glyma.19G105100 Glyma.09G283800 Glyma.20G172700 Glyma.02G076300
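A homolog list like the one above can be pulled from the tabular BLAST output. The sketch below assumes the default 12-column `-outfmt 6` layout (qseqid, sseqid, pident, length, mismatch, gapopen, qstart, qend, sstart, send, evalue, bitscore) and keeps the highest-bitscore subject per query. The function name `best_hits` and the sample IDs are hypothetical, and this is not necessarily how the list was actually produced.

```python
import csv
import io


def best_hits(tabular_text: str, max_evalue: float = 1e-5) -> dict:
    """Return {query_id: subject_id} for the highest-bitscore hit of each query.

    Assumes the default 12-column BLAST -outfmt 6 layout.
    """
    best = {}
    for row in csv.reader(io.StringIO(tabular_text), delimiter="\t"):
        if len(row) < 12:
            continue  # skip blank or malformed lines
        query, subject = row[0], row[1]
        evalue, bitscore = float(row[10]), float(row[11])
        if evalue > max_evalue:
            continue
        if query not in best or bitscore > best[query][1]:
            best[query] = (subject, bitscore)
    return {query: subject for query, (subject, _) in best.items()}


if __name__ == "__main__":
    # Made-up rows in outfmt 6 layout, for illustration only.
    sample = (
        "q1\ts1\t90.0\t100\t10\t0\t1\t100\t1\t100\t1e-50\t200\n"
        "q1\ts2\t80.0\t100\t20\t0\t1\t100\t1\t100\t1e-20\t100\n"
        "q2\ts3\t50.0\t50\t25\t0\t1\t50\t1\t50\t0.5\t30\n"
    )
    print(best_hits(sample))
```

With `-num_alignments 1` each query already reports a single subject, so the bitscore comparison mainly guards against multiple HSP lines for the same query.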

SNP typing among IT182932, IT109098, Hwangkeum-Kong

1. Read mapping using bwa mem with default options (/home/hayasen/Workspace/Glycine/GlycineMax/ver275/Reads/)

2. mpileup

samtools mpileup -f /alima9002/ref/Gmax/assembly/Gmax_275_v2.0.fa -v -t DP,AD,ADF,ADR,SP,INFO/AD,INFO/ADF,INFO/ADR -u -b bam_list | bcftools call -v -m -O v > Variant.vcf

LIP short primer gradient PCR

Primers 1~8 were tested with CS-12

Gradient: lower temperature 50 °C, upper temperature 65 °C

Samples were loaded on a 1% agarose gel and run at 100 V for 1 hour.

Gradient temperatures (°C): 52.7, 54.1, 55.5, 56.8, 59.5, 60.9, 62.3

[Gel images: primers 1, 2 / primers 3, 4]


LIPshort1 -> error during loading

LIPshort4 -> only samples amplified at 55.5~59.5 °C were loaded

[Gel images: primers 5, 6 / primers 7, 8]

Estimated Tm: 55.5~56.8 °C


2017.1.4~2017.1.6

Farm trip

LIP Gradient PCR

50~65 °C

1% agarose gel, 100 V, 1 h


All good


LIP16's lower band is our target

LIP20 showed no band

2017.1.9

AMORE(GK, IT182932, IT109098)

VCF parsing

python ~/py/Reseq/filter.vcf.by.phred.hetero.depth.py Variant.vcf.SNP 5 > Variant.vcf.SNP.filtered.d5.Q30.homo
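The filtering step can be sketched as below. This is not the actual filter.vcf.by.phred.hetero.depth.py; it is a minimal illustration that assumes, from the output name (d5.Q30.homo), that records are kept when QUAL is at least 30 and every sample has a called, homozygous genotype with DP of at least 5. The function names and the assumption that GT and DP are present in the FORMAT column are hypothetical.

```python
def passes_filter(vcf_line: str, min_depth: int = 5, min_qual: float = 30.0) -> bool:
    """True if QUAL >= min_qual and every sample is homozygous with DP >= min_depth."""
    fields = vcf_line.rstrip("\n").split("\t")
    if len(fields) < 10:
        return False  # no sample columns
    qual = fields[5]
    if qual == "." or float(qual) < min_qual:
        return False
    keys = fields[8].split(":")
    if "GT" not in keys or "DP" not in keys:
        return False
    gt_i, dp_i = keys.index("GT"), keys.index("DP")
    for sample in fields[9:]:
        values = sample.split(":")
        if len(values) <= max(gt_i, dp_i):
            return False  # truncated sample field
        alleles = values[gt_i].replace("|", "/").split("/")
        # Reject missing calls and heterozygous genotypes.
        if "." in alleles or len(set(alleles)) != 1:
            return False
        if not values[dp_i].isdigit() or int(values[dp_i]) < min_depth:
            return False
    return True


def filter_vcf(lines, min_depth: int = 5, min_qual: float = 30.0):
    """Yield header lines unchanged and data lines that pass the filter."""
    for line in lines:
        if line.startswith("#") or passes_filter(line, min_depth, min_qual):
            yield line


if __name__ == "__main__":
    # Made-up records for illustration only.
    demo = [
        "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tS1\tS2\n",
        "Chr01\t100\t.\tA\tG\t50\t.\t.\tGT:DP\t0/0:10\t1/1:7\n",
        "Chr01\t200\t.\tC\tT\t50\t.\t.\tGT:DP\t0/1:10\t1/1:7\n",
    ]
    print("".join(filter_vcf(demo)), end="")
```

Requiring every sample to pass is a strict choice; a per-sample mask would retain more sites at the cost of missing genotypes in the table.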

Typing

cat Variant.vcf.SNP.filtered.d5.Q30.homo.diff | python ~/py/Reseq/\[Reseq\]SNP_counter.py /alima9002/ref/Gmax/annotation/Gmax_275_Wm82.a2.v1.gene_exons.gff3 /alima9002/ref/Gmax/assembly/Gmax_275_v2.0.fa 30 > Variant.vcf.SNP.filtered.d5.Q30.homo.diff.type

Jatropha OrthoMCL

Retrieve only complete peptide sequences for OrthoMCL

2017.1.10

AMORE(GK, IT182932, IT109098)

Filtering SNPs in homologs

python ret_homolog_only.py Variant.vcf.SNP.filtered.d5.Q30.homo.diff.type homologs.txt > Variant.vcf.SNP.filtered.d5.Q30.homo.diff.type.homologs.only

Synonymous or nonsynonymous typing

python ~/py/Reseq/\[Reseq\]det_syn.py /alima9002/ref/Gmax/annotation/Gmax_275_Wm82.a2.v1.gene_exons.gff3 /alima9002/ref/Gmax/assembly/Gmax_275_v2.0.fa Variant.vcf.SNP.filtered.d5.Q30.homo.diff.type.homologs.only.CDS > Variant.vcf.SNP.filtered.d5.Q30.homo.diff.type.homologs.only.CDS.SynNonsyn
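The core test behind synonymous/nonsynonymous typing can be sketched as follows. This is not the det_syn.py script itself: it assumes the reference codon and the SNP position within that codon have already been worked out from the GFF3 and the genome (including reverse-complementing for minus-strand genes), and it uses the standard genetic code. The function name `classify_snp` is illustrative.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (third base varies fastest).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_snp(ref_codon: str, pos_in_codon: int, alt_base: str) -> str:
    """Classify a single-base change inside a codon.

    ref_codon    -- three-letter reference codon on the coding strand
    pos_in_codon -- 0-based position of the SNP within the codon
    alt_base     -- alternative base on the coding strand
    """
    ref_codon = ref_codon.upper()
    alt_codon = (
        ref_codon[:pos_in_codon] + alt_base.upper() + ref_codon[pos_in_codon + 1:]
    )
    if CODON_TABLE[ref_codon] == CODON_TABLE[alt_codon]:
        return "synonymous"
    return "nonsynonymous"


if __name__ == "__main__":
    print(classify_snp("GAA", 2, "G"))  # GAA (Glu) -> GAG (Glu)
    print(classify_snp("GAA", 0, "A"))  # GAA (Glu) -> AAA (Lys)
```

Changes to or from a stop codon are reported as nonsynonymous here; a fuller version might label them separately as nonsense or stop-lost.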

Lactuca indica CD-HIT

 /alima9002/program/cd-hit-v4.6.4-2015-0603/cd-hit -i L.Trinity.fasta -o L.Trinity.fasta.cdhit -T 10 -M 10000

2017.1.11

Farm trip

2017.1.12

Amore

Make SNP tables

INDEL analysis using snpEff

java -jar /alima9002/program/snpEff/snpEff.jar ann -c /alima9002/program/snpEff/snpEff.config -ud 1000 gmax275 Variant.vcf.INDEL > Variant.vcf.INDEL.snpEff

Homolog filtering using the annotation file

Filter the VCF by homologs retrieved from the annotation file

python ret_homolog_only.py Variant.vcf.SNP.filtered.d5.Q30.homo.diff.type homologs.by.ann.txt > Variant.vcf.SNP.filtered.d5.Q30.homo.diff.type.hom.ann

Determining synonymous or nonsynonymous

python ~/py/Reseq/\[Reseq\]det_syn.py /alima9002/ref/Gmax/annotation/Gmax_275_Wm82.a2.v1.gene_exons.gff3 /alima9002/ref/Gmax/assembly/Gmax_275_v2.0.fa Variant.vcf.SNP.filtered.d5.Q30.homo.diff.type.hom.ann.CDS > Variant.vcf.SNP.filtered.d5.Q30.homo.diff.type.hom.ann.CDS.SynNonsyn

Jatropha KaKs

python parsing.all.kaks.py all.kaks > all.kaks.ksonly

Drawing graph using R

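library(ggplot2)  # library() stops with an error if ggplot2 is not installed
# Input has two columns and no header: species label and Ks value
data<-read.table("all.kaks.ksonly",header=FALSE)
colnames(data)<-c("Species","Ks")
# Frequency polygon of Ks per species, bin width 0.01, x axis limited to 0~0.8
ggplot(data,aes(Ks,colour=Species))+geom_freqpoly(binwidth=0.01)+scale_x_continuous(limits=c(0,0.8))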

2017.1.13

Farm trip (pod lwt)

2017.1.16

Jatropha Ks value using transcriptome

Jatropha species sequences that were not clustered by CD-HIT were used for TBLASTX

tblastx -db Jct.cds.fa.complete.fa -query Jgo.cds.fa.complete.fa -evalue 1e-10 -outfmt 6 -num_alignments 5 -out Jct.tblastx.nocdhit.Jgo.1e-10.out6 -num_threads 8

Amore snpEff

Split the result so that each effect is on its own line

perl /alima9002/program/snpEff/scripts/vcfEffOnePerLine.pl Variant.vcf.INDEL.snpEff

Amore SNP typing

To check the IT182932 mapping depth, SNP typing was performed on the unfiltered VCF file

 python ~/py/Reseq/\[Reseq\]SNP_counter.py Variant.vcf.SNP /alima9002/ref/Gmax/assembly/Gmax_275_v2.0.fa 30 > Variant.vcf.SNP.type