vBestIdentityPercent value NaN #1828

Open
kwuiee opened this issue Oct 16, 2024 · 6 comments

@kwuiee

kwuiee commented Oct 16, 2024

Hi,

I am using MiXCR for some SHM calculations. In one of my samples, a clone has a vBestIdentityPercent value of NaN. How can this happen, and is NaN an expected value here?

cloneId: 3
cloneCount: 209884
cloneFraction: 0.294703
targetSequences: CCCTGAGACTCTCCTGTGCAG,GGCGGATCCCTGAGACTCTCCTGTGCAGCCTCTGGATTCACCTTCAGTGACTACTACATGATTTGGATCCGCCAGGCTCCAGGGAAGGGGCTGGAGTGGGTTTCATACATTAGTAATGATTATATCAAATACTACGCAGACTCTGTGAAGGGCCGATTCACCATCTCCAGGGACAACGCCAAGAACTCACTGTATCTGCAAATGAACAGCCTGAGAGCCGAGGACACGGCCGTGTATTACTGTGCGACCGGGTGGGGACACGGTATGGACGTCTGGGGCCAAGGGACCACGGTCACCGTCTCCTCAGGTAACGTCAGCTG
targetQualities: [[[[[[[[[[[[[[[[[[[[[,0.-3+++40212<;798<67:8?8@@>B=C::;@=:;:E>:=8;>?=;:8?:>=;=:?>=<::@C>:;96655<@3:6,2*4<++9,26/3:;9;-:27693:D?::9=<><7;,,,+113,;=88<:;3@<>>968<@;=;>:<:>::><<99<09<@<+?=:;9;<=@%B<B<==9<A==9<<:9C<7;:>>::;8AA>@@><;;=9<@;B<B;?1??=6C<=<<@aa>9;6?A==>;------------------------------------A?@@>=>C5;@<(%?9=5>:):4>));.).:;---3-,4;/117
allVHitsWithScore: IGHV3-11*00(1447.5)
allDHitsWithScore: IGHD2-21*00(31),IGHD6-19*00(30),IGHD7-27*00(30)
allJHitsWithScore: IGHJ6*00(339.2)
allCHitsWithScore: IGHG1*00(0),IGHG2*00(0),IGHG3*00(0),IGHG4*00(0),IGHGP*00(0)
allVAlignments: 291|291|570|21|21||0.0,296|546|570|0|247|SA298CSG301ASG357TSC358TSG411ASA413GSG414ADG416DG417DG420SC423TST426A|2118.0
allDAlignments: ,38|47|84|250|259|ST43G|31.0;,19|25|63|247|253||30.0;,16|22|33|252|258||30.0
allJAlignments: ,40|83|83|264|307|SA55C|401.0
allCAlignments: ,;,;,;,;,
nSeqFR1:
minQualFR1:
nSeqCDR1: GGATTCACCTTCAGTGACTACTAC
minQualCDR1: 23
nSeqFR2: ATGATTTGGATCCGCCAGGCTCCAGGGAAGGGGCTGGAGTGGGTTTCATAC
minQualFR2: 9
nSeqCDR2: ATTAGTAATGATTATATCAAA
minQualCDR2: 10
nSeqFR3: TACTACGCAGACTCTGTGAAGGGCCGATTCACCATCTCCAGGGACAACGCCAAGAACTCACTGTATCTGCAAATGAACAGCCTGAGAGCCGAGGACACGGCCGTGTATTAC
minQualFR3: 4
nSeqCDR3: TGTGCGACCGGGTGGGGACACGGTATGGACGTCTGG
minQualCDR3: 12
nSeqFR4: GGCCAAGGGACCACGGTCACCGTCTCCTCAG
minQualFR4: 4
aaSeqFR1:
aaSeqCDR1: GFTFSDYY
aaSeqFR2: MIWIRQAPGKGLEWVSY
aaSeqCDR2: ISNDYIK
aaSeqFR3: YYADSVKGRFTISRDNAKNSLYLQMNSLRAEDTAVYY
aaSeqCDR3: CATGWGHGMDVW
aaSeqFR4: GQGTTVTVSS_
refPoints: :::::::::::::::::::::,:::::33:57:108:129:240:-4:247:250:-10:-9:259:264:-20:276:307::
vBestIdentityPercent: NaN
vIdentityPercents: NaN

best
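
To see how many clones are affected, the NaN rows can be pulled out of the exported table by column name rather than by position. The following is a minimal sketch, assuming the exportClones output is a tab-separated file with a header row; the function name `nan_clones` is hypothetical and not part of MiXCR.

```shell
#!/bin/sh
# Print cloneId and vBestIdentityPercent for every clone whose
# vBestIdentityPercent is the literal string "NaN".
# Assumption: the input is tab-separated and its first line is the header.
nan_clones() {
    awk -F'\t' '
        NR == 1 {
            # Map each header name to its column index.
            for (i = 1; i <= NF; i++) col[$i] = i
            if (!("cloneId" in col) || !("vBestIdentityPercent" in col)) exit 1
            next
        }
        # Comparing against a string constant forces a string comparison.
        $col["vBestIdentityPercent"] == "NaN" {
            print $col["cloneId"] "\t" $col["vBestIdentityPercent"]
        }
    ' "$1"
}
```

Calling `nan_clones` on an exportClones output file prints one line per affected clone; an empty result means no clone carries a NaN identity.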

@mizraelson
Member

Hi, can you share the exact commands you used starting from analyze?

@kwuiee
Author

kwuiee commented Oct 16, 2024

> Hi, can you share the exact commands you used starting from analyze?

The commands that produced the NaN value are shown below:

java -Xmx15g -Xms15g  -Djava.io.tmpdir=/tmp -jar /bioapp/mixcr.jar align \
    -t 12 --species hsa -p kaligner2 -OallowPartialAlignments=false -OvParameters.geneFeatureToAlign=VGeneWithP  \
    --report sample.alignment.log \
    sample_1.clean.fq.gz \
    sample_2.clean.fq.gz \
    sample.alignment.vdjca -f 

java -Xmx100g -Xms30g  -Djava.io.tmpdir=/tmp -jar /bioapp/mixcr.jar assemble \
    --write-alignments -f -t 12 \
    sample.alignment.vdjca \
    sample.alignment.clna

java -Xmx100g -Xms30g  -Djava.io.tmpdir=/tmp -jar /bioapp/mixcr.jar assembleContigs \
    -f -t 12 \
    sample.alignment.clna \
    sample.full.alignment.clns

java -Xmx100g -Xms30g  -Djava.io.tmpdir=/tmp -jar /bioapp/mixcr.jar exportClones \
    --chains IGH -f --preset full -vBestIdentityPercent -vIdentityPercents \
    sample.full.alignment.clns \
    sample_heavyChain_full.xls

By the way, my MiXCR version is 3.0.13, I think.

@mizraelson
Member

Can you please try the latest version and confirm whether the issue is still present? Also, what type of data is it: bulk RNA-seq?

@kwuiee
Author

kwuiee commented Oct 16, 2024

It is bulk DNA-seq data. Oddly, when I run the same commands on roughly half of the reads from the FASTQ files, the value is not NaN. In that case the clone's targetSequences is CGGCGGATCCCTGAGACTCTCCTGTGCAGCCTCTGGATTCACCTTCAGTGACTACTACATGATTTGGATCCGCCAGGCTCCAGGGAAGGGGCTGGAGTGGGTTTCATACATTAGTAATGATTATATCAAATACTACGCAGACTCTGTGAAGGGCCGATTCACCATCTCCAGGGACAACGCCAAGAACTCACTGTATCTGCAAATGAACAGCCTGAGAGCCGAGGACACGGCCGTGTATTACTGTGCGACCGGGTGGGGACACGGTATGGACGTCTGGGGCCAAGGGACCACGGTCACCGTCTCCTCAGGTAACGTCAGCTG, which lacks the separate CCCTGAGACTCTCCTGTGCAG, fragment present in the NaN case. I will try the latest version.
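
One thing visible in the clone row posted at the top of this issue: the first entry of allVAlignments is `291|291|570|21|21||0.0`, where the first two numbers are equal and the score is 0.0. The sketch below flags such entries. It assumes each comma-separated entry is laid out as refFrom|refTo|refLength|readFrom|readTo|mutations|score; that layout is an assumption here, and the function name `zero_length_alignments` is hypothetical. Whether an empty alignment is what produces the NaN is not confirmed anywhere in this thread.

```shell
#!/bin/sh
# Report comma-separated alignment entries whose reference span is empty
# (from == to). Intended for a single-hit field such as the allVAlignments
# value in this issue.
# Assumption: each entry has seven "|"-separated fields, with the reference
# start and end in fields 1 and 2 and the score in field 7.
zero_length_alignments() {
    printf '%s\n' "$1" | tr ',' '\n' | awk -F'|' '
        NF >= 7 && $1 == $2 {
            print "entry " NR ": empty alignment (score " $7 ")"
        }
    '
}
```

For the allVAlignments value from this issue, `zero_length_alignments '291|291|570|21|21||0.0,296|546|570|0|247|SA298CSG301ASG357TSC358TSG411ASA413GSG414ADG416DG417DG420SC423TST426A|2118.0'` prints `entry 1: empty alignment (score 0.0)`.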

@mizraelson
Member

mizraelson commented Oct 16, 2024

With the latest version, you can use the following command:

mixcr analyze exome-seq \
    --species hsa \
    --append-export-clones-field -vIdentityPercents \
    --append-export-clones-field -vBestIdentityPercent \
      input_R1.fastq.gz \
      input_R2.fastq.gz \
      result

@kwuiee
Author

kwuiee commented Oct 16, 2024

> with the latest version you can use the following command:
>
> mixcr analyze exome-seq \
>     --species hsa \
>       input_R1.fastq.gz \
>       input_R2.fastq.gz \
>       result

Thanks a lot
