Common Errors ------------- Input VCF Sample Information ____________________________ **VCF contains more than one sample but sample_name is not set.** pVACseq supports running with a multi-sample VCF as input. However, in this case it requires the user to pick the sample to analyze, as only variants that are called in the specified sample will be processed. When running a multi-sample VCF the ``sample_name`` parameter is used to identify which sample to analyze. Take, for example, the following ``#CHROM`` VCF header: .. code-block:: none #CHROM POS ID REF ALT QUAL FILTER INFO FORMAT NORMAL TUMOR This VCF contains two samples, ``NORMAL`` and ``TUMOR``. Use ``TUMOR`` as the ``sample_name`` parameter to process the tumor sample, and ``NORMAL`` to process the normal sample. If the input VCF only contains a single sample, the ``sample_name`` parameter does not need to match the sample name in the VCF. **sample_name not a sample ID in the #CHROM header of VCF** This error occurs when running a multi-sample VCF and the ``sample_name`` parameter doesn't match any of the sample IDs in the VCF ``#CHROM`` header. Take, for example, the following ``#CHROM`` header: .. code-block:: none #CHROM POS ID REF ALT QUAL FILTER INFO FORMAT NORMAL TUMOR All columns after ``FORMAT`` are sample identifiers that can be used as the ``sample_name`` parameter when running pVACseq, depending on which sample the user wishes to process. Change the ``sample_name`` parameter of your ``pvacseq run`` command to match one of them. **normal_sample_name not a sample ID in the #CHROM header of VCF** Your ``pvacseq run`` command included the ``--normal-sample-name`` parameter. However, the argument chosen did not match any of the sample identifiers in the ``#CHROM`` header of the input VCF. Take, for example, the following ``#CHROM`` VCF header: .. code-block:: none #CHROM POS ID REF ALT QUAL FILTER INFO FORMAT NORMAL TUMOR All columns after ``FORMAT`` are sample identifiers that can be used as the ``--normal-sample-name`` parameter when running pVACseq, depending on which sample is the normal sample in the VCF. Change the ``--normal-sample-name`` parameter of your ``pvacseq run`` command to match the appropriate sample identifier. **VCF doesn't contain any sample genotype information.** pVACseq uses the sample genotype to identified which variants were called. Therefore, while a VCF without a ``FORMAT`` and sample column(s) is valid, it cannot be used in pVACseq. You will need to manually edit your VCF and add a ``FORMAT`` and sample column with the ``GT`` genotype field. For more information on this formatting please see the `VCF specification `_ for your specific VCF version. Input VCF Compression and Indexing __________________________________ **Input VCF needs to be bgzipped when running with a proximal variants VCF.** When running pVACseq with the ``--proximal-variants-vcf`` argument, the main input VCF needs to be bgzipped and tabix indexed. See :ref:`the Input File Preparation section ` of the documentation for instructions on how to do so. **Proximal variants VCF needs to be bgzipped.** The VCF provided via the ``--proximal-variants-vcf`` argument needs to be bgzipped and tabix indexed. See :ref:`the Input File Preparation section ` of the documentation for instructions on how to do so. **No .tbi file found for input VCF. Input VCF needs to be tabix indexed if processing with proximal variants.** When running pVACseq with the ``--proximal-variants-vcf`` argument, the main input VCF needs to be bgzipped and tabix indexed. See :ref:`the Input File Preparation section ` of the documentation for instructions on how to do so. **No .tbi file found for proximal variants VCF. Proximal variants VCF needs to be tabix indexed.** The VCF provided via the ``--proximal-variants-vcf`` argument needs to be bgzipped and tabix indexed. See :ref:`the Input File Preparation section ` of the documentation for instructions on how to do so. Input VCF VEP Annotation ________________________ **Input VCF does not contain a CSQ header. Please annotate the VCF with VEP before running it.** pVACseq requires the input VCF to be annotated by VEP. The provided input VCF doesn't contain a ``CSQ`` ``INFO`` header. This indicates that it has not been annotated. :ref:`The Input File Preparation section ` of the documentation provides instructions on how to annotate your VCF with VEP. **VCF doesn't contain VEP FrameshiftSequence annotations. Please re-annotate the VCF with VEP and the Wildtype and Frameshift plugins.** Although the input VCF was annotated with VEP, it is missing the required annotations provided by the VEP Frameshift plugin. The input VCF will need to be reannotated using all of the required arguments as outlined in the :ref:`Input File Preparation section ` of the documentation. **VCF doesn't contain VEP WildtypeProtein annotations. Please re-annotate the VCF with VEP and the Wildtype and Frameshift plugins.** Although the input VCF was annotated with VEP, it is missing the required annotations provided by the VEP Wildtype plugin. The input VCF will need to be reannotated using all of the required arguments as outlined in the :ref:`Input File Preparation section ` of the documentation. **Proximal Variants VCF does not contain a CSQ header. Please annotate the VCF with VEP before running it.** When running pVACseq with the ``--proximal-variants-vcf`` argument, that proximal variants VCF needs to be annotated by VEP. The provided proximal variants VCF doesn't contain a ``CSQ`` ``INFO`` header. This indicates that it has not been annotated. :ref:`The Input File Preparation section ` of the documentation provides instructions on how to annotate your VCF with VEP. **There was a mismatch between the actual wildtype amino acid sequence and the expected amino acid sequence. Did you use the same reference build version for VEP that you used for creating the VCF?** This error occurs when the reference nucleotide at a specific position is different than the Ensembl transcript nucleotide at the same position. This results in the mutant amino acid in the ``Amino_acids`` VEP annotation being different from the amino acid of the transcript protein sequence as predicted by the Wildtype plugin. The ``Amino_acids`` VEP annotation is based on the reference and alternate nucleotides of the variant while the ``WildtypeProtein`` prediction is based on the Ensembl transcript nucleotide sequence. This points to a fundamental disagreement between the reference that was used during alignment and variant calling and the Ensembl reference. This mismatch cannot be resolved by pVACseq, which is why this error is fatal. Here are a few things that might resolve this error: - Checking that the build of the VEP cache matches the alignment build and downloading the correct cache if there is a build mismatch (such as a build 38 cache with a build 37 VCF, or vice versa) - Using the ``--assembly`` parameter during VEP annotation with the correct build version to match your VCF - Using the ``fasta`` parameter during VEP annotation with the reference used to create the VCF - Manually fixing the reference bases in your VCF to match the one expected by Ensembl - Realigning and redoing variant calling on your sample with a reference that matches what is expected by VEP If this mismatch cannot be resolved the VCF cannot be used by pVACseq. We create the ref-transcript-mismatch-reporter tool to identify and remove such variants from your VCF. The tool is available as part of `https://vatools.readthedocs.io/en/latest/ref_transcript_mismatch_reporter.html `_. Other _____ **The TSV file is empty. Please check that the input VCF contains missense, inframe indel, or frameshift mutations.** None of the variants in the VCF file are supported by pVACseq. This could be either because none of the variants have a protein-altering consequence or none of the variants are called in the sample (i.e. have a ``0/1`` or ``1/1`` genotype). If you are using the ``--pass-only`` option it might also be the case that all supported variants are filtered. **Illegal instruction (core dumped)** This issue may occur when you are trying to run the tensorflow-based prediction algorithms MHCnuggets and/or MHCflurry. This indicates that your computer's hardware does not support the version of tensorflow that is installed. Downgrading tensorflow manually to version 1.5.0 (``pip install tensorflow==1.5.0``) should solve this problem. .. A proximal variants TSV output path and peptide length need to be specified if a proximal variants input VCF is provided.") 'Failed to extract format string from info description for tag (CSQ)') "Warning: TSV index already exists: {}".format(index)) 'Duplicate TSV indexes')