Version 2.0

Version 2.0.0

This version adds the following features, outlined below. Please note that pVACtools 2.0 is not backwards-compatible and certain changes will break old workflows.

Breaking changes

  • pVACtools now supports variable epitope lengths for class II prediction algorithms. The previous option --epitope-length (-e) no longer exists. It has been replaced with --class-i-epitope-length (-e1) and --class-ii-epitope-length (-e2) for class I and class II epitope lengths, respectively. The defaults are [8, 9, 10, 11] and [12, 13, 14, 15, 16, 17, 18], respectively.

  • The --peptide-sequence-length option has been removed. The peptide sequence length is now determined by the epitope length(s) to determine the flanking sequence length before and after the mutation.

  • pVACtools no longer depends on conda. pVACtools remains compatible with Python 3.5 and above but users may chose any environment manager to set up an appropriate Python environment.

  • When using standalone IEDB, pVACtools is now only compatible with IEDB 3.1 and above. Please see Installation for instructions on installing the latest IEDB version.

  • pVACseq is no longer dependent on annotations with the VEP Downstream plugin. This dependency has been replaced with the VEP Frameshift plugin. This requires changes to your existing VEP installation in order to install the Frameshift plugin. Existing VCFs that were previously annotated to work with pVACtools 1.5 and below will no longer work with version 2.0 and above and will need to be reannotated. Please see our documentation on Annotating your VCF with VEP for more information.

  • The filtered.condensed.tsv report has been removed and replaced with the all_epitopes.aggregated.tsv report. We believe that this new report will provide a more comprehensive and easier to understand summary of your results. Please see the Output Files sections of each tool for more information on this new report.

New features

  • pVACtools now provides binding affinity percentile rank information, in addition to the raw ic50 binding affinity values. Users may filter on the percentile rank by using the new --percentile-threshold argument.

  • Users now have the option of calculating the reference proteome similarity of their filtered epitopes. For this, the peptide sequence for the remaining variants is mapped to the reference proteome using BLAST. Variants where this yields a hit to a reference proteome are marked accordingly and a .reference_matches file provides more information about the matches. This option can be enabled using the --run-reference-proteome-similarity option.

  • Users may now use the options all, all_class_i, or all_class_ii instead of specific prediction algorithms in order to run all prediction algorithms, all class I prediction algorithms, or all class II prediction algorithms, respectively.

  • For successful pVACvector runs, we now output a _results.dna.fa file with the most likely nucleic acid sequence for the predicted vector.

Minor Updates

  • When running pVACseq with a proximal variants VCF we would previously assume that your ran VEP with the --pick option and only process the first transcript annotation for a variant. With this update we will now associate the correct transcript for a proximal variant with the matching transcript of the main somatic variant of interest.

  • The pvacseq generate_protein_fasta command now allows users to provide a proximal variants VCF using the --phased-proximal-variants-vcf option.

  • The pvacseq generate_protein_fasta command now supports multi-sample VCFs. Users may use the --sample-name to provide the sample name of the sample they wish to process.

  • pVACseq and pVACfuse would previously error out if the intermediate TSV parsed from the input was empty. In 2.0 the tool will no longer error out but exit with an appropriate message.

  • pVACvector would previously error out when no valid path was found. In 2.0 pVACvector will not longer error out but exit with an appropriate message.

  • We now set consistent file permissions on all output files.

  • We’ve updated our license to BSD 3-Cause Clear. Please note that the individual licenses of our dependent tools remain in place. These can be viewed by on the Tools Used By pVACtools page.

Version 2.0.1

This is a bugfix release. It fixes the following problem(s):

  • NetMHCstabpan and NetCons have moved to a new server resulting in no results being returned from the old server URL. This results in empty filtered.tsv report files when either the --netmhc-stab or --net-chop-method were enabled. This release fixes our usage of these tools to use the new server URL.

Version 2.0.2

This is a bugfix release. It fixes the following problem(s):

  • There was a bug in the aggregate report creation. When parsing the all_epitopes file as input to the report creation any N/A amino acid changes would be parsed as null values. This release fixes this issue so that these amino acid change values are correctly parsed as string values.

  • This release removes support for Python 3.5 since that Python version has reached end of life and is no longer supported.

Version 2.0.3

This is a bugfix release. It fixes the following problem(s):

  • A bug in the reference proteome similarity step would cause this step to fail if the full wildtype peptide sequence of the frameshift was longer than its full mutant peptide sequence. This release fixes this issue.

  • A bug in the top score filter would cause this step to fail if it encountered transcripts that do not start with ENS. Support for transcripts that start with NM_ has been added in this release and a more descriptive error message will now be raised if an unsupported transcript name is encountered.

  • This release adds some minor improvements to the reference proteome similarity step. A wait of 10 seconds was added after calling the BLAST API to comply with their usage guidelines. Word size and gapcost parameters were also added to these calls to improve result specificity.

Version 2.0.4

This is a bugfix release. It fixes the following problem(s):

  • Failed calls to the NetChop and NetMHCstab API were not being caught correctly because failures would still result in a 200 return code. This would ultimately result in empty filtered report files. This release adds more error checking around the returned content from these APIs and will fail if the content is not formatted as expected.

  • This release adds handling of some more VCF edge cases that were previously unsupported. Variant transcripts that are annotated with * in the wildtype protein sequence or that have a stop_retained_variant consequence are now skipped. In addition, some variants may encode their postion as -/1234, which was previsouly not supported but has now been added.

  • When running pVACseq, pVACbind, or pVACfuse with the --run-reference-proteome-similarity option enable this step would create a reference matches file but the pipeline previously failed to copy this file into the output directory. This release fixes that issue.

  • keras is now pinned to version 2.4.3 since newer versions might not be compatible with the pinned tensorflow version.

Version 2.0.5

This is a bugfix release. It fixes the following problem(s):

  • Some users have reported “Cannot open file” errors when running NetMHCstabpan. This release adds a retry when this error in encountered.

  • This release adds stricter checking to pVACbind for unsupported amino acids. Sequences containing an unsupported amino acid will be skipped. The following amino acids are supported: A, R, N, D, C, E, Q, G, H, I, L, K, M, F, P, S, T, W, Y, V.

  • Some VEP predictions for supported variant types might not contain any protein position information, rendering pVACseq unable to parse such annotations. Annotations without protein position information will now be skipped.

Version 2.0.6

This is a bugfix release. It fixes the following problem(s):

  • When running pVAcseq with a proximal variants VCF, proximal DNPs affecting multiple amino acids were not handled correctly and would result in an error. This issue has now been fixed.

Version 2.0.7

This is a bugfix release. It fixes the following problem(s):

  • This releases fixes an edge case that would result in an error when the proximal variant VCF didn’t contain a region from the somatic VCF.