Version 2.0
===========

Version 2.0.0
-------------

This version adds the features outlined below. Please note that
pVACtools 2.0 is not backwards-compatible and certain changes will break old
workflows.

Breaking changes
________________

- pVACtools now supports variable epitope lengths for class II prediction algorithms. The previous option
  ``--epitope-length`` (``-e``) no longer exists. It has been replaced with
  ``--class-i-epitope-length`` (``-e1``) and ``--class-ii-epitope-length``
  (``-e2``) for class I and class II epitope lengths, respectively. The
  defaults are ``[8, 9, 10, 11]`` and ``[12, 13, 14, 15, 16, 17, 18]``,
  respectively.
- The ``--peptide-sequence-length`` option has been removed. The peptide
  sequence length, and with it the flanking sequence length before and after
  the mutation, is now determined by the epitope length(s).
- pVACtools no longer depends on conda. pVACtools remains compatible with
  Python 3.5 and above but users may choose any environment manager to set up
  an appropriate Python environment.
- When using standalone IEDB, pVACtools is now only compatible with IEDB 3.1
  and above. Please see :ref:`install` for instructions on installing the
  latest IEDB version.
- pVACseq is no longer dependent on annotations with the VEP Downstream
  plugin. This dependency has been replaced with the VEP Frameshift plugin.
  This requires changes to your existing VEP installation in order to install
  the Frameshift plugin. Existing VCFs that were previously annotated to work
  with pVACtools 1.5 and below will no longer work with version 2.0 and above
  and will need to be reannotated. Please see our documentation on :ref:`vep`
  for more information.
- The filtered.condensed.tsv report has been removed and replaced with the
  all_epitopes.aggregated.tsv report. We believe that this new report will
  provide a more comprehensive and easier to understand summary of your
  results. Please see the Output Files sections of each tool for more
  information on this new report.
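
As a rough illustration of how a flanking sequence length can follow from the
epitope length(s) once ``--peptide-sequence-length`` is gone, the sketch below
assumes that the flank on each side of the mutation is one residue shorter
than the longest requested epitope, so that every epitope window containing
the mutated position fits into the peptide. The function names and the exact
formula are assumptions made for illustration and are not taken from the
pVACtools source.

```python
def flanking_length(epitope_lengths):
    """Return the assumed number of flanking residues on each side of a
    single-residue mutation for the given epitope lengths."""
    if not epitope_lengths:
        raise ValueError("at least one epitope length is required")
    # Assumption: the longest epitope decides the flank. An epitope of
    # length n that still contains the mutated residue reaches at most
    # n - 1 residues to either side of it.
    return max(epitope_lengths) - 1


def peptide_window(sequence, mutation_index, epitope_lengths):
    """Cut the sub-peptide around a 0-based mutation index, clipped to the
    ends of the sequence."""
    flank = flanking_length(epitope_lengths)
    start = max(0, mutation_index - flank)
    end = min(len(sequence), mutation_index + flank + 1)
    return sequence[start:end]
```

Under this assumption, the class I defaults ``[8, 9, 10, 11]`` would give a
flank of 10 residues on each side of the mutation.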

New features
____________

- pVACtools now provides binding affinity percentile rank information, in
  addition to the raw IC50 binding affinity values. Users may filter on the
  percentile rank by using the new ``--percentile-threshold`` argument.
- Users now have the option of calculating the reference proteome similarity
  of their filtered epitopes. For this, the peptide sequence for the
  remaining variants is mapped to the reference proteome using BLAST. Variants
  where this yields a hit to a reference proteome are marked accordingly and a
  ``.reference_matches`` file provides more information about the matches.
  This option can be enabled using the ``--run-reference-proteome-similarity``
  option.
- Users may now use the options ``all``, ``all_class_i``, or ``all_class_ii``
  instead of specific prediction algorithms in order to run all prediction
  algorithms, all class I prediction algorithms, or all class II prediction
  algorithms, respectively.
- For successful pVACvector runs, we now output a ``_results.dna.fa`` file
  with the most likely nucleic acid sequence for the predicted vector.
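
The ``all``, ``all_class_i``, and ``all_class_ii`` keywords can be thought of
as shorthand that expands into a list of concrete algorithms. The sketch
below shows one way such an expansion could work. The algorithm lists are
deliberately short and illustrative, and the helper name
``expand_algorithms`` is hypothetical; neither reflects the actual pVACtools
implementation or the full set of supported algorithms.

```python
# Illustrative algorithm names only; consult the tool's help output for the
# list that your installation actually supports.
CLASS_I = ["MHCflurry", "NetMHC", "NetMHCpan", "SMM"]
CLASS_II = ["NetMHCIIpan", "NNalign", "SMMalign"]

GROUPS = {
    "all": CLASS_I + CLASS_II,
    "all_class_i": CLASS_I,
    "all_class_ii": CLASS_II,
}


def expand_algorithms(requested):
    """Replace group keywords with their members, dropping duplicates while
    keeping the order in which algorithms were first requested."""
    expanded = []
    for name in requested:
        # Names that are not group keywords are passed through unchanged.
        for algorithm in GROUPS.get(name, [name]):
            if algorithm not in expanded:
                expanded.append(algorithm)
    return expanded
```

Mixing a keyword with explicit names is handled by the de-duplication step,
so a request such as ``NetMHC all_class_i`` would list ``NetMHC`` only once.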

Minor Updates
_____________

- When running pVACseq with a proximal variants VCF we would previously assume
  that you ran VEP with the ``--pick`` option and only process the first transcript
  annotation for a variant. With this update we will now associate the correct
  transcript for a proximal variant with the matching transcript of the main
  somatic variant of interest.
- The ``pvacseq generate_protein_fasta`` command now allows users to provide a
  proximal variants VCF using the ``--phased-proximal-variants-vcf`` option.
- The ``pvacseq generate_protein_fasta`` command now supports multi-sample
  VCFs. Users may use the ``--sample-name`` option to provide the name of the
  sample they wish to process.
- pVACseq and pVACfuse would previously error out if the intermediate TSV
  parsed from the input was empty. In 2.0 the tool will no longer
  error out but exit with an appropriate message.
- pVACvector would previously error out when no valid path was found. In 2.0
  pVACvector will no longer error out but exit with an appropriate message.
- We now set consistent file permissions on all output files.
- We've updated our license to BSD 3-Clause Clear. Please note that the
  individual licenses of our dependent tools remain in place. These can be
  viewed on the :ref:`tools` page.

Version 2.0.1
-------------

This is a bugfix release. It fixes the following problem(s):

- NetMHCstabpan and NetChop have moved to a new server, resulting in no results
  being returned from the old server URL. This resulted in empty filtered.tsv
  report files when either the ``--netmhc-stab`` or ``--net-chop-method`` option
  was enabled. This release fixes our usage of these tools to use the new server URL.

Version 2.0.2
-------------

This is a bugfix release. It fixes the following problem(s):

- There was a bug in the aggregate report creation. When parsing the
  all_epitopes file as input to the report creation any ``N/A`` amino acid
  changes would be parsed as null values. This release fixes this issue so
  that these amino acid change values are correctly parsed as string values.
- This release removes support for Python 3.5 since that Python version has
  reached end of life and is no longer supported.

Version 2.0.3
-------------

This is a bugfix release. It fixes the following problem(s):

- A bug in the reference proteome similarity step would cause this step to
  fail if the full wildtype peptide sequence of the frameshift was longer
  than its full mutant peptide sequence. This release fixes this issue.
- A bug in the top score filter would cause this step to fail if it
  encountered transcripts that do not start with ``ENS``. Support for
  transcripts that start with ``NM_`` has been added in this release and a
  more descriptive error message will now be raised if an unsupported
  transcript name is encountered.
- This release adds some minor improvements to the reference proteome
  similarity step. A wait of 10 seconds was added after calling the BLAST API
  to comply with their usage guidelines. Word size and gapcost parameters were
  also added to these calls to improve result specificity.
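
The transcript handling described for the top score filter can be pictured as
a prefix check that fails loudly on anything it does not recognize. The
following is a minimal sketch of that idea only. The function name and the
``"ensembl"`` / ``"refseq"`` labels are hypothetical and do not come from the
pVACtools code base.

```python
def transcript_source(transcript_name):
    """Classify a transcript identifier by its prefix, or raise a
    descriptive error for unsupported identifiers."""
    if transcript_name.startswith("ENS"):
        return "ensembl"
    if transcript_name.startswith("NM_"):
        return "refseq"
    # Failing with the offending name makes the problem easy to trace back
    # to the input annotation.
    raise ValueError(
        "Unsupported transcript name '{}': expected an identifier starting "
        "with 'ENS' or 'NM_'".format(transcript_name)
    )
```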

Version 2.0.4
-------------

This is a bugfix release. It fixes the following problem(s):

- Failed calls to the NetChop and NetMHCstab APIs were not being caught
  correctly because failures would still result in a 200 return code. This
  would ultimately result in empty filtered report files. This
  release adds more error checking around the returned content
  from these APIs and will fail if the content is not formatted as expected.
- This release adds handling of some more VCF edge cases that were previously
  unsupported. Variant transcripts that are annotated with * in the wildtype
  protein sequence or that have a stop_retained_variant consequence are now
  skipped. In addition, some variants may encode their position as ``-/1234``,
  which was previously not supported but has now been added.
- When running pVACseq, pVACbind, or pVACfuse with the
  ``--run-reference-proteome-similarity`` option enabled, this step would create
  a reference matches file but the pipeline previously failed to copy this
  file into the output directory. This release fixes that issue.
- keras is now pinned to version 2.4.3 since newer versions might not be compatible
  with the pinned tensorflow version.

Version 2.0.5
-------------

This is a bugfix release. It fixes the following problem(s):

- Some users have reported "Cannot open file" errors when running
  NetMHCstabpan. This release adds a retry when this error is encountered.
- This release adds stricter checking to pVACbind for unsupported amino acids.
  Sequences containing an unsupported amino acid will be skipped. The
  following amino acids are supported: ``A``, ``R``, ``N``, ``D``, ``C``, ``E``,
  ``Q``, ``G``, ``H``, ``I``, ``L``, ``K``, ``M``, ``F``, ``P``, ``S``, ``T``,
  ``W``, ``Y``, ``V``.
- Some VEP predictions for supported variant types might not contain any
  protein position information, rendering pVACseq unable to parse such
  annotations. Annotations without protein position information will now be skipped.
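
To check input sequences against the supported amino acid list before a
pVACbind run, a small pre-filter along the following lines may help. This is
an independent sketch built from the list of supported amino acids above, not
the check that pVACbind itself performs; the helper names are hypothetical,
and it treats lowercase letters as unsupported, which is an assumption.

```python
# The 20 amino acids listed as supported in the release notes.
SUPPORTED_AMINO_ACIDS = frozenset("ARNDCEQGHILKMFPSTWYV")


def unsupported_residues(sequence):
    """Return the sorted, de-duplicated characters of a sequence that are
    not in the supported amino acid set."""
    return sorted(set(sequence) - SUPPORTED_AMINO_ACIDS)


def filter_supported(sequences):
    """Keep only sequences made up entirely of supported amino acids."""
    return [seq for seq in sequences if not unsupported_residues(seq)]
```

Reporting the offending characters through ``unsupported_residues`` makes it
easier to see why a given sequence would be skipped.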

Version 2.0.6
-------------

This is a bugfix release. It fixes the following problem(s):

- When running pVACseq with a proximal variants VCF, proximal DNPs affecting
  multiple amino acids were not handled correctly and would result in an error.
  This issue has now been fixed.

Version 2.0.7
-------------

This is a bugfix release. It fixes the following problem(s):

- This release fixes an edge case that would result in an error when the proximal
  variants VCF didn't contain a region from the somatic VCF.