Optional Downstream Analysis Tools
Generate Aggregated Report
usage: pvacbind generate_aggregated_report [-h] [-b BINDING_THRESHOLD]
[--allele-specific-binding-thresholds]
[--binding-percentile-threshold BINDING_PERCENTILE_THRESHOLD]
[--immunogenicity-percentile-threshold IMMUNOGENICITY_PERCENTILE_THRESHOLD]
[--presentation-percentile-threshold PRESENTATION_PERCENTILE_THRESHOLD]
[--percentile-threshold-strategy {conservative,exploratory}]
[--aggregate-inclusion-binding-threshold AGGREGATE_INCLUSION_BINDING_THRESHOLD]
[--aggregate-inclusion-count-limit AGGREGATE_INCLUSION_COUNT_LIMIT]
[-m {lowest,median}]
[-m2 TOP_SCORE_METRIC2]
input_file output_file
Generate an aggregated report from a pVACbind .all_epitopes.tsv report file.
positional arguments:
input_file A pVACbind .all_epitopes.tsv report file
output_file The file path to write the aggregated report tsv to
optional arguments:
-h, --help show this help message and exit
-b BINDING_THRESHOLD, --binding-threshold BINDING_THRESHOLD
Tier epitopes in the "Pass" tier when the mutant
allele has ic50 binding scores below this value and in
the "Relaxed" tier when the mutant allele has ic50
binding scores below double this value. (default: 500)
--allele-specific-binding-thresholds
Use allele-specific binding thresholds. To print the
allele-specific binding thresholds run `pvacbind
allele_specific_cutoffs`. If an allele does not have a
special threshold value, the `--binding-threshold`
value will be used. (default: False)
--binding-percentile-threshold BINDING_PERCENTILE_THRESHOLD
Tier epitopes in the "Pass" tier when the mutant
allele has a binding percentile below this value.
(default: 2.0)
--immunogenicity-percentile-threshold IMMUNOGENICITY_PERCENTILE_THRESHOLD
Tier epitopes in the "Pass" tier when the mutant
allele has an immunogenicity percentile below this
value. (default: 2.0)
--presentation-percentile-threshold PRESENTATION_PERCENTILE_THRESHOLD
Tier epitopes in the "Pass" tier when the mutant
allele has a presentation percentile below this value.
(default: 2.0)
--percentile-threshold-strategy {conservative,exploratory}
Specify the candidate inclusion strategy. The
'conservative' option requires a candidate to pass the
binding threshold and all percentile thresholds
(default). The 'exploratory' option requires a
candidate to pass either the binding threshold or one of
the percentile thresholds. (default: conservative)
--aggregate-inclusion-binding-threshold AGGREGATE_INCLUSION_BINDING_THRESHOLD
Threshold for including epitopes when creating the
aggregate report (default: 5000)
--aggregate-inclusion-count-limit AGGREGATE_INCLUSION_COUNT_LIMIT
Limit neoantigen candidates included in the aggregate
report to only the best n candidates per variant.
(default: 15)
-m {lowest,median}, --top-score-metric {lowest,median}
The ic50 scoring metric to use when filtering epitopes
by binding-threshold or minimum fold change. lowest:
Use the best MT Score and Corresponding Fold Change
(i.e. the lowest MT ic50 binding score and
corresponding fold change of all chosen prediction
methods). median: Use the median MT Score and Median
Fold Change (i.e. the median MT ic50 binding score and
fold change of all chosen prediction methods).
(default: median)
-m2 TOP_SCORE_METRIC2, --top-score-metric2 TOP_SCORE_METRIC2
Which metrics to consider when selecting the best
peptide and when sorting candidates within a tier.
Each specified metric will be ranked and the sum of
these ranks will be used. Whether the lowest or median
is considered for each metric is controlled by the
--top-score-metric parameter. (default: ['ic50',
'combined_percentile'])
This tool produces an aggregated version of the all_epitopes TSV. It finds the best-scoring epitope (the one with the lowest IC50 binding score) for each variant, and outputs additional binding affinity information for that epitope. It also reports the total number of well-scoring epitopes for each variant, as well as the HLA alleles that those epitopes bind well to. For a full overview of the output, see the pVACbind output file documentation.
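The tiering options in the help text above can be pictured with a short Python sketch. This is only an illustration of the documented rules, not pVACbind's implementation: the function names `binding_tier` and `passes_thresholds` and the dictionary keys are hypothetical, and the way the "Relaxed" tier interacts with the percentile thresholds is not modeled.

```python
# Default percentile thresholds as listed in the help text (all 2.0).
DEFAULT_PERCENTILE_THRESHOLDS = {
    "binding": 2.0,
    "immunogenicity": 2.0,
    "presentation": 2.0,
}


def binding_tier(ic50, binding_threshold=500.0):
    """Tier on IC50 alone: below the threshold -> Pass, below double -> Relaxed."""
    if ic50 < binding_threshold:
        return "Pass"
    if ic50 < 2 * binding_threshold:
        return "Relaxed"
    return None  # neither tier on binding alone


def passes_thresholds(ic50, percentiles, strategy="conservative",
                      binding_threshold=500.0, percentile_thresholds=None):
    """Combine the binding check and the percentile checks.

    conservative: the binding threshold AND all percentile thresholds must pass.
    exploratory:  the binding threshold OR any one percentile threshold must pass.
    """
    thresholds = percentile_thresholds or DEFAULT_PERCENTILE_THRESHOLDS
    checks = [ic50 < binding_threshold]
    checks += [percentiles[name] < limit for name, limit in thresholds.items()]
    if strategy == "conservative":
        return all(checks)
    if strategy == "exploratory":
        return any(checks)
    raise ValueError(f"unknown strategy: {strategy}")
```

With the defaults, a candidate with an IC50 of 400 but an immunogenicity percentile of 3.0 fails under `conservative` and passes under `exploratory`.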
Calculate Reference Proteome Similarity
usage: pvacbind calculate_reference_proteome_similarity [-h]
[--match-length MATCH_LENGTH]
[--species SPECIES]
[--blastp-path BLASTP_PATH]
[--blastp-db {refseq_select_prot,refseq_protein}]
[--peptide-fasta PEPTIDE_FASTA]
[-t N_THREADS]
input_file input_fasta
output_file
Identify which epitopes in a pVACseq|pVACfuse|pVACbind report file have
matches in the reference proteome using either BLASTp or by checking directly
against a reference proteome FASTA.
positional arguments:
input_file Input filtered, all_epitopes, or aggregated report
file with predicted epitopes.
input_fasta For pVACbind, the original input FASTA file. For
pVACseq, pVACfuse, and pVACsplice a FASTA file with
mutant peptide sequences for each variant isoform. For
pVACseq and pVACfuse, this file can be found in the
same directory as the input
filtered.tsv/all_epitopes.tsv file. For pVACsplice,
this file can be found in the main output directory.
Can also be generated by running
`pvacseq|pvacfuse|pvacsplice generate_protein_fasta`.
output_file Output TSV filename of report file with epitopes with
reference matches marked.
optional arguments:
-h, --help show this help message and exit
--match-length MATCH_LENGTH
The minimum number of consecutive amino acids that
need to match. (default: 8)
--species SPECIES The species of the data in the input file. (default:
human)
--blastp-path BLASTP_PATH
Blastp installation path. (default: None)
--blastp-db {refseq_select_prot,refseq_protein}
The blastp database to use. (default:
refseq_select_prot)
--peptide-fasta PEPTIDE_FASTA
A reference peptide FASTA file to use for finding
reference matches instead of blastp. (default: None)
-t N_THREADS, --n-threads N_THREADS
Number of threads to use for parallelizing BLAST
calls. (default: 1)
This tool BLASTs peptides against the corresponding reference proteome and returns the results as an output TSV and reference_match file pair, given a pVACbind run’s FASTA and filtered/all_epitopes TSV. Typically, this is done as part of the pVACbind pipeline for the filtered output TSV, if specified. This tool provides a standalone way to run the same step on pVACbind’s filtered/all_epitopes TSV files. For instance, you may want this if pVACbind was originally run without this option and you wish to add the step afterwards for the filtered TSV, or if you want the results for the all_epitopes TSV, which does not have this step performed. For a closer look at the generated reference_match file, see the pVACbind output file documentation.
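To make the `--match-length` option concrete, the following Python sketch checks whether an epitope shares a run of consecutive amino acids with any reference sequence. It is a simplified stand-in for the direct FASTA check and is not how the tool itself is implemented; the function name `has_reference_match` and the in-memory list of reference sequences are assumptions made for illustration.

```python
def has_reference_match(epitope, reference_sequences, match_length=8):
    """Return True if any window of `match_length` consecutive residues
    from the epitope occurs in at least one reference sequence."""
    if len(epitope) < match_length:
        return False
    # All substrings of the epitope that are exactly match_length long.
    windows = {
        epitope[i:i + match_length]
        for i in range(len(epitope) - match_length + 1)
    }
    return any(window in seq for seq in reference_sequences for window in windows)
```

Lowering `match_length` makes the check more sensitive: a 9-mer that differs from the reference at one central residue has no 8-residue match but can still have a 6-residue match.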
NetChop Predict Cleavage Sites
usage: pvacbind net_chop [-h] [--method {cterm,20s}] [--threshold THRESHOLD]
input_file input_fasta output_file
Predict cleavage sites for neoepitopes.
positional arguments:
input_file Input filtered file with predicted epitopes.
input_fasta The required fasta file.
output_file Output tsv filename for putative neoepitopes.
optional arguments:
-h, --help show this help message and exit
--method {cterm,20s} NetChop prediction method to use ("cterm" for C term
3.0, "20s" for 20S 3.0). (default: cterm)
--threshold THRESHOLD
NetChop prediction threshold. (default: 0.5)
This tool uses NetChop to predict cleavage sites for neoepitopes from a pVACbind run’s filtered/all_epitopes TSV. It adds three columns to the output TSV: Best Cleavage Position, Best Cleavage Score, and a Cleavage Sites list. Typically, this step is run as part of the pVACbind pipeline for the filtered output TSV when specified. This tool provides a way to run it manually on pVACbind’s filtered/all_epitopes TSV files, so that you can add this information when it is not already present. For more about these columns, see the pVACbind output file documentation.
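As a rough picture of how the three added columns relate to per-position scores, consider the Python sketch below. It does not call NetChop and is not the tool's own code. The function name `summarize_cleavage`, the input list of scores, the use of a strict `>` comparison against the threshold, and the representation of the Cleavage Sites list as (position, score) pairs are all assumptions.

```python
def summarize_cleavage(scores, threshold=0.5):
    """Summarize hypothetical per-position cleavage scores for one epitope.

    `scores[i]` is the score for position i + 1 (positions are one-based).
    """
    best_index = max(range(len(scores)), key=lambda i: scores[i])
    # Positions whose score exceeds the threshold are treated as cleavage sites.
    sites = [(i + 1, s) for i, s in enumerate(scores) if s > threshold]
    return {
        "Best Cleavage Position": best_index + 1,
        "Best Cleavage Score": scores[best_index],
        "Cleavage Sites": sites,
    }
```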
NetMHCStab Predict Stability
usage: pvacbind netmhc_stab [-h] [-m {lowest,median}] [-m2 TOP_SCORE_METRIC2]
input_file output_file
Add stability predictions to predicted neoepitopes.
positional arguments:
input_file Input filtered file with predicted epitopes.
output_file Output TSV filename for putative neoepitopes.
optional arguments:
-h, --help show this help message and exit
-m {lowest,median}, --top-score-metric {lowest,median}
The ic50 scoring metric to use when sorting epitopes.
lowest: Use the best MT Score and Corresponding Fold
Change (i.e. the lowest MT ic50 binding score and
corresponding fold change of all chosen prediction
methods). median: Use the median MT Score and Median
Fold Change (i.e. the median MT ic50 binding score and
fold change of all chosen prediction methods).
(default: median)
-m2 TOP_SCORE_METRIC2, --top-score-metric2 TOP_SCORE_METRIC2
Which metrics to consider when sorting the results.
All listed metrics will be rank scored and the sum of
those rank scores will be used. Whether the lowest or
median is considered for each metric is controlled by
the --top-score-metric parameter. (default: ['ic50',
'combined_percentile'])
This tool uses NetMHCstabpan to add stability predictions for neoepitopes from a pVACbind run’s filtered/all_epitopes TSV. It adds four columns to the output TSV: Predicted Stability, Half Life, Stability Rank, and NetMHCStab Allele. Typically, this step is run as part of the pVACbind pipeline for the filtered output TSV when specified. This tool provides a way to run it manually on pVACbind’s filtered/all_epitopes TSV files, so that you can add this information when it is not already present. For more about these columns, see the pVACbind output file documentation.
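The `--top-score-metric2` help text describes sorting by a sum of per-metric ranks. A minimal Python sketch of that idea follows. It assumes that lower values are better for every metric and that ties keep their input order; neither detail is stated in the help text. The function name `rank_sum_order` and the dictionary-based candidates are hypothetical.

```python
def rank_sum_order(candidates, metrics=("ic50", "combined_percentile")):
    """Sort candidates by the sum of their ranks across the given metrics."""
    totals = [0] * len(candidates)
    for metric in metrics:
        # Rank 1 goes to the candidate with the lowest value for this metric.
        order = sorted(range(len(candidates)), key=lambda i: candidates[i][metric])
        for rank, idx in enumerate(order, start=1):
            totals[idx] += rank
    # sorted() is stable, so candidates with equal rank sums keep input order.
    paired = sorted(zip(totals, candidates), key=lambda pair: pair[0])
    return [candidate for _, candidate in paired]
```

A candidate that is best on both metrics gets a rank sum of 2 and sorts first, while a candidate that is strong on one metric and weak on the other lands in the middle.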
Identify Problematic Amino Acids
usage: pvacbind identify_problematic_amino_acids [-h]
[--filter-type {soft,hard}]
input_file output_file
problematic_amino_acids
Mark problematic amino acid positions in each epitope or filter entries that have problematic amino acids.
positional arguments:
input_file Input filtered, all_epitopes, or aggregated file with predicted epitopes.
output_file Output .tsv file with identification of problematic amino acids or hard-filtered to remove epitopes with problematic amino acids.
problematic_amino_acids
A list of amino acids to consider as problematic. Each entry can be specified in the following format:
`amino_acid(s)`: One or more one-letter amino acid codes. Any occurrence of this amino acid string,
regardless of the position in the epitope, is problematic. When specifying more than
one amino acid, they will need to occur together in the specified order.
`amino_acid:position`: A one letter amino acid code, followed by a colon separator, followed by a positive
integer position (one-based). The occurrence of this amino acid at the position
specified is problematic. E.g., G:2 would check for a Glycine at the second position
of the epitope. The N-terminus is defined as position 1.
`amino_acid:-position`: A one letter amino acid code, followed by a colon separator, followed by a negative
integer position. The occurrence of this amino acid at the specified position from
the end of the epitope is problematic. E.g., G:-3 would check for a Glycine at the
third position from the end of the epitope. The C-terminus is defined as position -1.
optional arguments:
-h, --help show this help message and exit
--filter-type {soft,hard}, -f {soft,hard}
Set the type of filtering done. Choosing `soft` will add a new column "Problematic Positions" (for filtered or all_epitopes input files) or "Prob Pos" (for aggregated input files) that lists positions in the epitope with problematic amino acids. Choosing `hard` will remove epitope entries with problematic amino acids.
This tool is used to identify positions in an epitope with an amino acid that is problematic for downstream processing, e.g. vaccine manufacturing. Since this can differ from case to case, this tool requires the user to specify which amino acid(s) to consider problematic. This can be specified in one of three formats:
`amino_acid(s)` | One or more one-letter amino acid codes. Any occurrence of this amino acid string, regardless of the position in the epitope, is problematic. When specifying more than one amino acid, they will need to occur together in the specified order.
`amino_acid:position` | A one-letter amino acid code, followed by a colon separator, followed by a positive integer position (one-based). The occurrence of this amino acid at the position specified is problematic. E.g., G:2 would check for a Glycine at the second position of the epitope. The N-terminus is defined as position 1.
`amino_acid:-position` | A one-letter amino acid code, followed by a colon separator, followed by a negative integer position. The occurrence of this amino acid at the specified position from the end of the epitope is problematic. E.g., G:-3 would check for a Glycine at the third position from the end of the epitope. The C-terminus is defined as position -1.
You may specify any number of these problematic amino acid(s), in any combination, by providing them as a comma-separated list.
This tool may be used with any filtered.tsv, all_epitopes.tsv, or aggregated pVACbind report file.
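The three specification formats can be illustrated with a small Python sketch that reports which one-based positions of an epitope would be flagged. This is not the tool's own parser: the function name `problematic_positions` is hypothetical, and the exact content of the "Problematic Positions" column may differ from the plain list of positions returned here.

```python
def problematic_positions(epitope, spec):
    """Return sorted one-based positions flagged by a comma-separated spec."""
    flagged = set()
    for entry in spec.split(","):
        entry = entry.strip()
        if not entry:
            continue
        if ":" in entry:
            # amino_acid:position or amino_acid:-position
            residue, pos_text = entry.split(":")
            pos = int(pos_text)
            index = pos - 1 if pos > 0 else len(epitope) + pos
            if 0 <= index < len(epitope) and epitope[index] == residue:
                flagged.add(index + 1)
        else:
            # amino_acid(s): flag every residue of every occurrence of the string
            start = epitope.find(entry)
            while start != -1:
                flagged.update(range(start + 1, start + len(entry) + 1))
                start = epitope.find(entry, start + 1)
    return sorted(flagged)
```

In these terms, a `soft` filter corresponds to reporting the returned positions, and a `hard` filter corresponds to dropping every epitope for which the returned list is non-empty.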
Update Tiers
usage: pvacseq update_tiers [-h] [-b BINDING_THRESHOLD]
[--allele-specific-binding-thresholds]
[--percentile-threshold PERCENTILE_THRESHOLD]
[--binding-percentile-threshold BINDING_PERCENTILE_THRESHOLD]
[--immunogenicity-percentile-threshold IMMUNOGENICITY_PERCENTILE_THRESHOLD]
[--presentation-percentile-threshold PRESENTATION_PERCENTILE_THRESHOLD]
[--percentile-threshold-strategy {conservative,exploratory}]
[-m2 TOP_SCORE_METRIC2] [--trna-vaf TRNA_VAF]
[--trna-cov TRNA_COV] [--expn-val EXPN_VAL]
[--transcript-prioritization-strategy TRANSCRIPT_PRIORITIZATION_STRATEGY]
[--maximum-transcript-support-level {1,2,3,4,5}]
[--allele-specific-anchors]
[--anchor-contribution-threshold ANCHOR_CONTRIBUTION_THRESHOLD]
input_file metrics_file vaf_clonal
Update tiers in an aggregated report in order to, for example, use different
thresholds or account for problematic position or reference match information
if run after initial pipeline run.
positional arguments:
input_file Input aggregated file with tiers to update. This file
will be overwritten with the output.
metrics_file metrics.json file corresponding to the input
aggregated file. This file will be overwritten to
update tiering parameters used by this command.
vaf_clonal The RNA VAF threshold to determine whether a candidate
is considered clonal. Any candidates with RNA VAF <
vaf_clonal/2 will be considered subclonal.
optional arguments:
-h, --help show this help message and exit
-b BINDING_THRESHOLD, --binding-threshold BINDING_THRESHOLD
IC50 binding threshold to consider when evaluating the
binding criteria. Candidates where the mutant allele
has ic50 binding scores below this value will be
considered good binders. (default: 500)
--allele-specific-binding-thresholds
Use allele-specific binding thresholds when evaluating
the binding criteria for tiering. To print the allele-
specific binding thresholds run `pvacseq
allele_specific_cutoffs`. If an allele does not have a
special threshold value, the `--binding-threshold`
value will be used. (default: False)
--percentile-threshold PERCENTILE_THRESHOLD
Account for the IC50 percentile rank when evaluating
the binding criteria for tiering. A candidate's
percentile rank must be below this value. (default:
None)
--binding-percentile-threshold BINDING_PERCENTILE_THRESHOLD
Tier epitopes in the "Pass" tier when the mutant
allele has a binding percentile below this value.
(default: 2.0)
--immunogenicity-percentile-threshold IMMUNOGENICITY_PERCENTILE_THRESHOLD
Tier epitopes in the "Pass" tier when the mutant
allele has an immunogenicity percentile below this
value. (default: 2.0)
--presentation-percentile-threshold PRESENTATION_PERCENTILE_THRESHOLD
Tier epitopes in the "Pass" tier when the mutant
allele has a presentation percentile below this value.
(default: 2.0)
--percentile-threshold-strategy {conservative,exploratory}
Specify the candidate inclusion strategy. The
'conservative' option requires a candidate to pass the
binding threshold and all percentile thresholds
(default). The 'exploratory' option requires a
candidate to pass EITHER the binding threshold or one
of the percentile thresholds. (default: conservative)
-m2 TOP_SCORE_METRIC2, --top-score-metric2 TOP_SCORE_METRIC2
Which metrics to consider when sorting candidates
within a tier. Each specified metric will be ranked
and the sum of these ranks will be used for
sorting. Whether the lowest or median is considered for
each metric is controlled by the --top-score-metric
parameter. (default: ['ic50', 'combined_percentile'])
--trna-vaf TRNA_VAF Tumor RNA VAF Cutoff in decimal format to consider
when evaluating the expression criteria. Only sites
above this cutoff will be considered. (default: 0.25)
--trna-cov TRNA_COV Tumor RNA Coverage Cutoff to consider when evaluating
the expression criteria. Only sites above this read
depth cutoff will be considered. (default: 10)
--expn-val EXPN_VAL Gene and Transcript Expression cutoff. Sites above
this cutoff will be considered. (default: 1.0)
--transcript-prioritization-strategy TRANSCRIPT_PRIORITIZATION_STRATEGY
Specify the criteria to consider when evaluating
transcripts of the neoantigen candidates. 'canonical'
will consider a candidate to come from a good
transcript if the transcript is an Ensembl canonical
transcript. 'mane_select' will consider a candidate to
come from a good transcript if the transcript is a
MANE select transcript. 'tsl' will consider a
candidate to come from a good transcript if the
transcript's support level (TSL) passes the --maximum-
transcript-support-level. When selecting more than one
criteria, a transcript meeting EITHER of the selected
criteria will be prioritized/selected. (default:
['canonical', 'mane_select', 'tsl'])
--maximum-transcript-support-level {1,2,3,4,5}
The threshold to use for filtering epitopes on the
Ensembl transcript support level (TSL). Keep all
epitopes with a transcript support level <= to this
cutoff. (default: 1)
--allele-specific-anchors
Use allele-specific anchor positions when evaluating
the anchor criteria for tiering epitopes in the
aggregate report. This option is available for 8, 9,
10, and 11mers and only for HLA-A, B, and C alleles.
If this option is not enabled or as a fallback for
unsupported lengths and alleles, the default positions
of 1, 2, epitope length - 1, and epitope length are
used. Please see
https://doi.org/10.1101/2020.12.08.416271 for more
details. (default: False)
--anchor-contribution-threshold ANCHOR_CONTRIBUTION_THRESHOLD
For determining allele-specific anchors, each position
is assigned a score based on how binding is influenced
by mutations. From these scores, the relative
contribution of each position to the overall binding
is calculated. Starting with the highest relative
contribution, positions whose scores together account
for the selected contribution threshold are assigned
as anchor locations. As a result, a higher threshold
leads to the inclusion of more positions to be
considered anchors. (default: 0.8)
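The `--anchor-contribution-threshold` description can be read as a cumulative selection over per-position scores. The Python sketch below follows that reading under stated assumptions: the function name `anchor_positions`, the example score dictionary, and the choice to stop as soon as the cumulative contribution reaches the threshold are illustrative and are not taken from the pVACtools source.

```python
def anchor_positions(position_scores, contribution_threshold=0.8):
    """Pick anchor positions from hypothetical per-position scores.

    Positions are taken in order of decreasing relative contribution until
    their combined contribution reaches the threshold.
    """
    total = sum(position_scores.values())
    ranked = sorted(position_scores.items(), key=lambda kv: kv[1], reverse=True)
    anchors = []
    cumulative = 0.0
    for position, score in ranked:
        if cumulative >= contribution_threshold:
            break
        anchors.append(position)
        cumulative += score / total  # relative contribution of this position
    return sorted(anchors)
```

For a 9-mer whose scores are concentrated at positions 2 and 9, a threshold of 0.6 selects only those two positions, and raising the threshold pulls in additional positions, which matches the statement that a higher threshold leads to more anchors.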