Filtering Commands
pVACbind currently offers two filters: a binding filter and a top score filter.
These filters are always run automatically as part of the pVACbind pipeline using default cutoffs.
All filters can also be run manually on the filtered.tsv file to narrow the results down further, or they can be run on the all_epitopes.tsv file to apply different filtering thresholds.
The binding filter is used to remove neoantigen candidates that do not meet desired peptide:MHC binding criteria. The top score filter is used to select the most promising peptide candidate for each variant. Multiple candidate peptides from a single somatic variant can be caused by multiple peptide lengths, registers, HLA alleles, and transcript annotations.
Further details on each of these filters are provided below.
Note
The default values for filtering thresholds are suggestions only. While they are based on review of the literature and consultation with our clinical and immunology colleagues, your specific use case will determine the appropriate values.
Binding Filter
usage: pvacbind binding_filter [-h] [-b BINDING_THRESHOLD]
                               [-m {lowest,median}] [--exclude-NAs] [-a]
                               input_file output_file

positional arguments:
  input_file            The final report .tsv file to filter.
  output_file           Output .tsv file containing list of filtered epitopes
                        based on binding affinity.

optional arguments:
  -h, --help            show this help message and exit
  -b BINDING_THRESHOLD, --binding-threshold BINDING_THRESHOLD
                        Report only epitopes where the mutant allele has ic50
                        binding scores below this value. (default: 500)
  -m {lowest,median}, --top-score-metric {lowest,median}
                        The ic50 scoring metric to use when filtering epitopes
                        by binding-threshold or minimum fold change. lowest:
                        Use the Best MT Score and corresponding Fold Change
                        (i.e. use the lowest MT ic50 binding score and
                        corresponding fold change of all chosen prediction
                        methods). median: Use the Median MT Score and Median
                        Fold Change (i.e. use the median MT ic50 binding score
                        and fold change of all chosen prediction methods).
                        (default: median)
  --exclude-NAs         Exclude NA values from the filtered output. (default:
                        False)
  -a, --allele-specific-binding-thresholds
                        Use allele-specific binding thresholds. To print the
                        allele-specific binding thresholds run `pvacbind
                        allele_specific_cutoffs`. If an allele does not have a
                        special threshold value, the `--binding-threshold`
                        value will be used. (default: False)
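The fallback rule described for --allele-specific-binding-thresholds can be illustrated with a minimal Python sketch. This is not pVACbind's implementation: the function name effective_threshold and the cutoff value in example_cutoffs are hypothetical and chosen only for illustration. The real allele-specific values are the ones printed by `pvacbind allele_specific_cutoffs`.

```python
def effective_threshold(allele, allele_cutoffs, binding_threshold=500.0):
    """Return the allele's own cutoff if it has one, else the global threshold."""
    return allele_cutoffs.get(allele, binding_threshold)


# Hypothetical cutoff table: the value below is made up for this example.
example_cutoffs = {"HLA-A*02:01": 100.0}

# An allele with a special cutoff uses that cutoff.
with_special = effective_threshold("HLA-A*02:01", example_cutoffs)

# An allele without one falls back to the --binding-threshold value.
without_special = effective_threshold("HLA-B*07:02", example_cutoffs)
```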
The binding filter removes variants that don’t pass the chosen binding threshold. The user can choose whether to apply this filter to the lowest or the median binding affinity score by setting the --top-score-metric flag. The lowest binding affinity score is recorded in the Best MT Score column and represents the lowest IC50 score of all prediction algorithms that were picked during the previous pVACbind run. The median binding affinity score is recorded in the Median MT Score column and corresponds to the median IC50 score of all prediction algorithms used to create the report. By default, the binding filter runs on the median binding affinity.
By default, entries with NA values will be included in the output. This behavior can be turned off by using the --exclude-NAs flag.
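To make the interaction between the threshold, the score metric, and NA handling concrete, here is a minimal Python sketch of the logic described above. It is not pVACbind's own code. The Best MT Score and Median MT Score column names come from this page, while the function name, the Epitope Seq column, the strict "below" comparison, and the example rows are assumptions made for illustration.

```python
import csv
import io


def binding_filter(rows, threshold=500.0, metric="median", exclude_nas=False):
    """Keep rows whose chosen MT IC50 score is below the threshold."""
    column = "Median MT Score" if metric == "median" else "Best MT Score"
    kept = []
    for row in rows:
        value = row[column]
        if value == "NA":
            # NA entries pass through unless the caller asks to drop them.
            if not exclude_nas:
                kept.append(row)
            continue
        # Assumption: "below this value" is treated as a strict comparison.
        if float(value) < threshold:
            kept.append(row)
    return kept


# Made-up report contents, used only to exercise the sketch.
report = io.StringIO(
    "Epitope Seq\tBest MT Score\tMedian MT Score\n"
    "PEPTIDEA\t120.0\t300.0\n"
    "PEPTIDEB\t450.0\t900.0\n"
    "PEPTIDEC\tNA\tNA\n"
)
rows = list(csv.DictReader(report, delimiter="\t"))

median_kept = [r["Epitope Seq"] for r in binding_filter(rows)]
lowest_kept = [r["Epitope Seq"] for r in binding_filter(rows, metric="lowest")]
no_na_kept = [r["Epitope Seq"] for r in binding_filter(rows, exclude_nas=True)]
```

In this example PEPTIDEB fails the default median filter (900 is above 500) but passes when the lowest score is used (450 is below 500), which shows why the choice of metric matters.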
Top Score Filter
usage: pvacbind top_score_filter [-h] [-m {lowest,median}]
                                 input_file output_file

positional arguments:
  input_file            The final report .tsv file to filter.
  output_file           Output .tsv file containing only the list of the top
                        epitope per variant.

optional arguments:
  -h, --help            show this help message and exit
  -m {lowest,median}, --top-score-metric {lowest,median}
                        The ic50 scoring metric to use for filtering. lowest:
                        Use the best MT Score (i.e. the lowest MT ic50 binding
                        score of all chosen prediction methods). median: Use
                        the median MT Score (i.e. the median MT ic50 binding
                        score of all chosen prediction methods). (default:
                        median)
This filter picks the top epitope for a variant. By default the --top-score-metric option is set to median, which will apply this filter to the Median MT Score column and pick the epitope with the lowest median mutant IC50 score for each variant. If the --top-score-metric option is set to lowest, the Best MT Score column is instead used to make this determination.
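The selection described above amounts to taking, for each variant, the row with the minimum value in the chosen score column. The Python sketch below illustrates that idea under stated assumptions and is not pVACbind's implementation: the Variant grouping column, the Epitope Seq column, the function name, and the example rows are hypothetical, and all scores are assumed to be numeric (no NA values).

```python
def top_score_filter(rows, metric="median", key="Variant"):
    """Return one row per group: the one with the lowest chosen MT score."""
    column = "Median MT Score" if metric == "median" else "Best MT Score"
    best = {}
    for row in rows:
        group = row[key]
        # Replace the stored row only when this one has a strictly lower score.
        if group not in best or float(row[column]) < float(best[group][column]):
            best[group] = row
    return list(best.values())


# Made-up rows: two candidate epitopes for variant v1, one for v2.
rows = [
    {"Variant": "v1", "Epitope Seq": "PEPTIDEA",
     "Best MT Score": "50.0", "Median MT Score": "400.0"},
    {"Variant": "v1", "Epitope Seq": "PEPTIDEB",
     "Best MT Score": "90.0", "Median MT Score": "200.0"},
    {"Variant": "v2", "Epitope Seq": "PEPTIDEC",
     "Best MT Score": "30.0", "Median MT Score": "35.0"},
]

by_median = [r["Epitope Seq"] for r in top_score_filter(rows)]
by_lowest = [r["Epitope Seq"] for r in top_score_filter(rows, metric="lowest")]
```

For variant v1 the two metrics disagree: PEPTIDEB has the lower median score, while PEPTIDEA has the lower best score, so the selected epitope depends on the --top-score-metric setting.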