Filtering Commands

pVACfuse currently offers two filters: a binding filter and a top score filter.

The binding filter and top score filter are always run automatically as part of the pVACfuse pipeline.

All filters can also be run manually to narrow the final results down further or to redefine the filters entirely and produce a new candidate list from the all_epitopes.tsv file.

Note

The default values for filtering thresholds are suggestions only. While they are based on review of the literature and consultation with our clinical and immunology colleagues, your specific use case will determine the appropriate values.

Binding Filter

usage: pvacfuse binding_filter [-h] [-b BINDING_THRESHOLD]
                               [-m {lowest,median}] [--exclude-NAs] [-a]
                               input_file output_file

positional arguments:
  input_file            The final report .tsv file to filter.
  output_file           Output .tsv file containing list of filtered epitopes
                        based on binding affinity.

optional arguments:
  -h, --help            show this help message and exit
  -b BINDING_THRESHOLD, --binding-threshold BINDING_THRESHOLD
                        Report only epitopes where the mutant allele has ic50
                        binding scores below this value. (default: 500)
  -m {lowest,median}, --top-score-metric {lowest,median}
                        The ic50 scoring metric to use when filtering epitopes
                        by binding-threshold or minimum fold change. lowest:
                        Use the Best MT Score and corresponding Fold Change
                        (i.e. use the lowest MT ic50 binding score and
                        corresponding fold change of all chosen prediction
                        methods). median: Use the Median MT Score and Median
                        Fold Change (i.e. use the median MT ic50 binding score
                        and fold change of all chosen prediction methods).
                        (default: median)
  --exclude-NAs         Exclude NA values from the filtered output. (default:
                        False)
  -a, --allele-specific-binding-thresholds
                        Use allele-specific binding thresholds. To print the
                        allele-specific binding thresholds run `pvacfuse
                        allele_specific_cutoffs`. If an allele does not have a
                        special threshold value, the `--binding-threshold`
                        value will be used. (default: False)

The binding filter removes variants that don’t pass the chosen binding threshold. The user can choose whether to apply this filter to the lowest or the median binding affinity score by setting the --top-score-metric flag. The lowest binding affinity score is recorded in the Best MT Score column and represents the lowest ic50 score of all prediction algorithms that were picked during the previous pVACfuse run. The median binding affinity score is recorded in the Median MT Score column and corresponds to the median ic50 score of all prediction algorithms used to create the report. By default, the binding filter runs on the median binding affinity.

By default, entries with NA values will be included in the output. This behavior can be turned off by using the --exclude-NAs flag.
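The behavior described above can be illustrated with a minimal Python sketch. This is not pVACfuse's own implementation: the function name `binding_filter`, the `Epitope` column, and the example rows are made up for illustration, and the strict "below" comparison is an assumption taken from the wording of the help text. Only the `Best MT Score` and `Median MT Score` column names and the defaults (500, median, NAs included) come from this page.

```python
import csv
import io


def binding_filter(rows, threshold=500.0, metric="median", exclude_nas=False):
    """Keep rows whose chosen MT ic50 score is below `threshold` (illustrative only)."""
    # "median" uses the Median MT Score column, "lowest" uses Best MT Score.
    column = "Median MT Score" if metric == "median" else "Best MT Score"
    kept = []
    for row in rows:
        value = row[column]
        if value == "NA":
            # NA entries are kept unless exclusion is requested.
            if not exclude_nas:
                kept.append(row)
            continue
        # Assumption: "below this value" is treated as a strict comparison.
        if float(value) < threshold:
            kept.append(row)
    return kept


# Made-up report rows; the "Epitope" column is a hypothetical label.
sample = io.StringIO(
    "Epitope\tBest MT Score\tMedian MT Score\n"
    "EP1\t120.0\t450.0\n"
    "EP2\t300.0\t800.0\n"
    "EP3\tNA\tNA\n"
)
rows = list(csv.DictReader(sample, delimiter="\t"))

default_result = binding_filter(rows)
lowest_no_na = binding_filter(rows, metric="lowest", exclude_nas=True)
```

With the defaults, EP2 is dropped because its median score exceeds 500, while the NA row is retained. Switching to the lowest metric keeps EP2, and excluding NAs drops EP3.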

Top Score Filter

usage: pvacfuse [-h]
                {run,binding_filter,top_score_filter,generate_protein_fasta,valid_alleles,allele_specific_cutoffs,download_example_data}
                ...

positional arguments:
  {run,binding_filter,top_score_filter,generate_protein_fasta,valid_alleles,allele_specific_cutoffs,download_example_data}
    run                 Runs the pVACfuse pipeline
    binding_filter      Filters variants processed by IEDB by binding score
    top_score_filter    Pick the best neoepitope for each variant
    generate_protein_fasta
                        Generate an annotated fasta file from Integrate-Neo or
                        AGFusion output
    valid_alleles       Shows a list of valid allele names
    allele_specific_cutoffs
                        Show the allele specific cutoffs
    download_example_data
                        Downloads example input and output files

optional arguments:
  -h, --help            show this help message and exit
Error: No command specified

This filter picks the top epitope for a variant. Epitopes with the same Chromosome - Start - Stop - Reference - Variant are identified as coming from the same variant.

Different splice sites among the transcripts of a variant can lead to different peptides. To account for this, the filter also considers the different transcripts returned by Integrate-Neo/AGFusion and returns the top epitope for every transcript if the top epitopes are non-identical. If the top epitopes are identical across all transcripts of a variant, only the epitope for the transcript with the lowest Ensembl ID is returned.
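One possible reading of the transcript rule, sketched in Python for the rows of a single variant. This is an interpretation and not pVACfuse's actual code: the function name, the `Transcript` and `Epitope` column names, and the example rows are assumptions, and "lowest Ensembl ID" is approximated by plain string comparison.

```python
def top_epitopes_across_transcripts(rows, score_column="Median MT Score"):
    """Return the top epitope per transcript for one variant (illustrative only)."""
    top_by_transcript = {}
    for row in rows:
        transcript = row["Transcript"]
        current = top_by_transcript.get(transcript)
        # Lower ic50 is better; strict "<" keeps the first row on ties.
        if current is None or float(row[score_column]) < float(current[score_column]):
            top_by_transcript[transcript] = row
    tops = list(top_by_transcript.values())

    # If every transcript yields the same epitope, keep a single row:
    # the one whose transcript ID sorts lowest (string comparison, which
    # matches numeric order only for IDs of equal length).
    if len({row["Epitope"] for row in tops}) == 1:
        return [min(tops, key=lambda row: row["Transcript"])]
    return tops


# Made-up rows for illustration.
same = [
    {"Transcript": "ENST00000002", "Epitope": "PEPTIDEA", "Median MT Score": "100"},
    {"Transcript": "ENST00000001", "Epitope": "PEPTIDEA", "Median MT Score": "100"},
]
different = [
    {"Transcript": "ENST00000001", "Epitope": "PEPTIDEA", "Median MT Score": "100"},
    {"Transcript": "ENST00000001", "Epitope": "PEPTIDEC", "Median MT Score": "250"},
    {"Transcript": "ENST00000002", "Epitope": "PEPTIDEB", "Median MT Score": "150"},
]

same_result = top_epitopes_across_transcripts(same)
different_result = top_epitopes_across_transcripts(different)
```

In the `same` case a single row is returned; in the `different` case one row per transcript is returned.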

By default, the --top-score-metric option is set to median, which applies this filter to the Median MT Score column and picks the epitope with the lowest median mutant ic50 score for each variant. If the --top-score-metric option is set to lowest, the Best MT Score column is used instead.

If there are multiple top epitopes for a variant with the same ic50 score, the first one is chosen.
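The per-variant selection and tie-breaking can be sketched as follows. This is a simplified illustration, not the tool's implementation: it ignores transcript handling, assumes all scores are numeric, and uses a hypothetical `make_row` helper, a hypothetical `Epitope` column, and placeholder variant fields.

```python
VARIANT_KEY = ("Chromosome", "Start", "Stop", "Reference", "Variant")


def top_score_per_variant(rows, metric="median"):
    """Pick the row with the lowest chosen ic50 score per variant (illustrative only)."""
    column = "Median MT Score" if metric == "median" else "Best MT Score"
    best = {}  # dicts preserve insertion order, so variants keep input order
    for row in rows:
        key = tuple(row[name] for name in VARIANT_KEY)
        # Strict "<" means the first row encountered wins when scores tie.
        if key not in best or float(row[column]) < float(best[key][column]):
            best[key] = row
    return list(best.values())


def make_row(variant_id, epitope, best_score, median_score):
    # Hypothetical helper: fills the variant fields with placeholder values.
    return {
        "Chromosome": variant_id, "Start": "1", "Stop": "2",
        "Reference": "ref", "Variant": "alt", "Epitope": epitope,
        "Best MT Score": best_score, "Median MT Score": median_score,
    }


rows = [
    make_row("v1", "A", "50", "300"),
    make_row("v1", "B", "20", "400"),
    make_row("v1", "C", "90", "300"),  # ties with A on median; A came first
    make_row("v2", "D", "10", "100"),
]

by_median = top_score_per_variant(rows)
by_lowest = top_score_per_variant(rows, metric="lowest")
```

Under the median metric, epitope A is chosen for the first variant because it appears before the tied epitope C. Under the lowest metric, epitope B is chosen because it has the smallest Best MT Score.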