Output Files

The pVACfuse pipeline will write its results in separate folders depending on which prediction algorithms were chosen:

  • MHC_Class_I: for MHC class I prediction algorithms

  • MHC_Class_II: for MHC class II prediction algorithms

  • combined: if both MHC class I and MHC class II prediction algorithms were run, this folder combines the neoepitope predictions from both

Each folder will contain the same list of output files (listed in the order created):

File Name

Description

<sample_name>.tsv

An intermediate file with variant and transcript information parsed from the input file(s).

<sample_name>.tsv_<chunks> (multiple)

The above file, split into smaller chunks for easier processing with IEDB.

<sample_name>.all_epitopes.tsv

A list of all predicted epitopes and their binding affinity scores, with additional variant information from the <sample_name>.tsv.

<sample_name>.filtered.tsv

The above file after applying all filters, with cleavage site and stability predictions added.

<sample_name>.filtered.condensed.ranked.tsv

A condensed version of the filtered TSV that keeps only the most important columns and adds a priority score for each neoepitope candidate.
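The folder and file naming described above can be sketched as a small path builder. This is only an illustration of the layout, not part of pVACfuse: the function names `result_folders` and `expected_outputs` and the `out`/`sample1` values are hypothetical, and the chunk files are left out because their suffix varies.

```python
from pathlib import Path

# File suffixes from the table above, in the order the files are created.
# The <sample_name>.tsv_<chunks> files are omitted: their suffix varies.
REPORT_SUFFIXES = [
    ".tsv",
    ".all_epitopes.tsv",
    ".filtered.tsv",
    ".filtered.condensed.ranked.tsv",
]


def result_folders(class_i, class_ii):
    """Return the result folder names for the chosen algorithm classes."""
    folders = []
    if class_i:
        folders.append("MHC_Class_I")
    if class_ii:
        folders.append("MHC_Class_II")
    if class_i and class_ii:
        folders.append("combined")
    return folders


def expected_outputs(output_dir, sample_name, class_i=True, class_ii=False):
    """List the report paths one would expect under output_dir (hypothetical helper)."""
    base = Path(output_dir)
    return [
        base / folder / f"{sample_name}{suffix}"
        for folder in result_folders(class_i, class_ii)
        for suffix in REPORT_SUFFIXES
    ]


if __name__ == "__main__":
    for path in expected_outputs("out", "sample1", class_i=True, class_ii=True):
        print(path)
```

A helper like this can be handy for checking that a run finished, for example by testing `path.exists()` on each returned path.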

all_epitopes.tsv and filtered.tsv Report Columns

To keep the outputs consistent, pVACfuse uses the same output columns as pVACseq. Columns that do not apply to pVACfuse are filled with NA.
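When reading these reports in downstream code, it can help to turn the NA placeholders into real missing values and to split the paired fusion fields. The sketch below uses only the Python standard library. It assumes the reports are tab-separated, that NA is written literally, and that the Chromosome, Start, and Stop values are joined by " / ". The helper names and the example row are made up for illustration.

```python
import csv
import io

FUSION_SEPARATOR = " / "  # assumed separator between 5p and 3p values
PAIRED_COLUMNS = ("Chromosome", "Start", "Stop")


def clean_row(row):
    """Map NA to None and split paired fusion columns into tuples."""
    cleaned = {}
    for column, value in row.items():
        if value == "NA":
            cleaned[column] = None
        elif column in PAIRED_COLUMNS:
            cleaned[column] = tuple(value.split(FUSION_SEPARATOR))
        else:
            cleaned[column] = value
    return cleaned


def read_report(handle):
    """Read a tab-separated report from an open text handle."""
    return [clean_row(row) for row in csv.DictReader(handle, delimiter="\t")]


# Made-up example row with a subset of the report columns.
header = ["Chromosome", "Start", "Stop", "Variant Type", "Mutation", "MT Epitope Seq"]
values = ["chr1 / chr2", "100 / 200", "150 / 260", "inframe_fusion", "NA", "KLMNPQRST"]
example = "\t".join(header) + "\n" + "\t".join(values) + "\n"

rows = read_report(io.StringIO(example))
print(rows[0]["Chromosome"])  # 5p and 3p chromosomes as a tuple
print(rows[0]["Mutation"])    # None, because the column does not apply
```

For a real file, pass an open file object instead of the in-memory example, for instance `open(path, newline="")`.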

Column Name

Description

Chromosome

The chromosomes of the 5p and 3p portions of the fusion, separated by " / "

Start

The start positions of the 5p and 3p portions of the fusion, separated by " / "

Stop

The stop positions of the 5p and 3p portions of the fusion, separated by " / "

Reference

fusion

Variant

fusion

Transcript

The Ensembl IDs of the affected transcripts

Transcript Support Level

NA

Ensembl Gene ID

NA

Variant Type

The type of fusion: inframe_fusion for in-frame fusions, frameshift_fusion for frameshift fusions

Mutation

NA

Protein Position

The position of the fusion in the fusion protein sequence

Gene Name

The Ensembl gene names of the affected genes

HGVSc

NA

HGVSp

NA

HLA Allele

The HLA allele for this prediction

Peptide Length

The peptide length of the epitope

Sub-peptide Position

The one-based position of the epitope in the protein sequence used to make the prediction

Mutation Position

NA

MT Epitope Seq

Mutant epitope sequence

WT Epitope Seq

NA

Best MT Score Method

Prediction algorithm with the lowest mutant IC50 binding affinity for this epitope

Best MT Score

Lowest IC50 binding affinity of all prediction algorithms used

Corresponding WT Score

NA

Corresponding Fold Change

NA

Tumor DNA Depth

NA

Tumor DNA VAF

NA

Tumor RNA Depth

NA

Tumor RNA VAF

NA

Normal Depth

NA

Normal VAF

NA

Gene Expression

NA

Transcript Expression

NA

Median MT Score

Median IC50 binding affinity of the mutant epitope across all prediction algorithms used

Median WT Score

NA

Median Fold Change

NA

Individual Prediction Algorithm WT and MT Scores (multiple)

IC50 scores for the MT Epitope Seq and WT Epitope Seq for the individual prediction algorithms used

cterm_7mer_gravy_score

Mean hydropathy of the last 7 residues at the C-terminus of the peptide

max_7mer_gravy_score

Max GRAVY score of any kmer in the amino acid sequence. Used to determine if there are any extremely hydrophobic regions within a longer amino acid sequence.

difficult_n_terminal_residue (T/F)

Is the N-terminal amino acid a Glutamine, Glutamic acid, or Cysteine?

c_terminal_cysteine (T/F)

Is the C-terminal amino acid a Cysteine?

c_terminal_proline (T/F)

Is the C-terminal amino acid a Proline?

cysteine_count

Number of Cysteines in the amino acid sequence. Problematic because they can form disulfide bonds across distant parts of the peptide

n_terminal_asparagine (T/F)

Is the N-terminal amino acid an Asparagine?

asparagine_proline_bond_count

Number of Asparagine-Proline bonds. Problematic because they can spontaneously cleave the peptide

Best Cleavage Position (optional)

Position of the highest predicted cleavage score

Best Cleavage Score (optional)

Highest predicted cleavage score

Cleavage Sites (optional)

List of all cleavage positions and their cleavage score

Predicted Stability (optional)

Stability of the pMHC-I complex

Half Life (optional)

Half-life of the pMHC-I complex

Stability Rank (optional)

The % rank stability of the pMHC-I complex

NetMHCstab allele (optional)

Nearest neighbor to the HLA Allele. Used for NetMHCstab prediction
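The peptide-level columns in the table above (from cterm_7mer_gravy_score through asparagine_proline_bond_count) are simple sequence properties. The following is a minimal sketch of how they could be computed, not pVACfuse's own code. It assumes the Kyte-Doolittle hydropathy scale for GRAVY, a window of 7 residues (taken from the column names), and that an Asparagine-Proline bond corresponds to an "NP" dipeptide. The function names are hypothetical.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def mean_gravy(sequence):
    """Mean hydropathy (GRAVY) of a sequence of standard amino acids."""
    return sum(KYTE_DOOLITTLE[aa] for aa in sequence) / len(sequence)


def peptide_metrics(peptide, k=7):
    """Compute the sequence-derived columns for one peptide (sketch)."""
    # All windows of length k; fall back to the whole peptide if it is shorter.
    windows = [peptide[i:i + k] for i in range(len(peptide) - k + 1)] or [peptide]
    return {
        "cterm_7mer_gravy_score": mean_gravy(peptide[-k:]),
        "max_7mer_gravy_score": max(mean_gravy(w) for w in windows),
        "difficult_n_terminal_residue": peptide[0] in "QEC",
        "c_terminal_cysteine": peptide[-1] == "C",
        "c_terminal_proline": peptide[-1] == "P",
        "cysteine_count": peptide.count("C"),
        "n_terminal_asparagine": peptide[0] == "N",
        # Assumption: an Asparagine-Proline bond is an "NP" dipeptide.
        "asparagine_proline_bond_count": peptide.count("NP"),
    }


if __name__ == "__main__":
    # Made-up peptide chosen to trigger several flags.
    for name, value in peptide_metrics("NPIIIIIIIC").items():
        print(f"{name}\t{value}")
```

The sketch raises a `KeyError` for non-standard residue codes such as X or U. Whether and how the real pipeline handles those is not covered here.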

filtered.condensed.ranked.tsv Report Columns

Column Name

Description

Gene Name

The Ensembl gene names of the affected genes

Mutation

NA

Protein Position

The position of the fusion in the fusion protein sequence

HGVSc

NA

HGVSp

NA

HLA Allele

The HLA allele for this prediction.

Mutation Position

NA

MT Epitope Seq

Mutant epitope sequence.

Median MT Score

Median IC50 binding affinity of the mutant epitope across all prediction algorithms used

Median WT Score

NA

Median Fold Change

NA

Best MT Score

Lowest IC50 binding affinity of all prediction algorithms used

Corresponding WT Score

NA

Corresponding Fold Change

NA

Tumor DNA Depth

NA

Tumor DNA VAF

NA

Tumor RNA Depth

NA

Tumor RNA VAF

NA

Gene Expression

NA

Rank

A priority rank for the neoepitope (best = 1).

The pVACfuse Neoepitope Priority Rank

The underlying formula for calculating the pVACfuse rank is the same as for the pVACseq Neoepitope Priority Rank. However, since only the binding affinity is available for fusion predictions, pVACfuse simply ranks the neoepitopes by their binding affinity, with the lowest value ranked best. If --top-score-metric is set to median (default), the Median MT Score is used. If it is set to lowest, the Best MT Score is used.
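The ranking rule described above can be sketched in a few lines. This is an illustration under stated assumptions rather than the pipeline's implementation: the function name `rank_candidates` is hypothetical, the candidate rows are made up, and ties are simply left in input order, which may differ from how pVACfuse handles them.

```python
# Score column used for each value of --top-score-metric, as described above.
SCORE_COLUMN = {
    "median": "Median MT Score",
    "lowest": "Best MT Score",
}


def rank_candidates(rows, top_score_metric="median"):
    """Return copies of rows with a Rank column added (best = 1).

    Lower binding affinity is better, so rows are sorted in ascending order
    of the selected score. sorted() is stable, so ties keep their input order.
    """
    column = SCORE_COLUMN[top_score_metric]
    ordered = sorted(rows, key=lambda row: float(row[column]))
    return [dict(row, Rank=rank) for rank, row in enumerate(ordered, start=1)]


# Made-up candidates with only the columns needed for ranking.
candidates = [
    {"MT Epitope Seq": "KLMNPQRST", "Median MT Score": 250.0, "Best MT Score": 40.0},
    {"MT Epitope Seq": "AVLIMFWYV", "Median MT Score": 120.0, "Best MT Score": 90.0},
    {"MT Epitope Seq": "GSTDEHKRQ", "Median MT Score": 500.0, "Best MT Score": 10.0},
]

by_median = rank_candidates(candidates, top_score_metric="median")
by_lowest = rank_candidates(candidates, top_score_metric="lowest")

if __name__ == "__main__":
    for row in by_median:
        print(row["Rank"], row["MT Epitope Seq"], row["Median MT Score"])
```

Note how the choice of metric changes the order: the third candidate is ranked last by median score but first by lowest score.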