Output Files

The pVACfuse pipeline will write its results in separate folders depending on which prediction algorithms were chosen:

MHC_Class_I: for MHC class I prediction algorithms
MHC_Class_II: for MHC class II prediction algorithms
combined: If both MHC class I and MHC class II prediction algorithms were run, this folder combines the neoepitope predictions from both

Each folder will contain the same list of output files (listed in the order created):

File Name	Description
`<sample_name>.tsv`	An intermediate file with variant and transcript information parsed from the input file(s).
`<sample_name>.tsv_<chunks>` (multiple)	The above file but split into smaller chunks for easier processing with IEDB.
`<sample_name>.all_epitopes.tsv`	A list of all predicted epitopes and their binding affinity scores, with additional variant information from the `<sample_name>.tsv`.
`<sample_name>.filtered.tsv`	The above file after applying all filters, with cleavage site and stability predictions added.
`<sample_name>.filtered.condensed.ranked.tsv`	A condensed version of the filtered TSV with only the most important columns remaining, with a priority score for each neoepitope candidate added.
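
As a quick sanity check after a run, the file names in the table above can be assembled for a given sample and compared against what is on disk. The following is a minimal sketch, not part of pVACfuse: the helper names `expected_outputs` and `missing_outputs` are hypothetical, and it assumes the result folders sit directly under the output directory passed to the pipeline. The chunk files are left out because their exact names depend on how the input was split.

```python
from pathlib import Path

# Suffixes copied from the output file table; chunk files are omitted
# because their names depend on how the input was split.
OUTPUT_SUFFIXES = [
    ".tsv",
    ".all_epitopes.tsv",
    ".filtered.tsv",
    ".filtered.condensed.ranked.tsv",
]


def expected_outputs(output_dir, sample_name, subfolder="MHC_Class_I"):
    """Return the paths where each report is expected (assumed layout)."""
    base = Path(output_dir) / subfolder
    return [base / f"{sample_name}{suffix}" for suffix in OUTPUT_SUFFIXES]


def missing_outputs(output_dir, sample_name, subfolder="MHC_Class_I"):
    """Return the expected report paths that do not exist on disk."""
    paths = expected_outputs(output_dir, sample_name, subfolder)
    return [path for path in paths if not path.exists()]
```

Calling `missing_outputs("results", "my_sample", "combined")` would then list any reports that were not written for the combined folder.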

all_epitopes.tsv and filtered.tsv Report Columns

To keep the outputs consistent, pVACfuse uses the same output columns as pVACseq, but some values will be NA when a column doesn’t apply to pVACfuse.

Column Name	Description
`Chromosome`	The chromosome of the 5p and 3p portion of the fusion, separated by " / "
`Start`	The start position of the 5p and 3p portion of the fusion, separated by " / "
`Stop`	The stop position of the 5p and 3p portion of the fusion, separated by " / "
`Reference`	`fusion`
`Variant`	`fusion`
`Transcript`	The Ensembl IDs of the affected transcripts
`Transcript Support Level`	`NA`
`Ensembl Gene ID`	`NA`
`Variant Type`	The type of fusion. `inframe_fusion` for inframe fusions, `frameshift_fusion` for frameshift fusions
`Mutation`	`NA`
`Protein Position`	The position of the fusion in the fusion protein sequence
`Gene Name`	The Ensembl gene names of the affected genes
`HGVSc`	`NA`
`HGVSp`	`NA`
`HLA Allele`	The HLA allele for this prediction
`Peptide Length`	The peptide length of the epitope
`Sub-peptide Position`	The one-based position of the epitope in the protein sequence used to make the prediction
`Mutation Position`	`NA`
`MT Epitope Seq`	Mutant epitope sequence
`WT Epitope Seq`	`NA`
`Best MT Score Method`	Prediction algorithm with the lowest mutant ic50 binding affinity for this epitope
`Best MT Score`	Lowest ic50 binding affinity of all prediction algorithms used
`Corresponding WT Score`	`NA`
`Corresponding Fold Change`	`NA`
`Tumor DNA Depth`	`NA`
`Tumor DNA VAF`	`NA`
`Tumor RNA Depth`	`NA`
`Tumor RNA VAF`	`NA`
`Normal Depth`	`NA`
`Normal VAF`	`NA`
`Gene Expression`	`NA`
`Transcript Expression`	`NA`
`Median MT Score`	Median ic50 binding affinity of the mutant epitope of all prediction algorithms used
`Median WT Score`	`NA`
`Median Fold Change`	`NA`
`Individual Prediction Algorithm WT and MT Scores` (multiple)	ic50 scores for the `MT Epitope Seq` and `WT Epitope Seq` for the individual prediction algorithms used
`cterm_7mer_gravy_score`	Mean hydropathy of last 7 residues on the C-terminus of the peptide
`max_7mer_gravy_score`	Max GRAVY score of any kmer in the amino acid sequence. Used to determine if there are any extremely hydrophobic regions within a longer amino acid sequence.
`difficult_n_terminal_residue` (T/F)	Is the N-terminal amino acid a Glutamine, Glutamic acid, or Cysteine?
`c_terminal_cysteine` (T/F)	Is the C-terminal amino acid a Cysteine?
`c_terminal_proline` (T/F)	Is the C-terminal amino acid a Proline?
`cysteine_count`	Number of Cysteines in the amino acid sequence. Problematic because they can form disulfide bonds across distant parts of the peptide
`n_terminal_asparagine` (T/F)	Is the N-terminal amino acid an Asparagine?
`asparagine_proline_bond_count`	Number of Asparagine-Proline bonds. Problematic because they can spontaneously cleave the peptide
`Best Cleavage Position` (optional)	Position of the highest predicted cleavage score
`Best Cleavage Score` (optional)	Highest predicted cleavage score
`Cleavage Sites` (optional)	List of all cleavage positions and their cleavage score
`Predicted Stability` (optional)	Stability of the pMHC-I complex
`Half Life` (optional)	Half-life of the pMHC-I complex
`Stability Rank` (optional)	The % rank stability of the pMHC-I complex
`NetMHCstab allele` (optional)	Nearest neighbor to the `HLA Allele`. Used for NetMHCstab prediction
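
Because `Chromosome`, `Start`, and `Stop` each hold a 5p and a 3p value joined by " / ", downstream scripts usually need to split them. The sketch below shows one way to do that with the Python standard library. The helper names `read_report` and `split_fusion_field` are hypothetical, and the two-column excerpt is made up purely for illustration; a real report contains all of the columns listed above.

```python
import csv
import io


def read_report(handle):
    """Return the rows of a tab-separated report as a list of dicts."""
    return list(csv.DictReader(handle, delimiter="\t"))


def split_fusion_field(value):
    """Split a "5p / 3p" value into its (5p, 3p) parts."""
    five_prime, separator, three_prime = value.partition(" / ")
    if not separator:
        raise ValueError(f"expected ' / ' in fusion field, got {value!r}")
    return five_prime, three_prime


# Made-up two-column excerpt, for illustration only.
EXAMPLE = "Chromosome\tStart\nchr1 / chr2\t100 / 200\n"

rows = read_report(io.StringIO(EXAMPLE))
chrom_5p, chrom_3p = split_fusion_field(rows[0]["Chromosome"])
start_5p, start_3p = (int(x) for x in split_fusion_field(rows[0]["Start"]))
```

For a real file, `read_report(open(path, newline=""))` would replace the `io.StringIO` wrapper.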

filtered.condensed.ranked.tsv Report Columns

Column Name	Description
`Gene Name`	The Ensembl gene names of the affected genes
`Mutation`	`NA`
`Protein Position`	The position of the fusion in the fusion protein sequence
`HGVSc`	`NA`
`HGVSp`	`NA`
`HLA Allele`	The HLA allele for this prediction.
`Mutation Position`	`NA`
`MT Epitope Seq`	Mutant epitope sequence.
`Median MT Score`	Median ic50 binding affinity of the mutant epitope across all prediction algorithms used
`Median WT Score`	`NA`
`Median Fold Change`	`NA`
`Best MT Score`	Lowest ic50 binding affinity of all prediction algorithms used
`Corresponding WT Score`	`NA`
`Corresponding Fold Change`	`NA`
`Tumor DNA Depth`	`NA`
`Tumor DNA VAF`	`NA`
`Tumor RNA Depth`	`NA`
`Tumor RNA VAF`	`NA`
`Gene Expression`	`NA`
`Rank`	A priority rank for the neoepitope (best = 1).
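
Since many of these columns are always `NA` for fusions, it can be convenient to hide them when inspecting a report. The following is a minimal sketch under the assumption that the rows have already been loaded as dictionaries keyed by column name (for example with `csv.DictReader`); the function name `drop_all_na_columns` is hypothetical and not part of pVACfuse.

```python
def drop_all_na_columns(rows):
    """Remove columns whose value is "NA" in every row."""
    rows = list(rows)
    if not rows:
        return []
    # Keep a column if at least one row has a value other than "NA".
    keep = [col for col in rows[0] if any(row[col] != "NA" for row in rows)]
    return [{col: row[col] for col in keep} for row in rows]
```

Column order is preserved, so the trimmed rows can be written back out in the same layout as the original report.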

The pVACfuse Neoepitope Priority Rank

The underlying formula for calculating the pVACfuse rank is the same as it is for The pVACseq Neoepitope Priority Rank. However, since only the binding affinity is available for fusion predictions, pVACfuse simply ranks the neoepitopes according to their binding affinity, with the lowest value ranked best. If `--top-score-metric` is set to `median` (the default), the `Median MT Score` is used. If it is set to `lowest`, the `Best MT Score` is used.
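
The ranking described above can be sketched in a few lines of Python. This is an illustration of the idea rather than pVACfuse's actual implementation: the function name `rank_epitopes` is hypothetical, rows are assumed to be dictionaries keyed by column name, and ties are assumed to keep their input order, which the description above does not specify.

```python
# Maps each --top-score-metric value to the column it ranks by.
SCORE_COLUMNS = {"median": "Median MT Score", "lowest": "Best MT Score"}


def rank_epitopes(rows, top_score_metric="median"):
    """Return copies of the rows sorted by binding affinity, with a Rank added."""
    column = SCORE_COLUMNS[top_score_metric]
    # Lower ic50 means stronger predicted binding, so sort ascending.
    ordered = sorted(rows, key=lambda row: float(row[column]))
    # Rank 1 is the best candidate; ties keep input order (an assumption).
    return [dict(row, Rank=rank) for rank, row in enumerate(ordered, start=1)]
```

With two candidates whose median and best scores disagree, switching `top_score_metric` from `"median"` to `"lowest"` can change which epitope receives rank 1.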
