
Output Files

The pVACbind pipeline will write its results in separate folders depending on which prediction algorithms were chosen:

  • MHC_Class_I: for MHC class I prediction algorithms
  • MHC_Class_II: for MHC class II prediction algorithms
  • combined: If both MHC class I and MHC class II prediction algorithms were run, this folder combines the neoepitope predictions from both
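
To check which of these folders a particular run produced, a small helper along the following lines may be useful. This is a minimal sketch: the function name `find_result_folders` is hypothetical and not part of pVACbind, and it assumes the folders sit directly under the output directory given to the pipeline.

```python
from pathlib import Path

# Folder names as listed above.
RESULT_FOLDERS = ("MHC_Class_I", "MHC_Class_II", "combined")


def find_result_folders(output_dir):
    """Return the result folders that exist under output_dir, keyed by name.

    Hypothetical helper; assumes the folders are direct children of output_dir.
    """
    base = Path(output_dir)
    return {name: base / name for name in RESULT_FOLDERS if (base / name).is_dir()}
```

For a run that used only MHC class I algorithms, the returned dictionary would be expected to contain just the `MHC_Class_I` entry.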

Each folder will contain the same list of output files (listed in the order created):

File Name | Description
<sample_name>.tsv | An intermediate file with variant information parsed from the input files.
<sample_name>.tsv_<chunks> (multiple) | The above file but split into smaller chunks for easier processing with IEDB.
<sample_name>.all_epitopes.tsv | A list of all predicted epitopes and their binding affinity scores, with additional variant information from the <sample_name>.tsv.
<sample_name>.filtered.tsv | The above file after applying all filters, with cleavage site and stability predictions added.
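
The report files can be inspected with any tool that reads tabular text. The sketch below uses only the Python standard library and assumes the files are tab-separated with a header row, as the `.tsv` extension suggests. The function names and the 500 cutoff are illustrative choices, not part of pVACbind, and the sample data is made up. `Best Score` is one of the report columns documented in the next section.

```python
import csv
import io


def read_epitope_report(handle):
    """Parse a tab-separated report into a list of dicts keyed by column name."""
    return list(csv.DictReader(handle, delimiter="\t"))


def strongest_binders(rows, max_ic50=500.0):
    """Keep rows whose Best Score is at or below max_ic50, lowest IC50 first.

    The default cutoff is an illustrative assumption, not a pVACbind setting.
    """
    kept = [row for row in rows if float(row["Best Score"]) <= max_ic50]
    return sorted(kept, key=lambda row: float(row["Best Score"]))


# Tiny made-up report for illustration only; real reports have more columns.
example = io.StringIO(
    "Mutation\tHLA Allele\tEpitope Seq\tBest Score\n"
    "pep1\tHLA-A*02:01\tSIINFEKL\t820.5\n"
    "pep2\tHLA-A*02:01\tGILGFVFTL\t12.3\n"
)
for row in strongest_binders(read_epitope_report(example)):
    print(row["Epitope Seq"], row["Best Score"])
```

For a file on disk, pass `open(path, newline="")` in place of the `io.StringIO` object.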

all_epitopes.tsv and filtered.tsv Report Columns

Column Name | Description
Mutation | The FASTA ID of the peptide sequence the epitope belongs to
HLA Allele | The HLA allele for this prediction
Sub-peptide Position | The one-based position of the epitope in the protein sequence used to make the prediction
Epitope Seq | The epitope sequence
Median Score | Median IC50 binding affinity of the epitope across all prediction algorithms used
Best Score | Lowest IC50 binding affinity across all prediction algorithms used
Best Score Method | Prediction algorithm with the lowest IC50 binding affinity for this epitope
Individual Prediction Algorithm Scores (multiple) | IC50 scores for the Epitope Seq from each individual prediction algorithm used
cterm_7mer_gravy_score | Mean hydropathy of the last 7 residues at the C-terminus of the peptide
max_7mer_gravy_score | Max GRAVY score of any 7-mer in the amino acid sequence. Used to determine whether there are any extremely hydrophobic regions within a longer amino acid sequence.
difficult_n_terminal_residue (T/F) | Is the N-terminal amino acid a Glutamine, Glutamic acid, or Cysteine?
c_terminal_cysteine (T/F) | Is the C-terminal amino acid a Cysteine?
c_terminal_proline (T/F) | Is the C-terminal amino acid a Proline?
cysteine_count | Number of Cysteines in the amino acid sequence. Problematic because they can form disulfide bonds across distant parts of the peptide.
n_terminal_asparagine (T/F) | Is the N-terminal amino acid an Asparagine?
asparagine_proline_bond_count | Number of Asparagine-Proline bonds. Problematic because they can spontaneously cleave the peptide.
Best Cleavage Position (optional) | Position of the highest predicted cleavage score
Best Cleavage Score (optional) | Highest predicted cleavage score
Cleavage Sites (optional) | List of all cleavage positions and their cleavage scores
Predicted Stability (optional) | Stability of the pMHC-I complex
Half Life (optional) | Half-life of the pMHC-I complex
Stability Rank (optional) | The % rank stability of the pMHC-I complex
NetMHCstab allele (optional) | Nearest neighbor to the HLA Allele. Used for NetMHCstab prediction
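
To illustrate how the Median Score, Best Score, and Best Score Method columns relate to the individual prediction algorithm scores, the following sketch derives the three summary values from a set of per-algorithm IC50 values. The function name `summarize_scores` is hypothetical and the example values are made up. This mirrors the column descriptions above rather than pVACbind's actual implementation.

```python
from statistics import median


def summarize_scores(scores):
    """Summarize per-algorithm IC50 values for one epitope/allele pair.

    scores maps an algorithm name to its predicted IC50. A lower IC50 means
    stronger predicted binding, so the "best" score is the minimum.
    """
    if not scores:
        raise ValueError("at least one prediction score is required")
    best_method = min(scores, key=scores.get)
    return {
        "Median Score": median(scores.values()),
        "Best Score": scores[best_method],
        "Best Score Method": best_method,
    }


# Made-up IC50 values for illustration only.
example = {"NetMHC": 45.0, "NetMHCpan": 30.0, "SMM": 120.0}
print(summarize_scores(example))
# {'Median Score': 45.0, 'Best Score': 30.0, 'Best Score Method': 'NetMHCpan'}
```

If two algorithms tie for the lowest IC50, this sketch reports whichever appears first in the dictionary. How pVACbind resolves such ties is not specified here.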
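
The peptide property columns (from cterm_7mer_gravy_score through asparagine_proline_bond_count) can be approximated from the epitope sequence alone. The sketch below assumes GRAVY is the mean Kyte-Doolittle hydropathy of the residues, which is the standard definition. Whether pVACbind uses exactly this scale and windowing is an assumption, and the function names are hypothetical. The report shows T/F flags, whereas this sketch returns Python booleans.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(seq):
    """Mean hydropathy of a sequence of standard amino acids."""
    return sum(KYTE_DOOLITTLE[aa] for aa in seq) / len(seq)


def peptide_properties(seq, k=7):
    """Approximate the peptide property columns for one sequence (hypothetical helper)."""
    if not seq:
        raise ValueError("sequence must not be empty")
    seq = seq.upper()
    # All windows of length k; a sequence shorter than k yields one window.
    windows = [seq[i:i + k] for i in range(max(len(seq) - k, 0) + 1)]
    return {
        "cterm_7mer_gravy_score": gravy(seq[-k:]),
        "max_7mer_gravy_score": max(gravy(w) for w in windows),
        "difficult_n_terminal_residue": seq[0] in "QEC",
        "c_terminal_cysteine": seq[-1] == "C",
        "c_terminal_proline": seq[-1] == "P",
        "cysteine_count": seq.count("C"),
        "n_terminal_asparagine": seq[0] == "N",
        "asparagine_proline_bond_count": seq.count("NP"),
    }


# Made-up peptide for illustration only.
for name, value in peptide_properties("QLLLLLLLC").items():
    print(name, value)
```

Sequences containing non-standard residue codes (for example X) raise a `KeyError` in this sketch, so callers would need to decide how to handle them.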