Output Files
The pVACbind pipeline will write its results in separate folders depending on which prediction algorithms were chosen:
MHC_Class_I
: for MHC class I prediction algorithms
MHC_Class_II
: for MHC class II prediction algorithms
combined
: If both MHC class I and MHC class II prediction algorithms were run, this folder combines the neoepitope predictions from both
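As a minimal sketch of how a downstream script might discover which of these folders a run produced, the helper below checks an output directory for the three folder names listed above. The function name `find_result_folders` and the idea of passing the output directory as an argument are illustrative assumptions, not part of pVACbind itself.

```python
from pathlib import Path

# Folder names as listed in this section.
RESULT_FOLDERS = ("MHC_Class_I", "MHC_Class_II", "combined")


def find_result_folders(output_dir):
    """Return the result folder names that exist directly under output_dir.

    Hypothetical helper: it only inspects the directory layout and makes
    no assumption about the files inside each folder.
    """
    base = Path(output_dir)
    return [name for name in RESULT_FOLDERS if (base / name).is_dir()]
```

A run that used only MHC class I algorithms would be expected to yield just the `MHC_Class_I` entry.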
Each folder will contain the same list of output files (listed in the order created):
| File Name | Description |
|---|---|
| | A list of all predicted epitopes and their binding affinity scores, with additional variant information from the |
| | The above file after applying all filters, with cleavage site and stability predictions added. |
| | A file outlining details of reference proteome matches |
| | An aggregated version of the |
Filters applied to the filtered.tsv file
The filtered.tsv file is the all_epitopes file with the following filters applied (in order):
Binding Filter
Top Score Filter
Please see the Standalone Filter Commands documentation for more information on each individual filter. The standalone filter commands may be useful to reproduce the filtering or to choose different filtering thresholds.
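The effect of applying the two filters in order can be illustrated with a small sketch. It does not reproduce the pipeline's implementation: the dictionary keys (`id`, `epitope`, `ic50`), the 500 nM cutoff, and the rule of keeping one best-scoring epitope per sequence ID are all assumptions made for illustration, and the rows are made-up toy data.

```python
def binding_filter(rows, threshold=500.0):
    """Keep rows whose IC50 is at or below the threshold.

    A lower IC50 means stronger predicted binding. The default of
    500 nM is an illustrative assumption.
    """
    return [r for r in rows if r["ic50"] <= threshold]


def top_score_filter(rows):
    """Keep the lowest-IC50 row for each sequence ID (assumed grouping)."""
    best = {}
    for r in rows:
        current = best.get(r["id"])
        if current is None or r["ic50"] < current["ic50"]:
            best[r["id"]] = r
    return list(best.values())


# Toy rows with hypothetical keys; not real prediction output.
rows = [
    {"id": "seq1", "epitope": "A", "ic50": 120.0},
    {"id": "seq1", "epitope": "B", "ic50": 80.0},
    {"id": "seq2", "epitope": "C", "ic50": 900.0},
    {"id": "seq2", "epitope": "D", "ic50": 450.0},
    {"id": "seq3", "epitope": "E", "ic50": 2000.0},
]

# Filters are applied in the order given above: binding first, then top score.
filtered = top_score_filter(binding_filter(rows))
```

Because the binding filter runs first, a sequence with no epitope under the threshold (here `seq3`) drops out entirely before the top score filter is applied.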
all_epitopes.tsv and filtered.tsv Report Columns
| Column Name | Description |
|---|---|
| | The FASTA ID of the peptide sequence the epitope belongs to |
| | The HLA allele for this prediction |
| | The one-based position of the epitope in the protein sequence used to make the prediction |
| | The epitope sequence |
| | Median IC50 binding affinity of the epitope across all prediction algorithms used |
| | Lowest IC50 binding affinity across all prediction algorithms used |
| | Prediction algorithm with the lowest IC50 binding affinity for this epitope |
| | Median binding affinity percentile rank of the epitope across all prediction algorithms used (those that provide percentile output) |
| | Lowest binding affinity percentile rank across all prediction algorithms used (those that provide percentile output) |
| | Prediction algorithm with the lowest binding affinity percentile rank for this epitope |
| | IC50 binding affinity scores and percentiles for the |
| | Mean hydropathy of the last 7 residues on the C-terminus of the peptide |
| | Max GRAVY score of any kmer in the amino acid sequence. Used to determine if there are any extremely hydrophobic regions within a longer amino acid sequence. |
| | Is the N-terminal amino acid a Glutamine, Glutamic acid, or Cysteine? |
| | Is the C-terminal amino acid a Cysteine? |
| | Is the C-terminal amino acid a Proline? |
| | Number of Cysteines in the amino acid sequence. Problematic because they can form disulfide bonds across distant parts of the peptide |
| | Is the N-terminal amino acid an Asparagine? |
| | Number of Asparagine-Proline bonds. Problematic because they can spontaneously cleave the peptide |
| | Position of the highest predicted cleavage score |
| | Highest predicted cleavage score |
| | List of all cleavage positions and their cleavage score |
| | Stability of the pMHC-I complex |
| | Half-life of the pMHC-I complex |
| | The % rank stability of the pMHC-I complex |
| | Nearest neighbor to the |
| | Was there a BLAST match of the mutated peptide sequence to the reference proteome? |
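Several of the sequence-derived columns described above can be computed directly from the epitope string. The sketch below is an illustration under stated assumptions rather than the pipeline's own code: the function name and dictionary keys are hypothetical, and hydropathy is computed with the Kyte-Doolittle scale, which is assumed here and not confirmed by this section.

```python
# Kyte-Doolittle hydropathy values (assumed scale for this sketch).
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def peptide_flags(seq):
    """Compute sequence-level properties for a non-empty peptide.

    All keys are hypothetical names chosen for this example; they are
    not the report's column names.
    """
    seq = seq.upper()
    tail = seq[-7:]  # last 7 residues, or the whole peptide if shorter
    return {
        "cterm_7mer_mean_hydropathy": sum(KD[a] for a in tail) / len(tail),
        "nterm_is_q_e_or_c": seq[0] in ("Q", "E", "C"),
        "cterm_is_cysteine": seq[-1] == "C",
        "cterm_is_proline": seq[-1] == "P",
        "cysteine_count": seq.count("C"),
        "nterm_is_asparagine": seq[0] == "N",
        # An Asn-Pro bond is an N immediately followed by a P.
        "asn_pro_bond_count": seq.count("NP"),
    }
```

Calling `peptide_flags` on each epitope gives a quick way to cross-check the corresponding report values on a handful of rows.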
filtered.tsv.reference_matches Report Columns
This file is only generated when the --run-reference-proteome-similarity
option is chosen.
| Column Name | Description |
|---|---|
| | The chromosome of this variant |
| | The start position of this variant in the zero-based, half-open coordinate system |
| | The stop position of this variant in the zero-based, half-open coordinate system |
| | The reference allele |
| | The alt allele |
| | The Ensembl ID of the affected transcript |
| | The peptide sequence submitted to BLAST |
| | The BLAST alignment hit ID (reference proteome sequence ID) |
| | The BLAST alignment hit definition (reference proteome sequence name) |
| | The BLAST query sequence |
| | The BLAST match sequence |
| | The match start position in the matched reference proteome sequence |
| | The match stop position in the matched reference proteome sequence |
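The variant start and stop columns above use the zero-based, half-open convention, which differs from the one-based, inclusive positions shown by many genome browsers and VCF files. The following sketch shows the general conversion; the function names are hypothetical and chosen only for this example.

```python
def to_zero_based_half_open(start_1based, stop_1based):
    """Convert a one-based, inclusive interval to zero-based, half-open.

    The start shifts down by one; the stop is numerically unchanged
    because it becomes exclusive.
    """
    return start_1based - 1, stop_1based


def interval_length(start, stop):
    """Length of a zero-based, half-open interval [start, stop)."""
    return stop - start


# A single-base variant at one-based position 100:
start, stop = to_zero_based_half_open(100, 100)
```

In this convention a single-base substitution spans exactly one position, so its stop is one greater than its start.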