Optional Downstream Analysis Tools
Generate Protein Fasta
usage: pvacfuse generate_protein_fasta [-h] [--input-tsv INPUT_TSV]
                                       [-d DOWNSTREAM_SEQUENCE_LENGTH]
                                       input_file flanking_sequence_length
                                       output_file

Generate an annotated fasta file from Integrate-Neo or AGFusion output.

positional arguments:
  input_file            An INTEGRATE-Neo annotated bedpe file with fusions or
                        a AGfusion output directory.
  flanking_sequence_length
                        Number of amino acids to add on each side of the
                        mutation when creating the FASTA.
  output_file           The output fasta file.

optional arguments:
  -h, --help            show this help message and exit
  --input-tsv INPUT_TSV
                        A pVACfuse all_epitopes or filtered TSV file with
                        epitopes to use for subsetting the input file to
                        peptides of interest. Only the peptide sequences for
                        the epitopes in the TSV will be used when creating the
                        FASTA. (default: None)
  -d DOWNSTREAM_SEQUENCE_LENGTH, --downstream-sequence-length DOWNSTREAM_SEQUENCE_LENGTH
                        Cap to limit the downstream sequence length for
                        frameshift fusion when creating the fasta file. Use
                        'full' to include the full downstream sequence.
                        (default: 1000)
This tool extracts the protein sequence surrounding each fusion by parsing Integrate-Neo or AGFusion output. One use case is selecting long peptides that contain short neoepitope candidates. For example, if pVACfuse was run to predict nonamers (9-mers) that are good binders, the user may wish to select long peptide (e.g. 24-mer) sequences that contain the nonamer for synthesis or encoding in a DNA vector. The fusion position will be centered in the protein sequence returned (if possible). If the fusion causes a frameshift, the downstream protein sequence is returned up to the cap set with -d/--downstream-sequence-length (default: 1000, per the help text above); pass 'full' to return the entire downstream sequence.
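The centering and frameshift behaviour described above can be pictured with a small sketch. This is not pVACfuse's implementation. The function name `extract_window`, its parameters, and the convention that `junction` is the 0-based index of the first residue after the fusion point are all assumptions made for illustration. It takes `flank` residues on each side of the junction where the sequence allows it, and for a frameshift it extends downstream up to a cap or to the end of the sequence.

```python
from typing import Union


def extract_window(protein: str, junction: int, flank: int,
                   frameshift: bool = False,
                   downstream_cap: Union[int, str] = 1000) -> str:
    """Return the residues around a fusion junction (illustrative only).

    `junction` is assumed to be the 0-based index of the first residue
    after the fusion point. Hypothetical helper, not part of pVACtools.
    """
    if not 0 <= junction <= len(protein):
        raise ValueError("junction lies outside the protein sequence")

    # Upstream side: up to `flank` residues, clipped at the N-terminus.
    start = max(0, junction - flank)

    if frameshift:
        # Frameshift: keep the downstream sequence up to the cap,
        # or all of it when the cap is the string 'full'.
        if downstream_cap == "full":
            end = len(protein)
        else:
            end = min(len(protein), junction + int(downstream_cap))
    else:
        # In-frame: up to `flank` residues, clipped at the C-terminus.
        end = min(len(protein), junction + flank)

    return protein[start:end]


# Toy fusion protein: 20 residues from one partner, 20 from the other.
toy = "A" * 20 + "C" * 20
centered = extract_window(toy, junction=20, flank=12)
```

With a flank of 12 and enough sequence on both sides, the window in this sketch is 24 residues long with the junction in the middle, which matches the 24-mer example above. When the junction sits close to either end of the protein, the window is simply shorter on that side, which is one way to read "centered (if possible)".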