Optional Downstream Analysis Tools
Generate Protein Fasta
usage: pvacfuse generate_protein_fasta [-h] [--input-tsv INPUT_TSV]
                                       [-d DOWNSTREAM_SEQUENCE_LENGTH]
                                       input_file peptide_sequence_length
                                       output_file

positional arguments:
  input_file            An INTEGRATE-Neo annotated bedpe file with fusions or
                        a AGfusion output directory.
  peptide_sequence_length
                        Length of the peptide sequence to use when creating
                        the FASTA.
  output_file           The output fasta file.

optional arguments:
  -h, --help            show this help message and exit
  --input-tsv INPUT_TSV
                        A pVACfuse all_epitopes or filtered TSV file with
                        epitopes to use for subsetting the input file to
                        peptides of interest. Only the peptide sequences for
                        the epitopes in the TSV will be used when creating
                        the FASTA. (default: None)
  -d DOWNSTREAM_SEQUENCE_LENGTH, --downstream-sequence-length DOWNSTREAM_SEQUENCE_LENGTH
                        Cap to limit the downstream sequence length for
                        frameshift fusion when creating the fasta file. Use
                        'full' to include the full downstream sequence.
                        (default: 1000)
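As a rough illustration of how the arguments above fit together, the following Python sketch assembles a command line for the tool. The helper name build_command and the file names in the usage note are hypothetical; only the subcommand, positional arguments, and flags shown in the usage text are taken from the tool itself.

```python
def build_command(input_file, peptide_sequence_length, output_file,
                  input_tsv=None, downstream_sequence_length=None):
    """Assemble an argument list for `pvacfuse generate_protein_fasta`.

    Hypothetical helper for illustration; it only arranges the arguments
    listed in the usage text and does not run anything.
    """
    # Positional arguments, in the order given by the usage line.
    cmd = [
        "pvacfuse", "generate_protein_fasta",
        input_file, str(peptide_sequence_length), output_file,
    ]
    # Optional: subset the FASTA to the peptides for epitopes in a TSV.
    if input_tsv is not None:
        cmd += ["--input-tsv", input_tsv]
    # Optional: an integer cap, or the string 'full'.
    if downstream_sequence_length is not None:
        cmd += ["-d", str(downstream_sequence_length)]
    return cmd
```

For instance, build_command("agfusion_output", 24, "peptides.fa", downstream_sequence_length="full") returns a list that could be passed to subprocess.run(cmd, check=True) on a system where pvacfuse is installed. The paths here are placeholders.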
This tool extracts the protein sequences surrounding fusion variants by parsing INTEGRATE-Neo or AGFusion output. One use case is selecting long peptides that contain short neoepitope candidates. For example, if pVACfuse was run to predict nonamers (9-mers) that are good binders, the user may wish to select long peptide (e.g. 24-mer) sequences that contain the nonamer for synthesis or encoding in a DNA vector. The fusion position will be centered in the protein sequence returned (if possible). If the fusion causes a frameshift, the downstream protein sequence is also returned, capped at the length set by --downstream-sequence-length (default: 1000); specify 'full' to include the full downstream sequence.
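The centering and frameshift behavior described above can be pictured with a small Python sketch. This is not pVACfuse's implementation: the function names, the 0-based fusion index, and the way the window is shifted near the sequence ends are all assumptions made for illustration.

```python
def centered_window(protein, fusion_pos, length):
    """Return up to `length` residues of `protein` around `fusion_pos`.

    `fusion_pos` is a 0-based index (an assumption of this sketch). When
    the fusion lies too close to either end, the window is shifted so it
    stays inside the sequence, which is the "if possible" case.
    """
    if length >= len(protein):
        return protein
    start = fusion_pos - length // 2
    # Clamp the start so the window never runs off either end.
    start = max(0, min(start, len(protein) - length))
    return protein[start:start + length]


def cap_downstream(upstream, downstream, cap="full"):
    """Join upstream and downstream sequence for a frameshift fusion.

    `cap` mimics the -d option: 'full' keeps everything, otherwise the
    downstream part is truncated to `cap` residues.
    """
    if cap == "full":
        return upstream + downstream
    return upstream + downstream[:int(cap)]
```

With a 26-residue toy sequence "ABCDEFGHIJKLMNOPQRSTUVWXYZ" and a fusion at index 13, a window of length 9 is "JKLMNOPQR", with the fusion residue in the middle. A fusion at index 1 cannot be centered, so the window falls back to the first nine residues.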