Functional Segregation of Overlapping Genes in HIV.pdf
Using alanine scanning and deep mutational scanning, the authors found a segregated organization in overlapping genes: functionally important residues in one gene tended to overlap with nonfunctional or highly mutable regions in the other gene.
Overlapping genes in natural and engineered genomes
I need further steps on the pipeline!
Synthetic sequence entanglement-min.pdf
Pseudocode for algorithm with one overlapping region
CODON_TABLE = {
'TCA': 'S', 'TCC': 'S', 'TCG': 'S', 'TCT': 'S', 'TTC': 'F', 'TTT': 'F', 'TTA': 'L', 'TTG': 'L',
'TAC': 'Y', 'TAT': 'Y', 'TAA': '*', 'TAG': '*', 'TGC': 'C', 'TGT': 'C', 'TGA': '*', 'TGG': 'W',
'CTA': 'L', 'CTC': 'L', 'CTG': 'L', 'CTT': 'L', 'CCA': 'P', 'CCC': 'P', 'CCG': 'P', 'CCT': 'P',
'CAC': 'H', 'CAT': 'H', 'CAA': 'Q', 'CAG': 'Q', 'CGA': 'R', 'CGC': 'R', 'CGG': 'R', 'CGT': 'R',
'ATA': 'I', 'ATC': 'I', 'ATT': 'I', 'ATG': 'M', 'ACA': 'T', 'ACC': 'T', 'ACG': 'T', 'ACT': 'T',
'AAC': 'N', 'AAT': 'N', 'AAA': 'K', 'AAG': 'K', 'AGC': 'S', 'AGT': 'S', 'AGA': 'R', 'AGG': 'R',
'GTA': 'V', 'GTC': 'V', 'GTG': 'V', 'GTT': 'V', 'GCA': 'A', 'GCC': 'A', 'GCG': 'A', 'GCT': 'A',
'GAC': 'D', 'GAT': 'D', 'GAA': 'E', 'GAG': 'E', 'GGA': 'G', 'GGC': 'G', 'GGG': 'G', 'GGT': 'G',
}
REVERSE_CODON_TABLE = {
'A': ['GCA', 'GCC', 'GCG', 'GCT'],
'C': ['TGC', 'TGT'],
'D': ['GAC', 'GAT'],
'E': ['GAA', 'GAG'],
'F': ['TTC', 'TTT'],
'G': ['GGA', 'GGC', 'GGG', 'GGT'],
'H': ['CAC', 'CAT'],
'I': ['ATA', 'ATC', 'ATT'],
'K': ['AAA', 'AAG'],
'L': ['TTA', 'TTG', 'CTA', 'CTC', 'CTG', 'CTT'],
'M': ['ATG'],
'N': ['AAC', 'AAT'],
'P': ['CCA', 'CCC', 'CCG', 'CCT'],
'Q': ['CAA', 'CAG'],
'R': ['CGA', 'CGC', 'CGG', 'CGT', 'AGA', 'AGG'],
'S': ['TCA', 'TCC', 'TCG', 'TCT', 'AGC', 'AGT'],
'T': ['ACA', 'ACC', 'ACG', 'ACT'],
'V': ['GTA', 'GTC', 'GTG', 'GTT'],
'W': ['TGG'],
'Y': ['TAC', 'TAT'],
'*': ['TAA', 'TAG', 'TGA'],
}
FUNCTION get_degenerate_codons(codon):
    # Look up the amino acid, then return every codon that encodes it
    amino_acid = CODON_TABLE[codon]
    degenerate_codons = REVERSE_CODON_TABLE[amino_acid]
    RETURN degenerate_codons
FUNCTION generate_variant_graph(dna_sequence, overlap_start, overlap_end):
    # Step 1: Initialize the graph structure
    variant_graph = []
    # Step 2: Build the graph
    FOR i FROM 0 TO LENGTH(dna_sequence) - 1:
        IF i IS BETWEEN overlap_start AND overlap_end - 1:
            # Inside overlap region: handle codons for Gene 2
            IF (i - overlap_start) MOD 3 == 0:
                # Start of a new codon in Gene 2
                codon = dna_sequence[i TO i + 2]  # Extract the codon
                degenerate_codons = get_degenerate_codons(codon)
                variant_graph.APPEND(degenerate_codons)  # Add to graph
            # Other positions in the codon are already covered by the node above
        ELSE:
            # Outside overlap region: handle individual nucleotides
            variant_graph.APPEND([dna_sequence[i]])  # Add single nucleotide to graph
    RETURN variant_graph
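A minimal runnable Python sketch of the pseudocode above, to check that the graph comes out as expected. Assumptions that are mine and not from the paper: overlap_end is treated as exclusive, the overlap length must be a multiple of 3, the codon table is rebuilt compactly from the standard genetic code instead of being typed out, and enumerate_variants is a hypothetical helper added only to list the sequences the graph encodes. It only keeps the Gene 2 protein unchanged; checking what each variant does to Gene 1 would be a separate step.

```python
from itertools import product

# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Invert the table: amino acid -> list of codons that encode it
REVERSE_CODON_TABLE = {}
for codon, aa in CODON_TABLE.items():
    REVERSE_CODON_TABLE.setdefault(aa, []).append(codon)


def get_degenerate_codons(codon):
    """Return every codon that encodes the same amino acid as `codon`."""
    return REVERSE_CODON_TABLE[CODON_TABLE[codon]]


def generate_variant_graph(dna_sequence, overlap_start, overlap_end):
    """Build a list of nodes; each node is a list of alternatives.

    Inside [overlap_start, overlap_end) each node holds the synonymous
    codons for one Gene 2 codon; outside it each node is one nucleotide.
    """
    if not 0 <= overlap_start <= overlap_end <= len(dna_sequence):
        raise ValueError("overlap must lie inside the sequence")
    if (overlap_end - overlap_start) % 3 != 0:
        raise ValueError("overlap length must be a multiple of 3")

    variant_graph = []
    i = 0
    while i < len(dna_sequence):
        if overlap_start <= i < overlap_end:
            codon = dna_sequence[i:i + 3]
            variant_graph.append(get_degenerate_codons(codon))
            i += 3  # skip the rest of this codon
        else:
            variant_graph.append([dna_sequence[i]])
            i += 1
    return variant_graph


def enumerate_variants(variant_graph):
    """Hypothetical helper: list every sequence encoded by the graph."""
    return ["".join(path) for path in product(*variant_graph)]


if __name__ == "__main__":
    graph = generate_variant_graph("GGATGAAATT", 2, 8)
    for variant in enumerate_variants(graph):
        print(variant)
```

In the toy example the overlap covers the codons ATG and AAA, so the only freedom is the lysine codon. Enumerating every path grows multiplicatively with the number of codons, so for a real overlap the graph would need to be filtered or sampled instead of listed in full.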
Lysis Protein: METRFPQQSQQTPASTNRRRPFKHEDYPCRRQQRSSTLYVLIFLAIFLSKFTNQLLLSLLEAVIRTVTTLQQLLT
First 49 overlap
Last 144 overlap
Total Length: 227
(0,48) (82, 226)
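A quick arithmetic check of the coordinates above, since the span lengths depend on whether the end index is counted. The span_length helper and both conventions (0-based inclusive vs. half-open) are my assumptions; the notes do not say which one the pairs use.

```python
def span_length(start, end, inclusive=True):
    """Length of a coordinate span under either end convention."""
    return end - start + 1 if inclusive else end - start


spans = [(0, 48), (82, 226)]
for start, end in spans:
    print(
        (start, end),
        "inclusive:", span_length(start, end),
        "half-open:", span_length(start, end, inclusive=False),
    )
```

Read as inclusive, the spans are 49 and 145 long; read as half-open, 48 and 144. Neither convention gives both 49 and 144, so the coordinates are worth rechecking against the sequence before using them as overlap_start and overlap_end.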
If I truncate the lys gene, I may affect the expression of downstream genes.
She has a link on her presentation.
Coat protein lys protein interaction.
Come up with 5 or 3 mutants that I think are promising.
I have done this overlap; of those, I picked these couple of mutants because of (reasons to fill in).
Another constraint: codon distribution
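One possible way to make the codon distribution constraint checkable is to tabulate codon frequencies for a candidate sequence and compare them with the wild type. This is only a sketch: the codon_usage function, its frame argument, and the idea of comparing relative frequencies are assumptions of mine, not something taken from the papers.

```python
from collections import Counter


def codon_usage(sequence, frame=0):
    """Relative frequency of each complete codon in the given reading frame."""
    codons = [
        sequence[i:i + 3]
        for i in range(frame, len(sequence) - 2, 3)
    ]
    counts = Counter(codons)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {codon: n / total for codon, n in counts.items()}


if __name__ == "__main__":
    usage = codon_usage("ATGAAAAAGAAA")
    for codon, freq in sorted(usage.items()):
        print(codon, freq)
```

Running the same function on the wild-type sequence and on each mutant would show which synonymous substitutions shift the distribution the most.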
Which parts of the RNA secondary structure are important? Which are not?
Which attributes of lys protein must be conserved? Which not so much?
How does the DNA pack into cp? Would a change in sequence affect a change in DNA packing?
I need to go through the entire virus life cycle to determine which stages may be affected by a change in DNA sequence.