Project Information

Title: The development of an automated computational workflow to prioritize potential resistance variants identified in HIV Integrase Subtype C
Author: Keaghan Brown (3687524) - MSc Bioinformatics Candidate
Supervisor: Ruben Cloete - Lecturer, SANBI
Institution: South African National Bioinformatics Institute, University of the Western Cape
Funding: Poliomyelitis Research Foundation and UWC Ada & Bertie Levenstein Bursary Programme

Overview

The mutintro.py script is a critical component of the automated computational workflow for introducing mutations into protein structures. It leverages PyMOL’s mutagenesis wizard to introduce mutations individually or simultaneously and optionally applies energy minimization using FoldX.

This script is designed to streamline the mutation introduction process, ensuring consistency and accuracy in preparing mutant structures for downstream analysis.

Usage

python mutintro.py --pdb_file path/to/file.pdb --output_dir path/to/output --mutations path/to/mutations.csv --mode [single|multiple]

Arguments

--pdb_file: Path to the original PDB file that will undergo mutations.
--output_dir: Directory where modified PDB files will be saved.
--mutations: CSV file with mutation data. Each column represents a system, and each cell in that column holds one mutation, e.g., Q148R.
--mode: Mutation mode. Options:
- single: Each mutation is introduced individually.
- multiple: All mutations for a system are introduced together.
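
The four flags above could be declared with argparse roughly as follows. This is a minimal sketch, not the script's actual parser: the function name build_parser, the help strings, and the choice to make every flag required are assumptions.

```python
import argparse


def build_parser():
    """Build a parser mirroring the four documented flags (hypothetical helper)."""
    parser = argparse.ArgumentParser(
        description="Introduce mutations into a PDB structure."
    )
    parser.add_argument("--pdb_file", required=True,
                        help="Path to the original PDB file.")
    parser.add_argument("--output_dir", required=True,
                        help="Directory for the mutated PDB files.")
    parser.add_argument("--mutations", required=True,
                        help="CSV file of mutations, one system per column.")
    # Assumption: no default mode is documented, so the flag is required here.
    parser.add_argument("--mode", required=True, choices=["single", "multiple"],
                        help="Introduce mutations one at a time or all together.")
    return parser


# Parse an explicit argument list so the sketch runs without a command line.
args = build_parser().parse_args([
    "--pdb_file", "HIV_Integrase.pdb",
    "--output_dir", "./output",
    "--mutations", "variant_list.csv",
    "--mode", "multiple",
])
print(args.mode)
```

Restricting --mode with choices lets argparse reject any value other than single or multiple before any structure is loaded.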

Mutation Format

Mutations are written in the format OriginalResiduePositionNewResidue, e.g., Q148R, which means:

Original residue: Q (Glutamine)
Position: 148
New residue: R (Arginine)
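
A label in this format can be split into its three parts with a regular expression. The sketch below is illustrative only: parse_mutation is a hypothetical helper, and restricting labels to the 20 standard one-letter amino acid codes is an assumption.

```python
import re

# One-letter codes of the 20 standard amino acids.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
MUTATION_RE = re.compile(rf"^([{AMINO_ACIDS}])(\d+)([{AMINO_ACIDS}])$")


def parse_mutation(label):
    """Split a label such as 'Q148R' into (original, position, new)."""
    match = MUTATION_RE.match(label.strip().upper())
    if match is None:
        raise ValueError(f"Not a valid mutation label: {label!r}")
    original, position, new = match.groups()
    return original, int(position), new


print(parse_mutation("Q148R"))  # ('Q', 148, 'R')
```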

Main Functionalities

Class: `MutationIntro`

`mutant_processing(mutant_list)`

Reads a CSV mutation table and returns dictionaries for both individual and grouped mutations.
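
As a rough illustration of this step, the sketch below reads a column-per-system table into two dictionaries. It is not the method's actual code: it uses the standard-library csv module rather than pandas to stay dependency-free, and the function name, the shape of the returned dictionaries, and the sample table are all assumptions.

```python
import csv
import io


def read_mutation_table(handle):
    """Return (individual, grouped) from a CSV whose columns are systems."""
    reader = csv.DictReader(handle)
    grouped = {name: [] for name in (reader.fieldnames or [])}
    for row in reader:
        for system, cell in row.items():
            if system is None:  # extra cells beyond the header row
                continue
            cell = (cell or "").strip()
            if cell:  # columns may differ in length, so skip blank cells
                grouped[system].append(cell)
    # Map each distinct mutation to the systems in which it occurs.
    individual = {}
    for system, mutations in grouped.items():
        for mutation in mutations:
            individual.setdefault(mutation, []).append(system)
    return individual, grouped


# Illustrative table only; the system names and mutation lists are made up.
table = io.StringIO("system_1,system_2\nQ148R,G140S\nN155H,Q148R\nE92Q,\n")
individual, grouped = read_mutation_table(table)
print(grouped)
print(individual)
```

Here grouped drives the multiple mode (one structure per system) and individual drives the single mode (one structure per distinct mutation).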

`individual_introduction(pdb_file, output_dir, mutant_data)`

Introduces each mutation into a separate PDB file using PyMOL’s mutagenesis wizard and saves them to the output directory.
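
A single mutation might be introduced with the wizard along the following lines. This is a sketch under stated assumptions, not the method's implementation: the function names, the object name "structure", and the chain argument are hypothetical, and the wizard's default rotamer is applied without any clash-based selection.

```python
# One-letter to three-letter residue names, as expected by the wizard.
THREE_LETTER = {
    "A": "ALA", "R": "ARG", "N": "ASN", "D": "ASP", "C": "CYS",
    "Q": "GLN", "E": "GLU", "G": "GLY", "H": "HIS", "I": "ILE",
    "L": "LEU", "K": "LYS", "M": "MET", "F": "PHE", "P": "PRO",
    "S": "SER", "T": "THR", "W": "TRP", "Y": "TYR", "V": "VAL",
}


def wizard_inputs(chain, position, new_residue):
    """Return the (selection, residue name) pair handed to the wizard."""
    return f"{chain}/{position}/", THREE_LETTER[new_residue]


def introduce_mutation(pdb_file, out_file, chain, position, new_residue):
    """Mutate one residue and save the result (requires PyMOL)."""
    from pymol import cmd  # lazy import so the helper above works without PyMOL

    selection, resname = wizard_inputs(chain, position, new_residue)
    cmd.load(pdb_file, "structure")
    cmd.wizard("mutagenesis")
    cmd.refresh_wizard()
    wizard = cmd.get_wizard()
    wizard.do_select(selection)  # pick the residue to mutate
    wizard.set_mode(resname)     # choose the replacement residue
    wizard.apply()               # commit the currently shown rotamer
    cmd.set_wizard()             # close the wizard
    cmd.save(out_file, "structure")
    cmd.delete("all")            # start clean for the next mutation


print(wizard_inputs("A", 148, "R"))  # ('A/148/', 'ARG')
```

Deleting all objects after each save keeps every output file derived from the unmodified input structure, which is what the single mode requires.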

`simultaneous_introduction(pdb_file, output_dir, mutant_data)`

Introduces all mutations for a given system into a single structure and saves the final mutated PDB file.

`foldx_emin(foldx_exe, output_dir)`

(Optional) Runs FoldX’s "Optimize" command on each mutated structure to minimize energy and resolve potential steric clashes.

FoldX Integration

The script searches for a folder named foldx in the current working directory and automatically locates the FoldX binary within it.
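
The lookup described above could be sketched as follows. The function name find_foldx and the rule of returning the first executable file in the folder are assumptions; the script's own search may differ.

```python
import os
from pathlib import Path


def find_foldx(base_dir=None):
    """Return the first executable file inside <base_dir>/foldx, or None."""
    folder = Path(base_dir) if base_dir is not None else Path.cwd()
    folder = folder / "foldx"
    if not folder.is_dir():
        return None
    for candidate in sorted(folder.iterdir()):
        # Skip data files shipped alongside the binary.
        if candidate.is_file() and os.access(candidate, os.X_OK):
            return candidate
    return None


print(find_foldx())  # None unless ./foldx holds an executable file
```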

Example command executed:

./foldx --command=Optimize --pdb=Q148R_auto.pdb --output-file=Q148R_auto.pdb
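
The same command can be assembled and launched from Python. The sketch below is only an illustration: both function names and the work_dir parameter are hypothetical, and only the flags shown in the example command above are used.

```python
import subprocess


def foldx_optimize_command(foldx_exe, pdb_name):
    """Build the argument list for the Optimize run shown above."""
    return [
        str(foldx_exe),
        "--command=Optimize",
        f"--pdb={pdb_name}",
        f"--output-file={pdb_name}",
    ]


def run_foldx_optimize(foldx_exe, pdb_name, work_dir):
    """Run FoldX in work_dir; raises CalledProcessError on a non-zero exit."""
    subprocess.run(foldx_optimize_command(foldx_exe, pdb_name),
                   cwd=work_dir, check=True)


print(" ".join(foldx_optimize_command("./foldx", "Q148R_auto.pdb")))
```

Passing the arguments as a list avoids shell quoting problems with file names.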

Dependencies

pymol
biopython (PDBParser, PPBuilder)
pandas
argparse, os, sys, warnings (Standard Library)

Example

python mutintro.py --pdb_file HIV_Integrase.pdb --output_dir ./output --mutations variant_list.csv --mode multiple

Prerequisites

Ensure Python 3.x is installed and added to your system PATH.
Install required libraries using pip install pymol biopython pandas.
Ensure PyMOL is installed and can be invoked in script mode.
Download the FoldX executable and place it in a folder named foldx in the directory from which you run the script.

Notes

The mutation wizard automatically selects the rotamer with the fewest steric clashes.
Make sure the chain identifiers and residue numbering in your PDB file match the mutation data.
All output files are saved in the specified output directory.
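
One way to check that a structure matches the mutation data before running the script is to compare the residue found at each position with the original residue in the label. The sketch below reads the fixed-width ATOM records directly; the helper names and the chain argument are hypothetical and not part of the script.

```python
# Three-letter to one-letter codes for the 20 standard amino acids.
THREE_TO_ONE = {
    "ALA": "A", "ARG": "R", "ASN": "N", "ASP": "D", "CYS": "C",
    "GLN": "Q", "GLU": "E", "GLY": "G", "HIS": "H", "ILE": "I",
    "LEU": "L", "LYS": "K", "MET": "M", "PHE": "F", "PRO": "P",
    "SER": "S", "THR": "T", "TRP": "W", "TYR": "Y", "VAL": "V",
}


def residue_at(pdb_lines, chain, position):
    """Return the one-letter residue at chain/position, or None if absent."""
    for line in pdb_lines:
        if not line.startswith("ATOM") or len(line) < 26:
            continue
        # PDB columns: residue name 18-20, chain 22, residue number 23-26.
        if line[21] == chain and int(line[22:26]) == position:
            return THREE_TO_ONE.get(line[17:20])
    return None


def matches_structure(pdb_lines, chain, original, position):
    """True when the structure carries `original` at the given position."""
    return residue_at(pdb_lines, chain, position) == original
```

For a label such as Q148R, matches_structure(lines, "A", "Q", 148) should return True when residue 148 of chain A is a glutamine; a False result suggests a numbering or chain mismatch.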

Error Handling

To improve error handling, consider adding logging and exception handling around calls to the script. For example, a wrapper that runs mutintro.py as a subprocess:

import logging
import subprocess
import sys

# Timestamped log lines at INFO level and above
logging.basicConfig(level=logging.INFO, format='%(asctime)s - %(levelname)s - %(message)s')

try:
    # check=True raises CalledProcessError if mutintro.py exits with a non-zero status
    subprocess.run(["python", "mutintro.py", "--pdb_file", "HIV_Integrase.pdb", "--output_dir", "./output", "--mutations", "variant_list.csv", "--mode", "multiple"], check=True)
except subprocess.CalledProcessError as e:
    logging.error(f"Error occurred: {e}")
    sys.exit(1)

mutintro.py - Mutation Introduction Script Documentation

Table of Contents

Project Information

Overview

Usage

Arguments

Mutation Format

Main Functionalities

Class: `MutationIntro`

`mutant_processing(mutant_list)`

`individual_introduction(pdb_file, output_dir, mutant_data)`

`simultaneous_introduction(pdb_file, output_dir, mutant_data)`

`foldx_emin(foldx_exe, output_dir)`

FoldX Integration

Dependencies

Example

Prerequisites

Notes

Error Handling
