foldxana.py - FoldX Stability Analysis Documentation
Project Information
Title: The development of an automated computational workflow to prioritize potential resistance variants identified in HIV Integrase Subtype C
Author: Keaghan Brown (3687524) - MSc Bioinformatics Candidate
Supervisor: Ruben Cloete - Lecturer, SANBI
Institution: South African National Bioinformatics Institute, University of the Western Cape
Funding: Poliomyelitis Research Foundation and UWC Ada & Bertie Levenstein Bursary Programme
Overview
The foldxana.py script is a key component of the automated computational workflow for analyzing structural stability changes in HIV Integrase Subtype C. It uses the FoldX software to calculate the stability of wild-type and mutant protein structures.
The script outputs a detailed HTML report summarizing the ΔΔG changes induced by mutations, which provides insights into the potential impact of mutations on protein stability.
Usage
python foldxana.py --pdb_file path/to/original.pdb --output_dir path/to/mutants/
Arguments
--pdb_file: Path to the wild-type PDB structure.
--output_dir: Directory containing the mutated PDB files (file names must end in _auto.pdb).
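The two arguments above could be declared with argparse roughly as follows. This is a minimal sketch, not the script's actual parser: the function name build_parser and the help strings are assumptions, and only the flag names come from this document.

```python
import argparse


def build_parser():
    """Hypothetical parser mirroring the two documented flags."""
    parser = argparse.ArgumentParser(
        description="FoldX stability analysis (illustrative sketch)"
    )
    parser.add_argument("--pdb_file", required=True,
                        help="Path to the wild-type PDB structure")
    parser.add_argument("--output_dir", required=True,
                        help="Directory containing the mutated *_auto.pdb files")
    return parser


# Parse an explicit argument list so the sketch runs without a command line.
args = build_parser().parse_args(
    ["--pdb_file", "HIV_WT.pdb", "--output_dir", "./variants/"]
)
print(args.pdb_file, args.output_dir)
```

Marking both flags as required is a design choice made here so that a missing path fails early with a usage message; the real script may handle defaults differently.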
Outputs
- stability_index.html: Table showing:
- Wild-type structure energy
- Each variant structure energy
- Stability difference (ΔΔG)
Main Functionalities
Class: FoldXAna
foldx_stability(output_dir, pdb_file)
Automatically locates the FoldX executable and runs stability calculations:
- First for the wild-type PDB file
- Then for each mutant PDB file in the output directory
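The lookup and run order described above might be sketched like this. The helper names find_foldx and stability_commands are hypothetical, the rule "any non-.py file whose name starts with foldx" is an assumption about how the executable is recognised, and the --command, --pdb and --pdb-dir flags are assumed from the FoldX command-line documentation rather than taken from the script itself. The sketch only builds the commands; it does not run FoldX.

```python
from pathlib import Path


def find_foldx(search_dir="."):
    """Return the first file whose name starts with 'foldx' (assumed rule).

    Python files are skipped so that foldxana.py itself is never matched.
    """
    for path in sorted(Path(search_dir).iterdir()):
        name = path.name.lower()
        if path.is_file() and name.startswith("foldx") and path.suffix != ".py":
            return path
    return None


def stability_commands(foldx_exe, pdb_file, output_dir):
    """Build one Stability command per structure: wild type first, then mutants."""
    structures = [Path(pdb_file)] + sorted(Path(output_dir).glob("*_auto.pdb"))
    return [
        [str(foldx_exe), "--command=Stability",
         f"--pdb={s.name}", f"--pdb-dir={s.parent}"]
        for s in structures
    ]
```

Each returned list can be passed to subprocess.run(cmd, check=True) once the executable has been found; if find_foldx returns None, the caller should stop before building any command.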
stability_changes(output_dir, pdb_file)
Parses the resulting FoldX .fxout files and calculates:
- Stability of wild-type system
- Stability of each variant system
- ΔΔG = WT Stability - Variant Stability
The results are exported to stability_index.html with HTML table formatting.
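As a worked illustration of the calculation, the sketch below reads the second tab-separated value from a line of .fxout text and applies the formula stated above. The helper names read_energy and ddg are hypothetical, the tab-separated layout is an assumption about the output format, and the energy values are invented for illustration only.

```python
def read_energy(fxout_text):
    """Return the second tab-separated field of the first numeric line."""
    for line in fxout_text.splitlines():
        fields = line.strip().split("\t")
        if len(fields) < 2:
            continue
        try:
            return float(fields[1])
        except ValueError:
            continue  # header or other non-numeric line
    raise ValueError("no energy value found in FoldX output")


def ddg(wt_energy, variant_energy):
    """ΔΔG as defined in this document: WT stability minus variant stability."""
    return wt_energy - variant_energy


# Invented example values, not real FoldX results.
wt = read_energy("HIV_WT.pdb\t-12.5\t-300.0\n")
variant = read_energy("A128T_auto.pdb\t-10.0\t-295.0\n")
print(ddg(wt, variant))  # -2.5
```

Note that with this sign convention a negative ΔΔG corresponds to a variant with a higher (less favourable) energy than the wild type, which is the opposite sign to the more common mutant-minus-wild-type convention.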
Workflow Summary
- FoldX executable is located automatically within the current folder.
- Stability is calculated for WT and each variant using the Stability command.
- ΔΔG is computed and compiled into an HTML report.
Dependencies
- FoldX (executable must be in current directory)
- Python standard libraries
- pandas, numpy
Example
python foldxana.py --pdb_file HIV_WT.pdb --output_dir ./variants/
Output: stability_index.html
Prerequisites
- Ensure Python 3.x is installed and added to your system PATH.
- Install required libraries using pip install pandas numpy.
- Download and place the FoldX executable in the script's working directory.
Notes
- Ensure FoldX is downloaded and available in the script's working directory.
- All variant structures must be named using the format: VariantName_auto.pdb.
- Only the second value (energy) from FoldX output is used.
- HTML output is styled with basic CSS for readability.
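To make the report format concrete, here is a minimal sketch of a styled HTML table built with the standard library only. The script itself lists pandas as a dependency and may well generate the table that way; the function name render_report, the CSS rules, the column headings, and the example energies are all assumptions made for illustration.

```python
from html import escape

# Basic CSS, assumed for illustration; the script's real styling may differ.
CSS = "table{border-collapse:collapse} th,td{border:1px solid #999;padding:4px 8px}"


def render_report(wt_energy, variant_energies):
    """Return an HTML page with one row per variant.

    variant_energies maps a variant name to its FoldX energy.
    ΔΔG follows this document's definition: WT minus variant.
    """
    rows = []
    for name, energy in variant_energies.items():
        rows.append(
            f"<tr><td>{escape(name)}</td><td>{wt_energy:.2f}</td>"
            f"<td>{energy:.2f}</td><td>{wt_energy - energy:.2f}</td></tr>"
        )
    header = ("<tr><th>Variant</th><th>WT energy</th>"
              "<th>Variant energy</th><th>ΔΔG</th></tr>")
    return (f"<html><head><meta charset='utf-8'><style>{CSS}</style></head>"
            f"<body><table>{header}{''.join(rows)}</table></body></html>")


# Invented example values, not real FoldX results.
report = render_report(-12.5, {"A128T": -10.0})
```

The returned string can then be written to stability_index.html, for example with pathlib.Path("stability_index.html").write_text(report, encoding="utf-8").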
Error Handling
To improve error handling, consider implementing logging and exception handling in your script. For example:
import logging
import subprocess
import sys

logging.basicConfig(level=logging.INFO, format='%(asctime)s - %(levelname)s - %(message)s')

try:
    # Run the analysis; check=True raises CalledProcessError on a non-zero exit code
    subprocess.run(["python", "foldxana.py", "--pdb_file", "HIV_WT.pdb", "--output_dir", "./variants"], check=True)
except subprocess.CalledProcessError as e:
    logging.error(f"Error occurred: {e}")
    sys.exit(1)