TrajStat Documentation

Overview

TrajStat is a Python script designed for analyzing molecular dynamics (MD) trajectories using the MDAnalysis library. It provides a comprehensive suite of analysis methods, including RMSD, RMSF, radius of gyration, PCA, hydrogen bonds, salt bridges, and more.

The script is intended as the analysis stage of a molecular dynamics workflow: it reads topology and trajectory files and writes numerical results and plots that describe the structural dynamics of each system.

Usage

The script can be executed from the command line with the following arguments:

python trajstat.py --systems <path_to_systems> --output_dir <path_to_output_directory>

Replace <path_to_systems> with the directory containing the trajectory files and <path_to_output_directory> with the directory where the results will be stored.
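
The two flags above could be declared with argparse roughly as follows. This is a minimal sketch, not TrajStat's actual parser: only the flag names --systems and --output_dir come from the usage line, while the helper name build_parser, the help strings, and treating both flags as required are assumptions made for illustration.

```python
import argparse


def build_parser():
    """Build a parser for the two documented flags (hypothetical helper)."""
    parser = argparse.ArgumentParser(
        prog="trajstat.py",
        description="Analyze MD trajectories (illustrative sketch).",
    )
    # Assumption: both flags are mandatory; the real script may define defaults.
    parser.add_argument("--systems", required=True,
                        help="directory containing the trajectory files")
    parser.add_argument("--output_dir", required=True,
                        help="directory where the results are stored")
    return parser


# Parse an explicit argument list so the example runs without a command line.
args = build_parser().parse_args(["--systems", "systems", "--output_dir", "results"])
print(args.systems, args.output_dir)
```

Passing an explicit list to parse_args keeps the example runnable anywhere; a real script would call parse_args() with no arguments to read sys.argv.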

Features

  • RMSD Calculation: Calculates the root mean square deviation (RMSD) for proteins and nucleic acids.
  • RMSF Calculation: Computes the root mean square fluctuation (RMSF) for protein chains.
  • Radius of Gyration: Determines the compactness of the system.
  • PCA: Performs principal component analysis (PCA) on backbone atoms.
  • Salt Bridges: Identifies ionic interactions between acidic and basic residues.
  • Hydrogen Bonds: Analyzes hydrogen bonds between proteins, nucleic acids, and other molecules.
  • Visualization: Generates plots for RMSD, RMSF, PCA, salt bridges, and hydrogen bonds.

Functions

rmsd_calc

Calculates RMSD for proteins.

rmsd_calc(top_file, traj_file)
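
To make the quantity concrete, the RMSD between a reference structure and one frame can be written in a few lines of plain Python. This is a sketch of the formula only and not the body of rmsd_calc, which relies on MDAnalysis: the function name rmsd and the toy coordinates are assumptions, and the superposition step that usually precedes an RMSD calculation is omitted.

```python
import math


def rmsd(coords_a, coords_b):
    """RMSD between two equally sized lists of (x, y, z) coordinates."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate sets must be non-empty and the same length")
    # Sum of squared per-atom displacements between the two structures.
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared / len(coords_a))


# Toy data: every atom is displaced by exactly 1 Å along z.
reference = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
frame = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]
print(rmsd(reference, frame))
```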

rmsf_calc

Calculates RMSF for protein chains.

rmsf_calc(top_file, traj_file, start_fr)
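
The RMSF of an atom is the root mean square deviation of its position from its own time-averaged position. The sketch below illustrates that definition in plain Python; it is not the implementation of rmsf_calc. It assumes that start_fr is the index of the first frame included in the average (earlier frames being discarded, for example as equilibration), which the signature suggests but the documentation does not state. The function name rmsf and the toy frames are hypothetical.

```python
import math


def rmsf(frames, start_fr=0):
    """Per-atom RMSF over frames[start_fr:]; each frame is a list of (x, y, z)."""
    used = frames[start_fr:]
    if not used:
        raise ValueError("no frames left after start_fr")
    n_frames = len(used)
    n_atoms = len(used[0])
    result = []
    for i in range(n_atoms):
        # Mean position of atom i over the selected frames.
        mean = [sum(f[i][k] for f in used) / n_frames for k in range(3)]
        # Mean squared displacement of atom i from that mean position.
        msd = sum(
            sum((f[i][k] - mean[k]) ** 2 for k in range(3)) for f in used
        ) / n_frames
        result.append(math.sqrt(msd))
    return result


frames = [
    [(9.0, 9.0, 9.0), (5.0, 5.0, 5.0)],  # frame 0: skipped by start_fr=1
    [(0.0, 0.0, 0.0), (1.0, 1.0, 1.0)],
    [(2.0, 0.0, 0.0), (1.0, 1.0, 1.0)],
]
print(rmsf(frames, start_fr=1))
```

In this toy data the first atom moves by 1 Å on either side of its mean position and the second atom does not move at all.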

rgyr_calc

Calculates the radius of gyration for proteins.

rgyr_calc(top_file, traj_file, start_fr)
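
The radius of gyration is the mass-weighted root mean square distance of the atoms from their centre of mass, so smaller values indicate a more compact structure. The following plain-Python sketch shows the formula for a single frame; the function name radius_of_gyration and the toy coordinates and masses are assumptions, and rgyr_calc itself may compute the value differently through MDAnalysis.

```python
import math


def radius_of_gyration(coords, masses):
    """Mass-weighted radius of gyration for one frame of (x, y, z) coordinates."""
    total_mass = sum(masses)
    # Centre of mass, one component at a time.
    com = [
        sum(m * c[k] for m, c in zip(masses, coords)) / total_mass
        for k in range(3)
    ]
    # Mass-weighted sum of squared distances from the centre of mass.
    weighted = sum(
        m * sum((c[k] - com[k]) ** 2 for k in range(3))
        for m, c in zip(masses, coords)
    )
    return math.sqrt(weighted / total_mass)


# Two equal masses placed 1 Å either side of the origin.
coords = [(-1.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
masses = [12.0, 12.0]
print(radius_of_gyration(coords, masses))
```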

pca_calc

Performs PCA on backbone atoms.

pca_calc(top_file, traj_file, start_fr)

saltbridges

Identifies salt bridges between acidic and basic residues.

saltbridges(top_file, traj_file)
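
A common way to detect a salt bridge is a distance cutoff between a side-chain oxygen of an acidic residue and a side-chain nitrogen of a basic residue. The sketch below illustrates that criterion in plain Python. The function name find_salt_bridges, the atom labels, and the 4.0 Å cutoff are assumptions for illustration; the criterion and cutoff used by saltbridges are not stated in this documentation.

```python
import math


def find_salt_bridges(acidic_oxygens, basic_nitrogens, cutoff=4.0):
    """Return (acid_label, base_label, distance) for pairs within cutoff (Å)."""
    pairs = []
    for acid_label, acid_xyz in acidic_oxygens:
        for base_label, base_xyz in basic_nitrogens:
            distance = math.dist(acid_xyz, base_xyz)
            if distance <= cutoff:
                pairs.append((acid_label, base_label, distance))
    return pairs


# Hypothetical atoms: only ASP12 and LYS55 are close enough to pair.
acidic = [("ASP12:OD1", (0.0, 0.0, 0.0)), ("GLU40:OE2", (20.0, 0.0, 0.0))]
basic = [("LYS55:NZ", (3.0, 0.0, 0.0)), ("ARG90:NH1", (20.0, 10.0, 0.0))]
for acid, base, dist in find_salt_bridges(acidic, basic):
    print(f"{acid} - {base}: {dist:.2f} Å")
```

Running the same check on every frame and counting how often each pair appears gives an occupancy per salt bridge.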

hbond_calc

Analyzes hydrogen bonds between proteins and other molecules.

hbond_calc(top_file, traj_file, start_fr)
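
Hydrogen bonds are usually identified geometrically, by requiring a short donor-acceptor distance and a nearly linear donor-hydrogen-acceptor angle. The sketch below shows such a test for one donor, hydrogen, and acceptor triple in plain Python. The function name is_hydrogen_bond and the cutoffs (3.0 Å and 150 degrees) are assumptions chosen for illustration; the criteria applied by hbond_calc are not given in this documentation.

```python
import math


def is_hydrogen_bond(donor, hydrogen, acceptor,
                     max_da_dist=3.0, min_dha_angle=150.0):
    """Geometric hydrogen-bond test on three (x, y, z) positions in Å."""
    if math.dist(donor, acceptor) > max_da_dist:
        return False
    # Vectors from the hydrogen to the donor and to the acceptor.
    hd = [d - h for d, h in zip(donor, hydrogen)]
    ha = [a - h for a, h in zip(acceptor, hydrogen)]
    norm = math.hypot(*hd) * math.hypot(*ha)
    if norm == 0.0:
        return False  # degenerate geometry: two atoms coincide
    dot = sum(u * v for u, v in zip(hd, ha))
    # Clamp to [-1, 1] to guard against floating-point round-off.
    cos_angle = max(-1.0, min(1.0, dot / norm))
    angle = math.degrees(math.acos(cos_angle))
    return angle >= min_dha_angle


donor, hydrogen = (0.0, 0.0, 0.0), (1.0, 0.0, 0.0)
print(is_hydrogen_bond(donor, hydrogen, (2.9, 0.0, 0.0)))  # linear and close
print(is_hydrogen_bond(donor, hydrogen, (1.0, 1.9, 0.0)))  # close but bent
```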

Dependencies

The script requires the following Python libraries:

  • MDAnalysis
  • Pandas
  • Matplotlib
  • Seaborn
  • Plotly
  • Numpy

Output

The script generates the following output files:

  • CSV files containing analysis results (e.g., RMSD, RMSF, PCA).
  • Plots in TIFF or PNG format for visualization.
  • Text files with statistical summaries.

Example

To analyze trajectories in the systems folder and save results in the output folder:

python trajstat.py --systems /path/to/systems --output_dir /path/to/output

Prerequisites

  • Ensure Python 3.x is installed and added to your system PATH.
  • Install required libraries using pip install MDAnalysis pandas matplotlib seaborn plotly numpy.
  • Ensure trajectory files are in a compatible format (e.g., DCD, PDB).
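
Before a long analysis run, it can help to confirm that the required libraries are importable. The following is a small sketch using only the standard library; the helper name missing_packages is hypothetical, and the list simply repeats the packages named in the pip command above.

```python
import importlib.util

REQUIRED = ["MDAnalysis", "pandas", "matplotlib", "seaborn", "plotly", "numpy"]


def missing_packages(names):
    """Return the names that cannot be found on the current import path."""
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = missing_packages(REQUIRED)
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All required packages are importable.")
```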

Notes

  • Ensure trajectory files and topology files are correctly paired.
  • Use consistent naming conventions for input files to avoid errors.
  • Output plots are saved in high-resolution formats for publication purposes.
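
One way to check the pairing of topology and trajectory files before running the analysis is to match them by file stem. The sketch below assumes a naming convention in which each system has a .pdb topology and a .dcd trajectory with the same base name; that convention, the helper name pair_inputs, and the example file names are all hypothetical, and TrajStat may locate its inputs differently.

```python
import tempfile
from pathlib import Path


def pair_inputs(systems_dir, top_ext=".pdb", traj_ext=".dcd"):
    """Pair topology and trajectory files that share a file stem."""
    systems_dir = Path(systems_dir)
    tops = {p.stem: p for p in systems_dir.glob(f"*{top_ext}")}
    trajs = {p.stem: p for p in systems_dir.glob(f"*{traj_ext}")}
    # Stems present in both sets form valid pairs.
    paired = {
        stem: (tops[stem], trajs[stem])
        for stem in sorted(tops.keys() & trajs.keys())
    }
    # Stems present in only one set are reported so they can be fixed.
    unpaired = sorted(tops.keys() ^ trajs.keys())
    return paired, unpaired


# Demonstration on empty placeholder files in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("sysA.pdb", "sysA.dcd", "sysB.pdb"):
        (Path(tmp) / name).touch()
    paired, unpaired = pair_inputs(tmp)
    print("paired:", list(paired), "unpaired:", unpaired)
```

Here sysA is reported as a complete pair, while sysB is flagged because it has a topology but no trajectory.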

Error Handling

To improve error handling, consider implementing logging and exception handling in your script. For example:

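import logging
import sys

logging.basicConfig(level=logging.INFO, format='%(asctime)s - %(levelname)s - %(message)s')

try:
    # Example function call; assumes rmsd_calc is defined in (or imported into) this script
    rmsd_calc("topology.pdb", "trajectory.dcd")
except Exception as e:
    # Log the failure and exit with a non-zero status so callers can detect it
    logging.error(f"Error occurred: {e}")
    sys.exit(1)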

Author

This script was developed for molecular dynamics trajectory analysis as part of the MSc Bioinformatics project by Keaghan Brown.