Parallel Analysis Using PBS Job Scheduler
This tutorial is specific to the PBS Job Scheduler but can be used as a template and adapted to work with other job scheduling systems.
Dynamically generate an example PBS script named parallel_analysis_using_PBS_example.pbs:
network_size = 10
# Define PBS script
bash_lines = '\n'.join([
    '#!/bin/bash',
    # set project name
    '#PBS -P ProjectName',
    # set job name
    '#PBS -N JobName',
    # choose number of cores and memory
    '#PBS -l select=1:ncpus=1:mem=1GB',
    # set walltime hh:mm:ss
    '#PBS -l walltime=01:00:00',
    # set job array range to cover all targets (0 to network_size - 1)
    '#PBS -J 0-{}'.format(network_size - 1),
    # load Python
    'module load python/3.7.3',
    # if necessary, activate the local environment where IDTxl is installed
    'source /ProjectName/idtxl_env/bin/activate',
    # run the analysis on a single target, passing the job array index
    'python analyse_single_target.py $PBS_ARRAY_INDEX'
])
# Generate and save the PBS script file
bash_script_name = 'parallel_analysis_using_PBS_example.pbs'
with open(bash_script_name, 'w', newline='\n') as bash_file:
    bash_file.write(bash_lines)
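Since the tutorial is meant to serve as a template for other setups, the generation step can be wrapped in a small helper so the job-array range always matches the number of targets (targets are numbered 0 to network_size - 1). The function name make_pbs_script is illustrative and not part of IDTxl:

```python
def make_pbs_script(network_size, script_name):
    """Write a PBS job-array script with one job per target."""
    lines = [
        '#!/bin/bash',
        '#PBS -P ProjectName',
        '#PBS -N JobName',
        '#PBS -l select=1:ncpus=1:mem=1GB',
        '#PBS -l walltime=01:00:00',
        # one array job per target, indexed 0 .. network_size - 1
        '#PBS -J 0-{}'.format(network_size - 1),
        'module load python/3.7.3',
        'source /ProjectName/idtxl_env/bin/activate',
        'python analyse_single_target.py $PBS_ARRAY_INDEX',
    ]
    with open(script_name, 'w', newline='\n') as f:
        f.write('\n'.join(lines))

make_pbs_script(10, 'parallel_analysis_using_PBS_example.pbs')
```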
The job array can be submitted directly on the cluster from the command line using the command qsub parallel_analysis_using_PBS_example.pbs. It is also possible to submit jobs dynamically from Python as follows:
from subprocess import call
call('qsub {}'.format(bash_script_name), shell=True)
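As an alternative to shell=True, the command can be built as an argument list, which avoids shell quoting issues and makes it easy to capture the job ID that qsub prints. The helper name build_qsub_command is illustrative, not part of IDTxl, and the actual submission is commented out so the sketch also runs off the cluster:

```python
from subprocess import run

def build_qsub_command(script_name):
    # Illustrative helper: construct the qsub command as an argument list.
    return ['qsub', script_name]

cmd = build_qsub_command('parallel_analysis_using_PBS_example.pbs')
# On the cluster, submit the job and capture the job ID printed by qsub:
# job_id = run(cmd, capture_output=True, text=True).stdout.strip()
```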
The PBS script will call the Python script analyse_single_target.py multiple times (once for each target). On each call, the job array index is passed as an argument, identifying the target to analyse. This is a template for the Python script analyse_single_target.py:
# analyse_single_target.py
import sys
from idtxl.multivariate_te import MultivariateTE
from idtxl.data import Data
import pickle
# Read parameters from shell call
target_id = int(sys.argv[1])
# Load time series
time_series = ...
# Initialise Data object and set dim_order to reflect your data
dat = Data(time_series, dim_order='psr')
# Initialise analysis object and define settings
network_analysis = MultivariateTE()
settings = ...
# Run analysis
res = network_analysis.analyse_single_target(settings, dat, target_id)
# Save results using pickle
path = 'my_directory/res.{}.pkl'.format(target_id)
with open(path, 'wb') as result_file:
    pickle.dump(res, result_file)
The single target results can then be combined as shown in the Combine Single Target tutorial.