RnaChipIntegrator is a utility that performs integrated analyses of RNA-Seq and ChIP-Seq data, identifying the nearest ChIP peaks to each transcript, and vice versa. It can also be used with any set of genomic features (e.g. canonical genes, CpG islands) or expression data (e.g. microarrays).
Download the latest version:
See the change log for the most recent changes, and the downloads page for all available versions.
This release addresses a bug that was discovered in the previous version of the code (v0.3.3), which meant that peaks within transcripts or promoter regions on the negative strand might sometimes be incorrectly flagged as not within those regions.
This bug did not affect the reporting of the peaks and transcripts.
A number of installation options are available:
You can use Python's setuptools to install the RnaChipIntegrator code. To do this, download the archive (either .tar.gz or .zip), unpack it and install as follows:
$ cd RnaChipIntegrator-0.4.0
$ sudo python setup.py install
Alternatively if you're using pip (see www.pip-installer.org/en/latest/index.html) to manage your Python installations then you can install as follows:
$ sudo pip install RnaChipIntegrator.tar.gz
If you don't want to do a system install then you can just unpack the archive and run the Python code directly using:
$ /path/to/RnaChipIntegrator.py [options] ...
(Note you will need to add the '.py' extension to the program name when running the examples below.)
See the INSTALL document for more detailed information about how to install and uninstall the utility.
General usage syntax is:
$ RnaChipIntegrator [options] <rna-data-file> <chip-data-file>
Use -h to produce a list of options and descriptions of their functions:
$ RnaChipIntegrator -h
Specific examples using example data included in the downloads:
$ RnaChipIntegrator --window=130000 ExpressionData.txt ChIP_edges.txt
$ RnaChipIntegrator --window=130000 ExpressionData.txt ChIP_peaks.txt
See the manual for more detailed information about the analyses.
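As a rough illustration of what "nearest ChIP peaks to each transcript" can mean, the sketch below ranks peaks by distance from a transcript's strand-aware start. This is not the program's actual algorithm: the function names, the use of the peak midpoint, and the `max_distance` cutoff are all assumptions made for the example, and the manual remains the reference for how the real analyses work.

```python
# Illustrative sketch only; names and distance rules are assumptions,
# not RnaChipIntegrator's implementation.

def strand_aware_start(feature):
    """Return the 5' end of a feature: 'start' on +, 'end' on -."""
    chrom, start, end, strand = feature
    return start if strand == "+" else end

def nearest_peaks(feature, peaks, max_distance):
    """List (distance, peak) pairs on the same chromosome, closest first.

    Distance is measured from the feature's strand-aware start to the
    midpoint of the peak (an assumption for this example).
    """
    chrom = feature[0]
    site = strand_aware_start(feature)
    hits = []
    for peak in peaks:
        p_chrom, p_start, p_end = peak
        if p_chrom != chrom:
            continue  # peaks on other chromosomes are never "near"
        midpoint = (p_start + p_end) // 2
        distance = abs(midpoint - site)
        if distance <= max_distance:
            hits.append((distance, peak))
    hits.sort()
    return hits

# A negative-strand transcript, so its start is taken from the 'end' column
transcript = ("chr1", 1000, 2000, "-")
peaks = [
    ("chr1", 900, 901),
    ("chr1", 2100, 2101),
    ("chr2", 2000, 2001),
    ("chr1", 500000, 500001),
]
hits = nearest_peaks(transcript, peaks, max_distance=130000)
for distance, peak in hits:
    print("%s:%d-%d\tdistance=%d" % (peak[0], peak[1], peak[2], distance))
```

Handling the strand explicitly matters here: on the negative strand the transcript's start is its `end` coordinate, which is the kind of detail the negative-strand bug fix mentioned earlier relates to.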
The expression data (RNA-seq) file must be a tab-delimited file with 5 columns of data for each genomic feature (one per line):
ID chr start end strand
ID is a name used to identify the genomic feature in the output. chr is the chromosome name, start and end define the limits of the feature, and strand must be either + or -. Genes in between a reported gene and the ChIP site are also reported.
Optionally there can be a 6th column, indicating whether the gene was differentially expressed (= 1) or not (= 0).
(See the example ExpressionData.txt file.)
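The format described above can be checked before running the program with a short script. The following is a minimal sketch, not code from RnaChipIntegrator: the `Feature` record and `read_features` function are hypothetical names, and it assumes the file has no header line.

```python
from collections import namedtuple

# Hypothetical record type for this example only
Feature = namedtuple("Feature", "id chrom start end strand diff_expressed")

def read_features(lines):
    """Parse tab-delimited lines with 5 columns (plus an optional 6th)."""
    features = []
    for lineno, line in enumerate(lines, 1):
        line = line.rstrip("\r\n")
        if not line:
            continue  # ignore blank lines
        fields = line.split("\t")
        if len(fields) not in (5, 6):
            raise ValueError("line %d: expected 5 or 6 columns, got %d"
                             % (lineno, len(fields)))
        name, chrom, start, end, strand = fields[:5]
        if strand not in ("+", "-"):
            raise ValueError("line %d: strand must be + or -" % lineno)
        flag = None  # None means the optional 6th column is absent
        if len(fields) == 6:
            if fields[5] not in ("0", "1"):
                raise ValueError("line %d: flag must be 0 or 1" % lineno)
            flag = (fields[5] == "1")
        features.append(Feature(name, chrom, int(start), int(end),
                                strand, flag))
    return features

# Made-up example rows, not taken from ExpressionData.txt
example = "gene1\tchr1\t1000\t2000\t+\t1\ngene2\tchr1\t4000\t5000\t-\t0\n"
features = read_features(example.splitlines())
for f in features:
    print(f)
```

To check a real file, pass an open file object instead of the example lines, e.g. `read_features(open("ExpressionData.txt"))`.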
The ChIP-seq data file must be a tab-delimited file with 3 columns of data for each ChIP peak (one per line):
chr start end
chr is the chromosome name (must match those in the expression data file), and start and end define the ChIP peaks - these can either be summits (in which case end - start = 1), or regions (with start and end indicating the extent).
(See the example ChIP_edges.txt and ChIP_peaks.txt files.)
Note that different analyses will be selected depending on whether the ChIP peaks are defined as summits or regions.
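The summit/region distinction can be illustrated with a small script. This is only a sketch under stated assumptions: `read_peaks` and `peak_type` are hypothetical helpers, and the rule used here (summits only if every peak satisfies end - start = 1) is a guess at a sensible test rather than a description of how RnaChipIntegrator itself decides.

```python
def read_peaks(lines):
    """Parse tab-delimited 'chr start end' lines into tuples."""
    peaks = []
    for lineno, line in enumerate(lines, 1):
        line = line.rstrip("\r\n")
        if not line:
            continue  # ignore blank lines
        fields = line.split("\t")
        if len(fields) != 3:
            raise ValueError("line %d: expected 3 columns, got %d"
                             % (lineno, len(fields)))
        peaks.append((fields[0], int(fields[1]), int(fields[2])))
    return peaks

def peak_type(peaks):
    """Return 'summits' if every peak is 1 bp wide, otherwise 'regions'.

    Assumed rule for this example; the real program may differ.
    """
    if peaks and all(end - start == 1 for _, start, end in peaks):
        return "summits"
    return "regions"

# Made-up example rows, not taken from the bundled example files
summit_lines = ["chr1\t100\t101", "chr2\t500\t501"]
region_lines = ["chr1\t100\t250", "chr2\t500\t501"]
print(peak_type(read_peaks(summit_lines)))
print(peak_type(read_peaks(region_lines)))
```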
RnaChipIntegrator was written for Python 2.7, but has also been used with Python 2.4. It uses the xlwt, xlrd and xlutils Python libraries to write its XLS output files. If these are not available then the program will still run but won't produce an XLS file.
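To find out in advance whether an XLS file can be produced, you can test whether the three libraries are importable. The snippet below is a minimal sketch of such a check; `xls_support_available` is a hypothetical helper written for this example and is not part of RnaChipIntegrator.

```python
import importlib

def xls_support_available(modules=("xlwt", "xlrd", "xlutils")):
    """Return True only if every named module can be imported."""
    for name in modules:
        try:
            importlib.import_module(name)
        except ImportError:
            return False
    return True

if xls_support_available():
    print("xlwt, xlrd and xlutils found: XLS output should be possible")
else:
    print("One or more XLS libraries missing: expect no XLS output")
```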
Artistic Licence 2.0
Ian Donaldson, Leo Zeef and Peter Briggs, University of Manchester, Faculty of Life Sciences Bioinformatics Core Facility
Peter Briggs (peter.briggs [at] manchester.ac.uk)
See the source code on GitHub: https://github.com/fls-bioinformatics-core/RnaChipIntegrator
You can clone the project with Git by running:
$ git clone git://github.com/fls-bioinformatics-core/RnaChipIntegrator