Installation
Requirements
Following a self-imposed guideline, most tools written here for nanopore data, or bioinformatics in general, use as few third-party libraries as possible, aiming for core libraries only or for bundling all required files in the package.
In the case of fast5_fetcher.py and batch_tater.py, only core Python libraries are used. So as long as Python 2.7+ is present, everything should work with no extra steps.
There is one catch: everything is written primarily for use with Linux. Because macOS is Unix-based, there should be minimal issues running it there, so long as the GNU tools are installed (see below). Windows, however, may require more massaging.
SquiggleKit tools are deliberately not made executable, so that they can be used with varying Python environments on various operating systems. To make them executable, add a #! (shebang) line, such as #!/usr/bin/env python2.7, as the first line of each file, then add the SquiggleKit directory to the PATH variable in ~/.bashrc:
export PATH="$HOME/path/to/SquiggleKit:$PATH"
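The shebang step can also be scripted. The following is a minimal sketch and not part of SquiggleKit: the helper name make_executable is hypothetical, and setting the executable bit (the equivalent of chmod +x) is an assumed extra step. It is demonstrated on a throwaway file rather than on the real scripts.

```python
import os
import stat
import tempfile

# Interpreter line suggested in the text; adjust it to your own environment.
SHEBANG = "#!/usr/bin/env python2.7"


def make_executable(path, shebang=SHEBANG):
    """Prepend a shebang line if the file lacks one, then set the executable bits.

    Hypothetical helper for illustration only.
    """
    with open(path, "r") as handle:
        content = handle.read()
    if not content.startswith("#!"):
        with open(path, "w") as handle:
            handle.write(shebang + "\n" + content)
    mode = os.stat(path).st_mode
    os.chmod(path, mode | stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH)


# Demonstration on a temporary file, not on a real SquiggleKit script.
demo_dir = tempfile.mkdtemp()
demo_script = os.path.join(demo_dir, "demo_tool.py")
with open(demo_script, "w") as handle:
    handle.write("print('hello')\n")

make_executable(demo_script)

with open(demo_script) as handle:
    first_line = handle.readline().rstrip("\n")
print(first_line)
```

To apply it to the toolkit, call make_executable on each script in the cloned SquiggleKit directory. Calling it twice on the same file is safe, because a file that already starts with #! is left unchanged.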
Install
git clone https://github.com/Psy-Fer/SquiggleKit.git
pip install numpy h5py scikit-learn matplotlib
Quick start
fast5_fetcher
If using macOS without Homebrew, install Homebrew first (see the Homebrew installation instructions), then install gnu-tar with:
brew install gnu-tar
Basic use on a local computer
fastq
python fast5_fetcher.py -q my.fastq.gz -s sequencing_summary.txt.gz -i name.index.gz -o ./fast5
paf
python fast5_fetcher.py -p my.paf -s sequencing_summary.txt.gz -i name.index.gz -o ./fast5
flat
python fast5_fetcher.py -f my_flat.txt.gz -s sequencing_summary.txt.gz -i name.index.gz -o ./fast5
sequencing_summary.txt only
python fast5_fetcher.py -s sequencing_summary.txt.gz -i name.index.gz -o ./fast5
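The four invocations above differ only in the flag that selects reads. As a rough sketch for scripting them (the function name build_fetch_command and the MODE_FLAGS table are hypothetical and not part of SquiggleKit; only the flags -q, -p, -f, -s, -i and -o come from the commands above), the argument list can be assembled like this:

```python
# Flag for each read-selection mode, taken from the commands above.
MODE_FLAGS = {"fastq": "-q", "paf": "-p", "flat": "-f"}


def build_fetch_command(summary, index, out_dir, mode=None, reads=None,
                        script="fast5_fetcher.py"):
    """Return the argument list for one fast5_fetcher.py invocation.

    Hypothetical helper. With mode=None, only the sequencing summary
    is passed, as in the last command above.
    """
    cmd = ["python", script]
    if mode is not None:
        if mode not in MODE_FLAGS:
            raise ValueError("unknown mode: %r" % mode)
        if reads is None:
            raise ValueError("mode %r needs a reads file" % mode)
        cmd += [MODE_FLAGS[mode], reads]
    cmd += ["-s", summary, "-i", index, "-o", out_dir]
    return cmd


cmd = build_fetch_command("sequencing_summary.txt.gz", "name.index.gz", "./fast5",
                          mode="fastq", reads="my.fastq.gz")
print(" ".join(cmd))
```

The resulting list can be passed to subprocess.check_call(cmd) to run the tool. Building a list rather than a shell string avoids quoting problems with file names that contain spaces.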
SquigglePull
All raw data:
python SquigglePull.py -rv -p ~/data/test/reads/1/ -f all > data.tsv
Positional event data:
python SquigglePull.py -ev -p ./test/ -t 50,150 -f pos1 > data.tsv
SquigglePlot
Individual File full signal
python SquigglePlot.py -i ~/data/test.fast5
Plot all from top folder in green
python SquigglePlot.py -p ~/data/ --plot_colour green
Plot first 2000 data points of each read from signal file and save as a 300 dpi PDF
python SquigglePlot.py -s signals.tsv.gz --plot_colour teal -n 2000 --dpi 300 --no_show --save test.pdf --save_path ./test/plots/
segmenter
Stall identification
python segmenter.py -s signals.tsv.gz -ku -j 100 > signals_stall_segments.tsv
MotifSeq
(see full requirements for MLPY installation instructions)
Nanopore adapter identification
python MotifSeq.py -s signals.tsv.gz --segs signals_stall_segments.tsv -a adapter.model -t 120 -d 120 > signals_adapters.tsv
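The MotifSeq command reads the segment file that the segmenter command writes, so the two form a small pipeline. The sketch below shows one way to chain them from Python. It is an illustration under assumptions: run_to_file and adapter_pipeline are hypothetical names, and the only flags used are those in the two commands above. The actual run is left commented out so the block can be tried without the SquiggleKit scripts present.

```python
import subprocess
import sys


def run_to_file(cmd, out_path):
    """Run cmd and write its standard output to out_path (like `cmd > out_path`)."""
    with open(out_path, "w") as out:
        subprocess.check_call(cmd, stdout=out)


def adapter_pipeline(signals, segments_out, adapters_out, model="adapter.model"):
    """Return (command, output file) steps in order: stall segmentation, then adapter search."""
    return [
        (["python", "segmenter.py", "-s", signals, "-ku", "-j", "100"],
         segments_out),
        (["python", "MotifSeq.py", "-s", signals, "--segs", segments_out,
          "-a", model, "-t", "120", "-d", "120"],
         adapters_out),
    ]


steps = adapter_pipeline("signals.tsv.gz",
                         "signals_stall_segments.tsv",
                         "signals_adapters.tsv")
for cmd, out_path in steps:
    print(" ".join(cmd) + " > " + out_path)
    # Uncomment once the SquiggleKit scripts are in the working directory:
    # run_to_file(cmd, out_path)
```

Because check_call raises an error on a non-zero exit status, the adapter search does not start if segmentation fails.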
Full requirements
fast5_fetcher.py:
- core python libraries
SquigglePull.py:
- numpy
- h5py
- sklearn
pip install numpy h5py scikit-learn
SquigglePlot.py:
- numpy
- matplotlib
- h5py
pip install numpy h5py matplotlib
segmenter.py:
- numpy
- matplotlib
- h5py
- sklearn
pip install numpy h5py scikit-learn matplotlib
MotifSeq.py:
- numpy
- h5py
- sklearn
- matplotlib
- mlpy 3.5.0 (don't use pip for this)
pip install numpy h5py scikit-learn matplotlib
Installing mlpy:
- Download the Files
- Install Instructions
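After installing, it can help to confirm which tools have everything they need. The following is a small sketch rather than an official SquiggleKit utility. The per-tool module lists are copied from the requirements above, while the names TOOL_MODULES, missing_modules and report are hypothetical. It only checks that each module imports and does not check versions such as mlpy 3.5.0.

```python
import importlib

# Import names required by each tool, as listed above.
# scikit-learn is imported as "sklearn"; fast5_fetcher.py needs only core libraries.
TOOL_MODULES = {
    "fast5_fetcher.py": [],
    "SquigglePull.py": ["numpy", "h5py", "sklearn"],
    "SquigglePlot.py": ["numpy", "matplotlib", "h5py"],
    "segmenter.py": ["numpy", "matplotlib", "h5py", "sklearn"],
    "MotifSeq.py": ["numpy", "h5py", "sklearn", "matplotlib", "mlpy"],
}


def missing_modules(names):
    """Return the module names from `names` that cannot be imported."""
    missing = []
    for name in names:
        try:
            importlib.import_module(name)
        except Exception:
            # A broken install can raise errors other than ImportError;
            # either way the module is unusable.
            missing.append(name)
    return missing


def report(tool_modules=TOOL_MODULES):
    """Map each tool to the list of its required modules that failed to import."""
    return {tool: missing_modules(mods) for tool, mods in tool_modules.items()}


for tool, missing in sorted(report().items()):
    status = "ok" if not missing else "missing " + ", ".join(missing)
    print("%s: %s" % (tool, status))
```

Run it with the same Python interpreter that will run the tools, since each environment has its own set of installed packages.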