# Analysis workflow¶

Here we present a full pipeline example for running CoCoNet from the raw reads to final bins. We assume that you have paired-end illumina reads available in the data/ directory. Each sample is named with the convention <sample>_R{1,2}.fastq.gz.

## Trimming¶

The first step consists in quality trimming and filtering your reads. Many tools can be used for this step, such as fastp or trimmomatic. Below is a trimming example with fastp. Note that the parameters need to be tuned to your specific case. FastQC and MultiQC can be very useful in that regard.

#!/usr/bin/env bash

# SAMPLES is an environment variable containing the sample names
# (e.g.: export SAMPLES="sample1 sample2")

for sample in $SAMPLES; do fastp \ -i data/${sample}_R1.fastq.gz -I data/${sample}_R2.fastq.gz \ -o${sample}-trimmed_R1.fastq.gz -O {sample}-trimmed_R2.fastq.gz done ## Assembly¶ To assemble your reads, you have many choices. One of the most accurate for metagenomics is metaSPAdes. However if you have a lot of samples and/or high coverage, metaSPAdes will require a significant amount of time and memory, in which case Megahit can be a good alternative. # combine all samples cat data/*-trimmed_R1.fastq.gz > forward.fastq.gz cat data/*-trimmed_R2.fastq.gz > reverse.fastq.gz # run metaspades metaspades.py \ -1 forward.fastq.gz -2 reverse.fastq.gz \ -o assembly-outputs ## Coverage¶ To compute contig coverage, we can align the trimmed reads on the assembled contigs with tools such as bwa or bowtie2: mkdir coverage # Build index bwa index -p coverage/index assembly-outputs/scaffolds.fasta for sample inSAMPLES; do
bwa mem -p coverage/index data/${sample}-trimmed_R*.fastq.gz \ | samtools view -bh \ | samtools sort -o coverage/${sample}.bam
samtools index coverage/${sample}.bam coverage/${sample}.bai
done

## Binning¶

You can finally use CoCoNet and bin your contigs:

coconet run --fasta assembly-outputs/scaffolds.fasta --bam coverage/*.bam --output binning

The binning assignment are then available in the file binning/bins-*.csv.