How the new Meta-CAMP pipeline handles everything from error correction to MAG binning seamlessly inside Via Foundry
Imagine you’re flying from Lima to Florence. No direct flight means multiple stopovers. At each one, you drag your suitcases through terminals, stand in long security lines, unpack for inspection, and recheck your bags. It’s slow, stressful, and exhausting.
Now, compare that to a multi-leg itinerary where your baggage is checked through to the final destination: no carrying bags, no rechecking, no delays. You enjoy your flights, arrive at your destination, and collect luggage that has arrived safely.
Historically, metagenomics analysis was like the first trip. Researchers had to navigate a fragmented landscape, hopping between disconnected tools, manually transferring data, and wrestling with technical obstacles along the way.
Today, we have the luxury of the second trip, with a pipeline that changes everything: Meta-CAMP. It connects all essential processes, unifying metagenomic analysis from error correction to annotated MAGs within a single, reproducible workflow. It manages your data automatically across modules, removes repetitive work, and lets you focus on results instead of technical hurdles.
What Is Meta-CAMP?
Meta-CAMP is a pipeline built for researchers working with real-world metagenomic sequencing data. This includes studies focused on human and mouse microbiomes such as gut, skin, or saliva, as well as environmental samples like soil, wastewater, marine ecosystems, and everything in between. It also supports preclinical and translational research in pharmaceutical and biotech settings, along with comparative analyses of microbial diversity across hosts, locations, or timepoints.
Example use cases include monitoring microbiome shifts in disease models, characterizing soil microbiota for sustainable agriculture, and identifying biomarkers in environmental samples. Meta-CAMP is designed to support these workflows from start to finish, offering flexibility, reproducibility, and ease of use for teams working across diverse research contexts.
Meta-CAMP is based on the open-source pipeline developed by the MetaSUB Consortium, and is now available on Via Foundry. It has already been tested at scale, powering the MetaSUB Consortium’s global initiative to analyze thousands of urban microbiome samples. Its effectiveness is detailed in the 2023 bioRxiv preprint, “A Modular, End-to-End Workflow for Urban Metagenomics” (Mak et al., 2023).
Why Meta-CAMP Changes the Game for Metagenomics
The Problem with Disconnected Tools
Running a metagenomics workflow manually, tool by tool, is like trying to reach Florence without flying at all: piecing together buses, ferries, and trains across continents.
Without a proper bioinformatics pipeline, metagenomics workflows are fragmented and labor-intensive. Researchers switch between tools by hand, handling file conversions, data transfers, and technical hurdles at every stage. There is no built-in reproducibility, versioning, or tracking, which undermines the consistency and traceability of results. Visualizations often require separate tools or custom scripts, adding yet another layer of complexity.
What Sets Meta-CAMP Apart from Other Pipelines
Other pipelines might get you on a plane, but you are still left figuring out all the transfers. Some pipelines cover only portions of the workflow, leaving gaps that require manual fixes.
Meta-CAMP integrates essential steps such as error correction, host removal, taxonomy, binning, and quality control into one streamlined process. Unlike solutions that limit flexibility or lack collaboration features, Meta-CAMP allows users to choose between tools such as Kraken2, Bracken, and MetaPhlAn, while maintaining a centralized workspace that supports teamwork and version control.
Some steps in Meta-CAMP are highly parameter- and database-driven, so generating accurate results may require iteration and expertise. Via Foundry makes that process easier by letting users run multiple tools and versions in parallel, compare outputs, and refine workflows, all in one place.
The Full Itinerary: What Meta-CAMP Automates for You
Once your data is checked in, Meta-CAMP carries it through each leg of the itinerary without manual transfers or tool-hopping. The pipeline runs as a modular, containerized workflow inside Via Foundry, integrating essential steps from raw reads to annotated genomes in one reproducible environment. Here’s how it works:
1. Set Up Your Run
Every analysis begins with defining the environment and inputs. Researchers upload raw sequencing reads and accompanying metadata, either manually or from cloud storage platforms such as Google Cloud or Amazon S3. Host genomes can be selected from human, mouse, or a custom option. Adapter sequences are also configurable, using built-in defaults or user-provided files.
Workflow settings, including optional outputs like assembly statistics, can be adjusted before launch. No scripting is required.
2. Clean and Filter the Data
This stage includes both the Short Read Error Correction and Host Removal modules, and prepares the data for downstream analysis. Meta-CAMP performs all early-stage cleaning steps automatically, including:
- Adapter trimming
- Removal of low-quality reads
- Host genome decontamination using a selected reference (human, mouse, or custom)
- Error correction using Tadpole or BayesHammer
The pipeline ensures that only high-confidence, non-host reads progress to the next stage. All filtering is handled automatically within Via Foundry, removing the need for manual file management or format conversions.
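To make the "removal of low-quality reads" step concrete, here is a minimal Python sketch of a mean-quality filter. It is an illustration only, not Meta-CAMP's implementation: the `FastqRecord` type, the function names, and the thresholds (`min_mean_q=20`, `min_length=50`) are assumptions chosen for the example, and it assumes Phred+33 quality encoding.

```python
from typing import Iterable, Iterator, NamedTuple


class FastqRecord(NamedTuple):
    """Hypothetical in-memory read; not a Meta-CAMP or Via Foundry type."""
    name: str
    sequence: str
    quality: str  # quality string, assumed to be Phred+33 encoded


def mean_phred(quality: str, offset: int = 33) -> float:
    """Return the mean Phred score of a quality string."""
    if not quality:
        return 0.0
    return sum(ord(ch) - offset for ch in quality) / len(quality)


def filter_low_quality(
    records: Iterable[FastqRecord],
    min_mean_q: float = 20.0,  # illustrative threshold, not a pipeline default
    min_length: int = 50,      # illustrative threshold, not a pipeline default
) -> Iterator[FastqRecord]:
    """Yield only reads that are long enough and have a high enough mean quality."""
    for rec in records:
        if len(rec.sequence) >= min_length and mean_phred(rec.quality) >= min_mean_q:
            yield rec


# Made-up reads: r1 is good, r2 has very low quality, r3 is too short.
reads = [
    FastqRecord("r1", "ACGT" * 15, "I" * 60),  # 'I' encodes Phred 40
    FastqRecord("r2", "ACGT" * 15, "#" * 60),  # '#' encodes Phred 2
    FastqRecord("r3", "ACGT", "IIII"),
]
kept = [rec.name for rec in filter_low_quality(reads)]
print(kept)
```

In the pipeline itself this kind of filtering happens automatically alongside adapter trimming and host removal, so the sketch only shows the idea behind one of those steps.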
3. Profile the Microbial Community
Taxonomic profiling identifies the organisms present in the sample and estimates their relative abundance. Meta-CAMP automatically runs several established tools, including:
- Kraken2 for rapid classification
- Bracken for abundance refinement
- MetaPhlAn for detailed taxonomic classification
Finer, strain-level classification can be achieved using StrainPhlAn, which is also embedded in the MetaPhlAn framework.
Since results from all tools are available within the platform, users can review the outputs that best align with their goals, whether that’s speed, specificity, or compatibility with legacy datasets.
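As a rough illustration of what "relative abundance" means, the Python sketch below normalizes per-taxon read counts so they sum to one. This is only the final normalization idea; it does not reproduce how Kraken2, Bracken, or MetaPhlAn actually estimate abundance, and the function name and counts are made up for the example.

```python
from typing import Dict


def relative_abundance(read_counts: Dict[str, int]) -> Dict[str, float]:
    """Convert per-taxon read counts into fractions of all classified reads."""
    total = sum(read_counts.values())
    if total == 0:
        # No classified reads: report zero for every taxon instead of dividing by zero.
        return {taxon: 0.0 for taxon in read_counts}
    return {taxon: count / total for taxon, count in read_counts.items()}


# Made-up counts for illustration only.
counts = {
    "Bacteroides fragilis": 600,
    "Escherichia coli": 300,
    "Akkermansia muciniphila": 100,
}
profile = relative_abundance(counts)
for taxon, fraction in sorted(profile.items(), key=lambda item: -item[1]):
    print(f"{taxon}: {fraction:.1%}")
```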
4. Assemble the Reads
Assembly begins in parallel with taxonomic profiling. While the profiling tools classify the organisms present, this step reconstructs longer contigs from the cleaned reads. These assembled sequences are critical for downstream analysis, especially when investigating unknown organisms or performing gene prediction.
Meta-CAMP handles this step automatically; no command-line work or manual sorting required.
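One widely used assembly statistic is N50: the contig length at which contigs of that length or longer contain at least half of the assembled bases. The Python sketch below shows the calculation. Whether Meta-CAMP's optional assembly statistics report N50 in exactly this form is an assumption, and the contig lengths are made up.

```python
from typing import Sequence


def n50(contig_lengths: Sequence[int]) -> int:
    """Return the N50 of a set of contig lengths (0 for an empty assembly)."""
    if not contig_lengths:
        return 0
    lengths = sorted(contig_lengths, reverse=True)
    half_total = sum(lengths) / 2
    running = 0
    for length in lengths:
        running += length
        # The first contig that pushes the running sum to half the assembly is the N50.
        if running >= half_total:
            return length
    return 0


# Made-up contig lengths in base pairs; total is 1500, so half is 750.
example_lengths = [100, 200, 300, 400, 500]
print(n50(example_lengths))
```

Here the two longest contigs (500 and 400) together cover 900 bases, which passes the halfway mark of 750, so the N50 is 400.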
5. Annotate the Contigs
Once the reads are assembled into contigs, Meta-CAMP performs gene cataloging to identify coding regions and functional elements. This allows researchers to move beyond taxonomic composition and explore what the microbial community is capable of doing, from metabolic pathways to antibiotic resistance.
Annotation is fully integrated into the pipeline and runs immediately after assembly completes.
6. Bin and Evaluate Metagenome-Assembled Genomes (MAGs)
Contigs are grouped into MAGs using integrated tools based on machine learning and deep learning:
- MetaBAT2, which uses contig depth (abundance), GC content, and tetranucleotide frequency (TNF)
- MaxBin2, which determines draft genomes from contig TNF and coverage
- MetaBinner, which utilizes contig coverage, TNF, and single-copy marker genes
- SemiBin2, which is based on self-supervised learning
- VAMB, which uses variational autoencoders, a deep-learning approach
- CONCOCT, which applies clustering based on sequence composition and coverage
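Several of the binners listed above rely on tetranucleotide frequency (TNF). The Python sketch below shows a plain version of that feature: the fraction of each of the 256 possible 4-mers in a contig. It is a simplified illustration, not any tool's implementation; binning tools commonly also merge reverse-complement k-mers, which this sketch does not do, and the example contig is made up.

```python
from itertools import product
from typing import Dict

# All 256 possible 4-mers over the DNA alphabet.
TETRAMERS = ["".join(p) for p in product("ACGT", repeat=4)]


def tetranucleotide_frequency(contig: str) -> Dict[str, float]:
    """Return the frequency of each 4-mer in a contig, skipping windows with ambiguous bases."""
    seq = contig.upper()
    counts = {kmer: 0 for kmer in TETRAMERS}
    total = 0
    for i in range(len(seq) - 3):
        kmer = seq[i:i + 4]
        if kmer in counts:  # windows containing N or other symbols are ignored
            counts[kmer] += 1
            total += 1
    if total == 0:
        return {kmer: 0.0 for kmer in TETRAMERS}
    return {kmer: count / total for kmer, count in counts.items()}


# Made-up 10 bp contig: 7 overlapping windows, with ACGT seen twice.
tnf = tetranucleotide_frequency("ACGTACGTAC")
print(round(tnf["ACGT"], 3))
```

Contigs from the same genome tend to have similar TNF profiles, which is why binners combine this signal with coverage when grouping contigs.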
Each bin is evaluated for completeness and contamination, helping researchers prioritize high-quality genomes for further analysis. MAGs are essential for understanding population structure and function, especially in environments with unknown or unculturable species.
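Prioritizing bins by completeness and contamination can be sketched as a simple tiering rule. The Python example below uses cutoffs in the style of the commonly cited MIMAG draft-genome guidelines (above 90% complete and under 5% contaminated for "high", at least 50% and under 10% for "medium"). These thresholds are an assumption for illustration, not values stated for Meta-CAMP, and the full MIMAG high-quality definition also requires rRNA and tRNA genes, which this sketch ignores. The bin names and numbers are made up.

```python
from typing import Dict, List, NamedTuple


class BinQuality(NamedTuple):
    """Hypothetical record for one bin; not a Meta-CAMP output format."""
    bin_id: str
    completeness: float   # percent
    contamination: float  # percent


def quality_tier(b: BinQuality) -> str:
    """Assign an illustrative quality tier from completeness and contamination."""
    if b.completeness > 90 and b.contamination < 5:
        return "high"
    if b.completeness >= 50 and b.contamination < 10:
        return "medium"
    return "low"


def group_by_tier(bins: List[BinQuality]) -> Dict[str, List[str]]:
    """Group bin IDs by tier so high-quality MAGs can be reviewed first."""
    tiers: Dict[str, List[str]] = {"high": [], "medium": [], "low": []}
    for b in bins:
        tiers[quality_tier(b)].append(b.bin_id)
    return tiers


# Made-up bins for illustration only.
example_bins = [
    BinQuality("bin.1", 96.2, 1.3),
    BinQuality("bin.2", 71.0, 4.8),
    BinQuality("bin.3", 42.5, 2.0),
    BinQuality("bin.4", 95.0, 12.0),
]
tiers = group_by_tier(example_bins)
print(tiers)
```

Note that `bin.4` lands in the low tier despite being nearly complete, because its contamination is too high; both metrics matter when choosing genomes for further analysis.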
7. Visualize and Share Results
Meta-CAMP includes built-in visualization tools, eliminating the need for extra scripts or external software:
- Pavian for taxonomic exploration
- MicroViz and Animalcules for statistical and diversity plots
- Krona for hierarchical views of MAG structure and quality
All outputs, including logs, parameters, and reports, are stored in a reproducible workspace. Researchers can review, share, or export results without leaving the platform.
A Pipeline That Works the Way You Do
Meta-CAMP was built for real-world research conditions. It runs smoothly on standard lab machines, without the need for expensive infrastructure. Like flying economy and still getting the speed, service, and confidence of first class, it delivers a streamlined experience that removes friction without sacrificing power.
It supports multiple versions of commonly used databases like NCBI and RDP. That means you can revisit older projects, collaborate across teams using different setups, and keep results reproducible over time.
Instead of locking you into one tool at each step, Meta-CAMP runs multiple trusted options in parallel. You can review different outputs side by side, compare results, and move forward without second-guessing or setting up anything twice.
Everything is seamlessly orchestrated, from beginning to end. Meta-CAMP, powered by Via Foundry, does the heavy lifting, so your research can move faster, with fewer bottlenecks and better outcomes.
Ready to board?
We’d be happy to show you how Meta-CAMP can support your next research journey. Book a demo here.
Learn More About Meta-CAMP and the Research Behind It:
🔗 MetaSUB Consortium GitHub Repository
🔗 A Modular, End-to-End Workflow for Urban Metagenomics (bioRxiv Preprint)