gratools import
Pre-process your GFA file once for fast, repeated access to all graph data.
Pre-processes your GFA to allow near-instant access to segments and walks. Essential for large-scale pangenome graphs.
Creates an optimized import directory. Other GraTools commands will automatically detect and use these files.
The import command is a critical first step for using GraTools efficiently. It pre-processes a GFA file to create several auxiliary files that allow other commands to access graph data much faster. It is highly recommended to run import on your GFA file before using other analysis or extraction commands, especially with large graphs.
Options
View Command Line Options
$ gratools import
Welcome to GraTools version: '1.2.0.dev19'
@author: GraTools team's
____ __________ ____
6MMMMMb/ MMMMMMMMMM `MM
8P YM / MM \ MM
6M Y ___ __ ___ MM _____ _____ MM ____
MM `MM 6MM 6MMMMb MM 6MMMMMb 6MMMMMb MM 6MMMMb\
MM MM69 " 8M' `Mb MM 6M' `Mb 6M' `Mb MM MM' `
MM ___ MM' ,oMM MM MM MM MM MM MM YM.
MM `M' MM ,6MM9'MM MM MM MM MM MM MM YMMMMb
YM M MM MM' MM MM MM MM MM MM MM `Mb
8b d9 MM MM. ,MM MM YM. ,M9 YM. ,M9 MM L ,MM
YMMMMM9 _MM_ `YMMM9'Yb_MM_ YMMMMM9 YMMMMM9 _MM_MYMMMM9
\ / /
/''A''\ /''''''\ / /''''A'''''\
...GC| |..ATG...C...CG...T....TAG..'..GC.| |...
\..C../ \.............../ \...TATA.../
Please cite our gitlab: https://forge.ird.fr/diade/gratools.git\
Usage: gratools import [OPTIONS]
The 'import' command parses a GFA file and creates several auxiliary files
(e.g., a BAM representation of segments, BED files for walks per sample, and a
statistics summary). These imported files allow subsequent GraTools commands
to operate much more quickly on the GFA data.
It is highly recommended to run 'import' on your GFA file before using other
analysis or extraction commands for optimal performance, especially with large
GFA files. If an import already exists, GraTools will typically use it; this
command can be used to explicitly (re)generate the import.
For more details, see the full documentation:
https://gratools.readthedocs.io/en/latest/commands/import.html
import Generation Options:
-g, --gfa PATH
Path to the input GFA file (e.g., myGraph.gfa or myGraph.gfa.gz).
[required]
--import-links / --no-import-links
import links on DB [default: no-import-links]
--disable-progress
Disable progress bars, which may improve performance for large GFA files.
Logging Options:
-vv, --verbosity [DEBUG|INFO|ERROR]
Set the logging verbosity level. [default: INFO]
-l, --log-path DIRECTORY
Directory where the log files will be saved. If not specified, logs will be
placed in the main output directory (or in a default GraTools log
location).
Performance Options:
-t, --threads INTEGER
Number of threads to be used for parallelizable operations. [default: 1]
Other options:
-h, --help
Show this message and exit.
Usage Examples
The simplest way to prepare your graph. GraTools creates a folder named [GFA_NAME]_Gratools-IMPORT/.
$ gratools import --gfa Og_cactus.gfa.gz
Speed up the process with multiple threads.
$ gratools import --gfa Og_cactus.gfa.gz --threads 8
Include connectivity (links) for deeper analysis. The import is slower, but the stats command will then report more values.
$ gratools import --gfa Og_cactus.gfa.gz --import-links --threads 8
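When several graphs need to be prepared, the same command can be wrapped in a small shell loop. The sketch below is not part of GraTools: `import_all` and the `GRATOOLS` variable are hypothetical names introduced here, and only the `--gfa` and `--threads` options documented above are used. Setting `GRATOOLS=echo` prints each command instead of running it, which is a convenient dry run.

```shell
# Sketch: import every GFA given as an argument, one after another.
# $1 = number of threads, remaining arguments = GFA files.
# GRATOOLS is a variable of this sketch (not a GraTools setting); it defaults
# to the real executable and can be set to `echo` for a dry run.
import_all() {
    threads="$1"
    shift
    for gfa in "$@"; do
        # Stop at the first failing import so errors are not silently skipped.
        "${GRATOOLS:-gratools}" import --gfa "$gfa" --threads "$threads" || return 1
    done
}
```

For example, `import_all 8 Og_cactus.gfa.gz` runs the multi-threaded import shown above; additional GFA paths can be appended to the same call.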
How It Works
The importing process parses the GFA file once and stores the information in optimized formats. This avoids re-parsing the entire GFA for every subsequent command, leading to significant performance gains. When you run import, GraTools performs the following actions:
BAM Conversion
Segments are converted into a specialized BAM format. This allows GraTools to perform random access and retrieve specific sequence fragments without reading the whole file.
BED Mapping
Walks (paths of samples through the graph) are mapped into individual BED files per sample. This enables extremely fast coordinate-based queries.
Connectivity Database
If --import-links is used, all edges between segments are stored in a database. Topology-heavy commands like stats need it to report their link-based values.
Summary Generation
A baseline summary of the graph properties is calculated once, preventing redundant calculations in future sessions.
All these generated files are stored together in the GraTools import directory. If an import already exists, other GraTools commands will automatically detect and use it.
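To get a feel for how much data each step has to process, the record types in a GFA can be counted before importing. This is an illustrative sketch, not GraTools code: `gfa_record_counts` is a hypothetical helper, and it assumes the graph follows the GFA 1.1 convention in which segments are `S` lines, links are `L` lines, and walks are `W` lines.

```shell
# Sketch: count the GFA record types that feed the import steps.
# $1 = path to a plain or gzip-compressed GFA file.
gfa_record_counts() {
    # Decompress on the fly when the file name ends in .gz.
    case "$1" in
        *.gz) gzip -cd "$1" ;;
        *)    cat "$1" ;;
    esac | awk -F'\t' '
        $1 == "S" { s++ }   # segments
        $1 == "L" { l++ }   # links (edges between segments)
        $1 == "W" { w++ }   # walks (one per sample haplotype and sequence)
        END { printf "segments=%d links=%d walks=%d\n", s, l, w }
    '
}
```

A large link count relative to the segment count is a hint that `--import-links` will add noticeably to the import time and to the size of the import directory.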
The import directory can be quite large (especially the BAM and Link files). Ensure you have enough disk space before importing massive graphs. If you update your GFA file, remember to delete the existing import directory and re-run the command to avoid data mismatch.
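Both checks can be scripted. The following is a minimal sketch under stated assumptions: `import_is_stale` and `free_kib` are hypothetical helpers, not GraTools commands, and the import directory is passed explicitly rather than derived from the GFA name. The staleness test compares modification times with `-nt`, which bash, dash, and ksh support, so it is only a heuristic and does not inspect the contents of the import.

```shell
# Sketch: succeed (exit status 0) when the GFA is newer than an existing
# import directory.
# $1 = GFA file, $2 = import directory.
import_is_stale() {
    gfa="$1"
    import_dir="$2"
    [ -d "$import_dir" ] && [ "$gfa" -nt "$import_dir" ]
}

# Sketch: print the free space, in KiB, on the filesystem holding a path.
# `df -P` guarantees one line per filesystem; column 4 is the available space.
free_kib() {
    df -Pk "$1" | awk 'NR == 2 { print $4 }'
}

# Example use (left commented out so that nothing runs on sourcing):
# if import_is_stale Og_cactus.gfa.gz "$import_dir"; then
#     rm -r "$import_dir" && gratools import --gfa Og_cactus.gfa.gz
# fi
```

How much free space is enough depends on the graph, so `free_kib` only reports the number and leaves the threshold to you.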
Next Steps
Explore your graph: gratools stats
List content: gratools list_samples
Full Workflow: Installation