gratools core_dispensable_ratio
Options
Usage Examples
Displays the ratio of core and dispensable segments
The following example calculates the ratio for a GFA containing 5 samples. Segments are considered “core” if they are shared by at least 4 samples (--shared-min 4) and “dispensable” if they are present in 2 or fewer samples (--specific-max 2). A length filter (--filter-len 50) additionally restricts the “Filtered” rows of the output to segments of 50 bp or longer.
$ gratools core_dispensable_ratio -g Og_cactus.gfa.gz --input-as-number \
--shared-min 4 --specific-max 2 --filter-len 50 --threads 4
─────────────────────────────────────────── Summary ──────────────────────────────────────
Total segments in GFA: 2,354,995
Total segments analyzed: 2,354,995
Total segments passing length filter (≥ 50bp): 210,046 (8.92%)
Core vs. Dispensable Segments — Og_cactus
╭────────────────────────────────────────────────────┬───────────┬───────────┬────────────╮
│ Category                                           │ Count     │ Total     │ Percentage │
├────────────────────────────────────────────────────┼───────────┼───────────┼────────────┤
│ Shared (Core) - Raw                                │ 1,162,062 │ 2,354,995 │ 49.34%     │
│ Specific (Dispensable) - Raw                       │ 891,345   │ 2,354,995 │ 37.85%     │
│ Shared (Core) - Filtered (Length >= 50bp)          │ 195,840   │ 210,046   │ 93.24%     │
│ Specific (Dispensable) - Filtered (Length >= 50bp) │ 8,602     │ 210,046   │ 4.10%      │
│ Segments Filtered Out by Length                    │ 2,144,949 │ 2,354,995 │ 91.08%     │
╰────────────────────────────────────────────────────┴───────────┴───────────┴────────────╯
Understanding the Output Table
Shared (Core) - Raw: The number of core segments, and the percentage they represent of all segments in the GFA.
Specific (Dispensable) - Raw: The number of dispensable segments, and the percentage they represent of all segments in the GFA.
Shared (Core) - Filtered: The number of core segments that also meet the length filter, as a percentage of only the filtered segments. This shows the composition of the longer segments.
Specific (Dispensable) - Filtered: The number of dispensable segments that meet the length filter, as a percentage of only the filtered segments.
Segments Filtered Out by Length: The total number and percentage of segments that were excluded from the “Filtered” analysis because they were shorter than the --filter-len value.
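To make the denominators explicit, the percentages in the table can be recomputed from its counts. The short Python sketch below is only an illustration and is not part of gratools; the variable and function names are hypothetical, and the numbers are copied from the example output above. It also prints how many segments fall into neither category, which with these thresholds are presumably the segments found in exactly 3 samples.

```python
# Counts copied from the example output above (Og_cactus, 5 samples).
total_segments = 2_354_995   # all segments in the GFA
passing_filter = 210_046     # segments with length >= 50 bp

# Each row: (count, denominator used for the percentage).
rows = {
    "Shared (Core) - Raw": (1_162_062, total_segments),
    "Specific (Dispensable) - Raw": (891_345, total_segments),
    "Shared (Core) - Filtered": (195_840, passing_filter),
    "Specific (Dispensable) - Filtered": (8_602, passing_filter),
    "Segments Filtered Out by Length": (total_segments - passing_filter, total_segments),
}


def percentage(count, total):
    """Format count/total as a percentage with two decimals."""
    return f"{100 * count / total:.2f}%"


for label, (count, total) in rows.items():
    print(f"{label}: {count:,} / {total:,} = {percentage(count, total)}")

# Segments that are neither core nor dispensable under these thresholds.
unclassified_raw = total_segments - 1_162_062 - 891_345
unclassified_filtered = passing_filter - 195_840 - 8_602
print(f"Neither category (raw): {unclassified_raw:,}")
print(f"Neither category (filtered): {unclassified_filtered:,}")
```

Note that the two “Raw” rows are divided by the total segment count, while the two “Filtered” rows are divided by the number of segments passing the length filter, so the raw and filtered percentages are not directly comparable.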
Illustrated Example
Understanding the Process
The `shared-min` and `specific-max` options: The --shared-min (-sm) option sets the minimum number of samples a segment must be found in to be considered part of the “core” genome, and the --specific-max (-spm) option sets the maximum number of samples for it to be considered part of the “dispensable” genome.
Input mode: The user must specify whether the thresholds are absolute numbers or percentages.
With --input-as-percentage, the user can specify --shared-min 90% and --specific-max 10%. Segments present in 90% or more of the samples will be considered “core,” and those in 10% or less will be considered “dispensable.”
With --input-as-number, the user specifies the exact number of samples.
Example: For the command gratools core_dispensable_ratio --input-as-number --specific-max 2 --shared-min 4, the minimum number of samples for a segment to be “core” is 4, and the maximum number of samples for it to be “dispensable” is 2.
Length filter: The --filter-len (-fl) option applies a filter to exclude segments shorter than the specified value from the “filtered” analysis.
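The process described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the gratools implementation: the `Segment` class and the `classify` and `summarize` functions are hypothetical, thresholds are passed as plain numbers (so `90` rather than `90%`), and percentage mode is assumed to compare the fraction of samples directly against the threshold. The sketch also assumes that "core" takes priority if the two thresholds overlap.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    name: str
    length: int     # segment length in bp
    n_samples: int  # number of samples in which the segment is found


def classify(n_samples, total_samples, shared_min, specific_max, as_percentage=False):
    """Return 'core', 'dispensable', or None for a segment.

    In percentage mode the sample count is first converted to a
    percentage of all samples; otherwise it is compared as-is.
    """
    value = 100 * n_samples / total_samples if as_percentage else n_samples
    if value >= shared_min:
        return "core"
    if value <= specific_max:
        return "dispensable"
    return None  # between the two thresholds: neither category


def summarize(segments, total_samples, shared_min, specific_max,
              filter_len, as_percentage=False):
    """Count core/dispensable segments before and after the length filter."""
    counts = {
        "core_raw": 0,
        "dispensable_raw": 0,
        "core_filtered": 0,
        "dispensable_filtered": 0,
        "passing_filter": 0,
    }
    for seg in segments:
        long_enough = seg.length >= filter_len
        if long_enough:
            counts["passing_filter"] += 1
        category = classify(seg.n_samples, total_samples,
                            shared_min, specific_max, as_percentage)
        if category is None:
            continue
        counts[f"{category}_raw"] += 1
        if long_enough:
            counts[f"{category}_filtered"] += 1
    return counts


# Toy data: 5 samples, thresholds as in the example command above.
toy_segments = [
    Segment("s1", 120, 5),  # core, passes the length filter
    Segment("s2", 10, 4),   # core, too short
    Segment("s3", 60, 3),   # neither category, passes the length filter
    Segment("s4", 75, 2),   # dispensable, passes the length filter
    Segment("s5", 5, 1),    # dispensable, too short
]
toy_counts = summarize(toy_segments, total_samples=5,
                       shared_min=4, specific_max=2, filter_len=50)
print(toy_counts)
```

One consequence of this reading of percentage mode is worth keeping in mind: with only 5 samples, a segment found in a single sample is already present in 20% of them, so a threshold of 10% would leave no segment classified as dispensable.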