AsyncGfaDatabase

class gratools.AsyncGfaDatabase(db_file, timeout=30.0)

Bases: object

Manage an asynchronous SQLite database for storing and querying GFA link data. It uses aiosqlite for non-blocking operations within an asyncio event loop and serializes writes via an internal FIFO queue to prevent SQLite lock contention.
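The single-writer pattern described above can be sketched as follows. This is a minimal illustration, not the class's actual code: it uses the standard-library sqlite3 module in place of aiosqlite, a two-column table instead of the full schema, and a None sentinel (an assumption) where the class uses its _shutdown flag.

```python
import asyncio
import sqlite3


async def writer(conn, queue):
    """Consume batches in FIFO order; a None item stops the loop (assumed sentinel)."""
    while True:
        batch = await queue.get()
        if batch is None:
            break
        # Only this task ever writes, so writes never compete for the SQLite lock.
        conn.executemany("INSERT INTO links VALUES (?, ?)", batch)
        conn.commit()


async def demo():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE links (seg_id_1 TEXT, seg_id_2 TEXT)")
    queue = asyncio.Queue(maxsize=100)  # bounded, like _write_queue
    task = asyncio.create_task(writer(conn, queue))

    # Producers only enqueue; they never touch the connection directly.
    await queue.put([("s1", "s2"), ("s2", "s3")])
    await queue.put([("s3", "s4")])
    await queue.put(None)
    await task

    count = conn.execute("SELECT COUNT(*) FROM links").fetchone()[0]
    conn.close()
    return count
```

Because the queue is bounded, a producer that awaits `queue.put()` is suspended while the queue is full, which keeps memory use in check when the writer falls behind.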

Attributes

db_file : Path

Path to the SQLite database file.

timeout : float

Maximum timeout (in seconds) for SQLite lock acquisition.

logger : logging.Logger

Logger instance.

_conn : Optional[aiosqlite.Connection]

Shared SQLite connection (or None if not connected).

_write_queue : asyncio.Queue

Asynchronous queue for batches of links to be inserted. Max size 100.

_sql_task : Optional[asyncio.Task]

Background task consuming the queue and writing to the database.

_shutdown : bool

Flag to signal shutdown to the writer task.

Methods Summary

batch_insert_links(links)

Enqueue a batch of links for non-blocking insertion.

close()

Shut down the database cleanly: stop the SQL writer task, create indexes, and close the connection.

connect()

Connect to the SQLite database (if not already connected), configure PRAGMA settings, create the 'links' table schema (if it doesn't exist), and start the SQL writer task.

create_indexes()

Create indexes on seg_id_1 and seg_id_2 to accelerate queries.

find_children_and_grandchildren(node_id)

Find direct successors (children) and second-degree successors (grandchildren) of a segment.

query_links_by_segment(segment_id)

Retrieve all links where segment_id appears as seg_id_1 or seg_id_2.

test_query_links(segment_id)

Retrieve and categorize links related to a given segment.

Methods Documentation

batch_insert_links(links)

Enqueue a batch of links for non-blocking insertion. Must be called after await connect().

Parameters:

links (List[Tuple[str, int, int, str, int, int]]) – List of tuples, each representing a link: (seg_id_1, orient_seg_1, orient_key_seg_1, seg_id_2, orient_seg_2, orient_key_seg_2).

Return type:

None

async close()

Properly shut down the database:

- Signal the SQL writer task to stop.
- Wait for the writer task to finish processing its queue (with timeout).
- Cancel the task if it doesn’t finish in time.
- Create indexes (important to do this after all writes).
- Close the SQLite connection.

Return type:

None
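The first three shutdown steps (signal, wait with a timeout, cancel on overrun) can be sketched with plain asyncio. The helper name `shutdown_writer` and the use of an `asyncio.Event` as the stop signal are assumptions for illustration; the class itself uses the _shutdown flag, and its exact waiting logic is not shown in this reference.

```python
import asyncio


async def shutdown_writer(task, stop, timeout):
    """Signal stop, wait up to `timeout` seconds, cancel on overrun.

    Returns True if the task finished on its own, False if it was cancelled.
    """
    stop.set()
    try:
        # shield() keeps wait_for from cancelling the task itself on timeout,
        # so the cancellation below is an explicit, separate step.
        await asyncio.wait_for(asyncio.shield(task), timeout)
        return True
    except asyncio.TimeoutError:
        task.cancel()
        try:
            await task
        except asyncio.CancelledError:
            pass
        return False


async def demo():
    stop = asyncio.Event()

    async def polite():
        await stop.wait()  # exits as soon as the stop signal is set

    async def stuck():
        await asyncio.sleep(3600)  # ignores the stop signal

    finished = await shutdown_writer(asyncio.create_task(polite()), stop, timeout=1.0)
    stuck_task = asyncio.create_task(stuck())
    late = await shutdown_writer(stuck_task, asyncio.Event(), timeout=0.05)
    return finished, late, stuck_task.cancelled()
```

Index creation and closing the connection would follow once the writer has stopped, in the order listed above.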

async connect()

Connect to the SQLite database (if not already connected), configure PRAGMA settings, create the ‘links’ table schema (if it doesn’t exist), and start the SQL writer task. This method is idempotent: if already connected, it does nothing.

Return type:

None
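An idempotent connect along these lines might look like the sketch below. It uses the synchronous sqlite3 module rather than aiosqlite, and the class name `LinkStore` is hypothetical. The specific PRAGMA values are assumptions, since this reference does not say which settings the class applies. Column names come from the batch_insert_links tuple layout; the column types are inferred from its type hints.

```python
import sqlite3


class LinkStore:
    """Hypothetical stand-in showing only the connect logic."""

    def __init__(self, db_file, timeout=30.0):
        self.db_file = db_file
        self.timeout = timeout
        self._conn = None

    def connect(self):
        if self._conn is not None:
            return  # already connected: do nothing
        self._conn = sqlite3.connect(self.db_file, timeout=self.timeout)
        # Assumed PRAGMA choices, not taken from the gratools source.
        self._conn.execute("PRAGMA journal_mode=WAL")
        self._conn.execute("PRAGMA synchronous=NORMAL")
        self._conn.execute(
            """
            CREATE TABLE IF NOT EXISTS links (
                seg_id_1 TEXT,
                orient_seg_1 INTEGER,
                orient_key_seg_1 INTEGER,
                seg_id_2 TEXT,
                orient_seg_2 INTEGER,
                orient_key_seg_2 INTEGER
            )
            """
        )
        self._conn.commit()


store = LinkStore(":memory:")
store.connect()
first_conn = store._conn
store.connect()  # second call is a no-op
tables = [r[0] for r in store._conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table'")]
```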

async create_indexes()

Create indexes on seg_id_1 and seg_id_2 to accelerate queries. Should ideally be called after all data insertions are complete.

Return type:

None
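Building the indexes after the bulk load means SQLite does not have to update them on every insert. A minimal sketch with the standard-library sqlite3 module follows; the index names and the reduced two-column table are assumptions.

```python
import sqlite3


def build_indexes(conn):
    """Create one index per segment column; safe to call more than once."""
    conn.execute("CREATE INDEX IF NOT EXISTS idx_links_seg1 ON links (seg_id_1)")
    conn.execute("CREATE INDEX IF NOT EXISTS idx_links_seg2 ON links (seg_id_2)")
    conn.commit()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE links (seg_id_1 TEXT, seg_id_2 TEXT)")
conn.executemany("INSERT INTO links VALUES (?, ?)", [("s1", "s2"), ("s2", "s3")])

build_indexes(conn)  # after all inserts
build_indexes(conn)  # repeated call changes nothing

# Column 1 of each PRAGMA index_list row is the index name.
index_names = sorted(row[1] for row in conn.execute("PRAGMA index_list('links')"))
```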

async find_children_and_grandchildren(node_id)

Find direct successors (children) and second-degree successors (grandchildren) of a segment. A child is seg_id_2 where node_id is seg_id_1. A grandchild is a child of a child.

Parameters:

node_id (str) – The starting segment ID.

Returns:

A dictionary – {“children”: [IDs], “grandchildren”: [IDs]}.

Return type:

Dict[str, List[str]]
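The child and grandchild definitions above map onto one lookup and one self-join. The sketch below uses sqlite3 and a reduced two-column table. The function name, the DISTINCT deduplication, and the sorted output are assumptions; the actual method may run different queries or return IDs in another order.

```python
import sqlite3


def children_and_grandchildren(conn, node_id):
    # Children: seg_id_2 of every link whose seg_id_1 is the starting node.
    children = [r[0] for r in conn.execute(
        "SELECT DISTINCT seg_id_2 FROM links WHERE seg_id_1 = ? ORDER BY seg_id_2",
        (node_id,),
    )]
    # Grandchildren: children of children, found by joining links to itself.
    grandchildren = [r[0] for r in conn.execute(
        """
        SELECT DISTINCT g.seg_id_2
        FROM links AS c
        JOIN links AS g ON g.seg_id_1 = c.seg_id_2
        WHERE c.seg_id_1 = ?
        ORDER BY g.seg_id_2
        """,
        (node_id,),
    )]
    return {"children": children, "grandchildren": grandchildren}


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE links (seg_id_1 TEXT, seg_id_2 TEXT)")
conn.executemany(
    "INSERT INTO links VALUES (?, ?)",
    [("s1", "s2"), ("s1", "s3"), ("s2", "s4"), ("s3", "s4"), ("s3", "s5")],
)
family = children_and_grandchildren(conn, "s1")
```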

query_links_by_segment(segment_id)

Retrieve all links where segment_id appears as seg_id_1 or seg_id_2.

Parameters:

segment_id (str) – The ID of the target segment.

Returns:

List of tuples, each representing a full link row from the database.

Return type:

List[Tuple[Any, ...]]
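A lookup of this kind can be expressed as a single parameterized query. The sketch below uses sqlite3 with the six columns named in batch_insert_links. The function name and the sample orientation values are made up for illustration.

```python
import sqlite3


def links_for_segment(conn, segment_id):
    """Return every full row in which the segment appears on either side."""
    return conn.execute(
        "SELECT * FROM links WHERE seg_id_1 = ? OR seg_id_2 = ?",
        (segment_id, segment_id),
    ).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE links (seg_id_1 TEXT, orient_seg_1 INTEGER, orient_key_seg_1 INTEGER,"
    " seg_id_2 TEXT, orient_seg_2 INTEGER, orient_key_seg_2 INTEGER)"
)
conn.executemany(
    "INSERT INTO links VALUES (?, ?, ?, ?, ?, ?)",
    [
        ("s1", 1, 0, "s2", 1, 0),
        ("s2", 1, 0, "s3", 0, 1),
        ("s4", 1, 0, "s5", 1, 0),  # unrelated to s2
    ],
)
rows = links_for_segment(conn, "s2")
```

With indexes on both seg_id_1 and seg_id_2, SQLite can answer each side of the OR from an index instead of scanning the table.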

test_query_links(segment_id)

Retrieve and categorize links related to a given segment:

- “before”: links where segment_id is seg_id_2.
- “after”: links where segment_id is seg_id_1.

Parameters:

segment_id (str) – The segment to analyze.

Returns:

List of tuples – (connected_segment_id, position_type, orient_seg_1, orient_seg_2).

Return type:

List[Tuple[str, str, int, int]]
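The before/after categorization can be sketched as two queries whose results are concatenated. This sqlite3 version assumes the function name, the "before then after" ordering, and the sample data; only the tuple layout (connected_segment_id, position_type, orient_seg_1, orient_seg_2) comes from the description above.

```python
import sqlite3


def categorize_links(conn, segment_id):
    # "before": the segment is the target, so the connected segment is seg_id_1.
    before = conn.execute(
        "SELECT seg_id_1, 'before', orient_seg_1, orient_seg_2"
        " FROM links WHERE seg_id_2 = ?",
        (segment_id,),
    ).fetchall()
    # "after": the segment is the source, so the connected segment is seg_id_2.
    after = conn.execute(
        "SELECT seg_id_2, 'after', orient_seg_1, orient_seg_2"
        " FROM links WHERE seg_id_1 = ?",
        (segment_id,),
    ).fetchall()
    return before + after


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE links (seg_id_1 TEXT, orient_seg_1 INTEGER, orient_key_seg_1 INTEGER,"
    " seg_id_2 TEXT, orient_seg_2 INTEGER, orient_key_seg_2 INTEGER)"
)
conn.executemany(
    "INSERT INTO links VALUES (?, ?, ?, ?, ?, ?)",
    [("s1", 1, 0, "s2", 1, 0), ("s2", 1, 0, "s3", 0, 1)],
)
categorized = categorize_links(conn, "s2")
```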

Parameters: