API Reference

This section documents the Python API for ggblab, automatically generated from docstrings.

Main Interface

ggblab: Interactive geometric scene construction with Python and GeoGebra.

This package provides a JupyterLab extension that opens a GeoGebra applet and enables bidirectional communication between Python and GeoGebra through a dual-channel architecture (IPython Comm + Unix socket/TCP WebSocket).

Main Components:

GeoGebra: Primary interface for controlling GeoGebra applets
ggb_comm: Communication layer (IPython Comm + out-of-band socket)
ggb_construction: GeoGebra file (.ggb) loader and saver
ggb_parser: Dependency graph parser for GeoGebra constructions

Example

>>> from ggblab import GeoGebra
>>> ggb = await GeoGebra().init()
>>> await ggb.command("A=(0,0)")
>>> value = await ggb.function("getValue", ["A"])

GeoGebra Control

class ggblab.ggbapplet.GeoGebra[source]

Bases: object

Main interface for controlling GeoGebra applets from Python.

This class implements a singleton pattern to ensure only one GeoGebra instance per kernel session. It provides async methods for sending commands and calling GeoGebra API functions.

The communication uses a dual-channel architecture:

- IPython Comm: Primary control channel
- Unix socket/TCP WebSocket: Out-of-band response delivery during cell execution

Semantic Validation:

- check_syntax: Validates command strings can be tokenized
- check_semantics: Validates referenced objects exist in applet
- Future: Type checking, scope/visibility validation

construction

File loader/saver for .ggb files

Type:: ggb_construction

parser

Dependency graph parser with command learning

Type:: ggb_parser

comm

Communication layer (initialized after init())

Type:: ggb_comm

kernel_id

Current Jupyter kernel ID

Type:: str

app

ipylab frontend interface

Type:: JupyterFrontEnd

check_syntax

Enable syntax validation (default: False)

Type:: bool

check_semantics

Enable semantic validation (default: False)

Type:: bool

_applet_objects

Cached object names from applet (updated by command/function)

Type:: set

Example

>>> ggb = GeoGebra()
>>> await ggb.init()
>>> await ggb.command("A=(0,0)")
>>> result = await ggb.function("getValue", ["A"])

>>> # With validation
>>> ggb.check_syntax = True
>>> ggb.check_semantics = True
>>> await ggb.command("Circle(A, B)")

async command(c)[source]

Execute a GeoGebra command with optional validation.

Parameters:

c (str) – GeoGebra command string (e.g., “A=(0,0)”, “Circle(A, 2)”).

Returns:

Response from GeoGebra (typically includes object label).

Return type:

dict

Raises:

GeoGebraSyntaxError – If syntax check is enabled and command has syntax errors.
GeoGebraSemanticsError – If semantics check is enabled and validation fails.

Example

>>> await ggb.command("A=(0,0)")
>>> await ggb.command("B=(3,4)")
>>> await ggb.command("Circle(A, Distance(A, B))")

>>> # With validation
>>> ggb.check_syntax = True
>>> ggb.check_semantics = True
>>> await ggb.command("Circle(A, B)")  # Validates syntax and references
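The reference check behind check_semantics can be pictured as comparing the identifiers in a command with the cached set of object names. The following is only a minimal sketch under that assumption: `find_unknown_references` and both regular expressions are hypothetical and are not part of ggblab, whose actual validation may differ.

```python
import re

# Hypothetical helper for illustration; not ggblab's implementation.
ASSIGNMENT = re.compile(r"^\s*([A-Za-z_]\w*)\s*=(?!=)")
NAME = re.compile(r"(?<![\w.])([A-Za-z_]\w*)(\s*[\(\[])?")


def find_unknown_references(command: str, known_objects: set) -> set:
    """Return identifiers used in `command` that are not known object names."""
    # A leading "label=" defines a new object, so the label is not a reference.
    match = ASSIGNMENT.match(command)
    body = command[match.end():] if match else command
    unknown = set()
    for name, opens_call in NAME.findall(body):
        if opens_call:
            continue  # a name followed by "(" or "[" is treated as a command name
        if name not in known_objects:
            unknown.add(name)
    return unknown


print(find_unknown_references("Circle(A, B)", {"A"}))
```

A non-empty result is the situation in which a check of this kind would reject the command before it is sent to the applet.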

async function(f, args=None)[source]

Call a GeoGebra API function.

Parameters:

f (str) – GeoGebra API function name (e.g., “getValue”, “getXML”).
args (list, optional) – Function arguments. Defaults to None.

Returns:

Function return value from GeoGebra.

Return type:

Any

Example

>>> value = await ggb.function("getValue", ["A"])
>>> xml = await ggb.function("getXML", ["A"])
>>> all_objs = await ggb.function("getAllObjectNames")

async init()[source]

Initialize the GeoGebra widget and communication channels.

This method:

1. Starts the out-of-band socket server (Unix socket on POSIX, TCP WebSocket on Windows)
2. Registers the IPython Comm target (‘ggblab-comm’)
3. Opens the GeoGebra widget panel via ipylab with communication settings
4. Initializes the object cache

The widget is launched programmatically to pass kernel-specific settings (Comm target, socket path) before initialization, avoiding the limitations of fixed arguments from Launcher/Command Palette.

Returns:: Self reference for method chaining.
Return type:: GeoGebra

Example

>>> ggb = await GeoGebra().init()
>>> # GeoGebra panel opens in split-right position

async refresh_object_cache()[source]

Refresh the cached set of known objects from the applet.

Called automatically during init() and can be called manually to synchronize the object cache with current applet state.

Communication Layer

class ggblab.comm.ggb_comm[source]

Bases: object

Dual-channel communication layer for kernel↔widget messaging.

Implements a combination of IPython Comm (primary) and out-of-band socket (Unix domain socket on POSIX, TCP WebSocket on Windows) to enable message delivery during cell execution when IPython Comm is blocked.

IPython Comm cannot receive messages while a notebook cell is executing, which breaks interactive workflows. The out-of-band socket solves this by providing a secondary channel for GeoGebra responses.

Architecture:

IPython Comm: Command dispatch, event notifications, heartbeat
Out-of-band socket: Response delivery during cell execution

Comm target is fixed at ‘ggblab-comm’ because multiplexing via multiple targets would not solve the IPython Comm receive limitation.

target_comm: IPython Comm object

target_name

Comm target name (‘ggblab-comm’)

Type:: str

server_handle: WebSocket server handle

server_thread: Background thread running the socket server

clients

Currently connected WebSocket clients

Type:: set

socketPath

Unix domain socket path (POSIX)

Type:: str

wsPort

TCP port number (Windows)

Type:: int

recv_logs

Response storage keyed by message ID

Type:: dict

recv_events

Event queue for frontend notifications

Type:: queue.Queue

See:: docs/architecture.md for detailed communication architecture.

async client_handle(client_id)[source]

handle_recv(msg)[source]

logs = []

mid = None

recv_events = <queue.Queue object>

recv_logs = {}

recv_msgs = {}

register_target()[source]

register_target_cb(comm, msg)[source]

send(msg)[source]

async send_recv(msg)[source]

Send a message via IPython Comm and wait for response via out-of-band socket.

This method:

1. Generates a unique message ID (UUID)
2. Sends the message via IPython Comm to the frontend
3. Waits for the response to arrive via the out-of-band socket
4. Returns the response payload

The 3-second timeout is sufficient for interactive operations. For long-running operations, decompose into smaller steps.

Parameters:: msg (dict or str) – Message to send (will be JSON-serialized).
Returns:: Response payload from GeoGebra.
Return type:: dict
Raises:: TimeoutError – If no response arrives within 3 seconds.

Example

>>> response = await comm.send_recv({
...     "type": "command",
...     "payload": "A=(0,0)"
... })
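The send-on-one-channel, receive-on-another pattern described for send_recv() can be sketched with plain asyncio. This is an illustration under assumptions, not ggblab's code: the `PendingResponses` class, its `send` callable (standing in for the IPython Comm) and `on_response` (standing in for the out-of-band socket handler) are all hypothetical, and it pairs responses with futures rather than the recv_logs dictionary documented above.

```python
import asyncio
import uuid


class PendingResponses:
    """Toy model: send on one channel, receive the reply on another."""

    def __init__(self, send, timeout=3.0):
        self._send = send        # stand-in for the Comm channel (hypothetical)
        self._timeout = timeout  # the documented timeout is 3 seconds
        self._pending = {}       # message ID -> Future awaiting the response

    async def send_recv(self, payload: dict) -> dict:
        msg_id = str(uuid.uuid4())
        future = asyncio.get_running_loop().create_future()
        self._pending[msg_id] = future
        try:
            self._send({"id": msg_id, **payload})
            # Raises asyncio.TimeoutError if no response arrives in time.
            return await asyncio.wait_for(future, self._timeout)
        finally:
            self._pending.pop(msg_id, None)

    def on_response(self, msg_id: str, payload: dict) -> None:
        """Called by the out-of-band receiver when a response arrives."""
        future = self._pending.get(msg_id)
        if future is not None and not future.done():
            future.set_result(payload)
```

Keying pending requests by message ID is what lets a response that arrives on a different channel be matched to the request that caused it.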

async server()[source]

start()[source]

Start the out-of-band socket server in a background thread.

Creates a Unix domain socket (POSIX) or TCP WebSocket server (Windows) and runs it in a daemon thread. The server listens for GeoGebra responses.

stop()[source]: Stop the out-of-band socket server.

thread = None

unregister_target_cb(comm, msg)[source]

Construction File Handler

class ggblab.construction.ggb_construction[source]

Bases: object

GeoGebra construction file (.ggb) loader and saver.

Handles multiple file formats:

- .ggb files (base64-encoded ZIP archives)
- Plain ZIP archives
- JSON format
- Plain XML (geogebra.xml)

The loader automatically detects file type from magic bytes and extracts the construction XML. The geogebra_xml is automatically stripped to the <construction> element and scientific notation is normalized.

ggb_schema: XML schema for validation

source_file

Path to the loaded file

Type:: str

base64_buffer

Base64-encoded .ggb archive (if applicable)

Type:: bytes

geogebra_xml

Extracted construction XML

Type:: str

Example

>>> construction = ggb_construction()
>>> construction.load('myfile.ggb')
>>> construction.save('output.ggb')

load(file)[source]

Load a GeoGebra construction from file.

Supports multiple formats:

- Base64-encoded .ggb (starts with ‘UEsD’)
- ZIP archive (starts with ‘PK’)
- JSON format (starts with ‘{’ or ‘[’)
- Plain XML

The construction XML is automatically extracted and normalized:

- Stripped to <construction> element only
- Scientific notation fixed (e-1 → E-1)

Parameters:

file (str) – Path to the .ggb, .zip, .json, or .xml file.

Returns:

Self reference for method chaining.

Return type:

ggb_construction

Raises:

FileNotFoundError – If the file does not exist.
RuntimeError – If file loading fails.

Example

>>> c = ggb_construction().load('circle.ggb')
>>> print(c.geogebra_xml[:100])
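The magic-byte detection and exponent normalization described for load() can be illustrated with a short standalone sketch. `detect_format` and `normalize_exponents` are hypothetical names, the prefixes are the ones listed above, and the regular expression is a naive guess at what "e-1 → E-1" means; ggblab's loader may behave differently.

```python
import base64
import io
import re
import zipfile


def detect_format(data: bytes) -> str:
    """Guess the container format from the leading bytes."""
    if data.startswith(b"UEsD"):  # base64 text of a ZIP archive
        return "base64-zip"
    if data.startswith(b"PK"):    # raw ZIP archive
        return "zip"
    if data.lstrip()[:1] in (b"{", b"["):
        return "json"
    return "xml"                  # fallback: assume plain XML


def normalize_exponents(xml: str) -> str:
    """Naively uppercase an exponent marker between digits: 1.5e-1 -> 1.5E-1."""
    return re.sub(r"(?<=\d)e(?=[+-]?\d)", "E", xml)


# Build a tiny in-memory archive to exercise both archive branches.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("geogebra.xml", "<geogebra/>")
raw_zip = buffer.getvalue()

print(detect_format(raw_zip))
print(detect_format(base64.b64encode(raw_zip)))
print(normalize_exponents('<value val="1.5e-1"/>'))
```

The ‘UEsD’ prefix is simply what the ZIP signature bytes look like after base64 encoding, which is why the two archive cases can be told apart from the first four bytes.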

save(overwrite=False, file=None)[source]

Save the construction to a file.

Saving behavior:

- If base64_buffer is set: writes decoded archive (.ggb format)
- If base64_buffer is None: writes plain XML (geogebra_xml)
- Target extension does not enforce format (e.g., saving to .ggb with no base64_buffer will write plain XML bytes)

Parameters:

overwrite (bool) – If True, overwrite source_file. Defaults to False.
file (str, optional) – Target file path. If None, auto-generates next available filename (name_1.ggb, name_2.ggb, …).

Returns:

Self reference for method chaining.

Return type:

ggb_construction

Example

>>> c = ggb_construction().load('circle.ggb')
>>> c.save()  # Saves to circle_1.ggb
>>> c.save(overwrite=True)  # Overwrites circle.ggb
>>> c.save(file='output.ggb')  # Saves to output.ggb

Note

getBase64() from the applet may not include non-XML artifacts (thumbnails, etc.) from the original archive. Saving after API changes produces a leaner .ggb file.
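The auto-generated filename rule for save() (name_1.ggb, name_2.ggb, …) can be sketched as a search for the first unused suffix. `next_available_path` is a hypothetical helper written for this illustration; how ggblab picks the name internally is an assumption.

```python
import tempfile
from pathlib import Path


def next_available_path(source) -> Path:
    """Return the first name_N.ext next to `source` that does not exist yet."""
    src = Path(source)
    n = 1
    while True:
        candidate = src.with_name(f"{src.stem}_{n}{src.suffix}")
        if not candidate.exists():
            return candidate
        n += 1


# Demonstrate in a throwaway directory so no real files are touched.
with tempfile.TemporaryDirectory() as tmp:
    source = Path(tmp) / "circle.ggb"
    source.write_text("<geogebra/>")
    first = next_available_path(source)
    first.write_text("copy")
    second = next_available_path(source)

first_name, second_name = first.name, second.name
print(first_name, second_name)
```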

Dependency Parser

class ggblab.parser.ggb_parser(cache_path=None, cache_enabled=True)[source]

Bases: object

Dependency graph parser for GeoGebra constructions.

Analyzes object relationships in GeoGebra constructions by building directed graphs using NetworkX. Provides two graph representations:

G (full dependency graph): Complete construction dependencies
G2 (simplified subgraph): Minimal construction sequences (DEPRECATED)

The parse() method builds the forward/backward dependency graph (G). The parse_subgraph() method attempts minimal extraction but has critical performance limitations (see method docstring and ARCHITECTURE.md).

Command learning:

- Automatically extracts and caches GeoGebra commands from construction protocols
- Persists command names to a shelve database for cross-project learning
- Supports enable/disable of persistence via cache_enabled flag

df

Construction protocol dataframe

Type:: polars.DataFrame

G

Full dependency graph

Type:: nx.DiGraph

G2

Simplified subgraph (from parse_subgraph)

Type:: nx.DiGraph

roots

Objects with no dependencies (in-degree = 0)

Type:: list

leaves

Terminal objects (out-degree = 0)

Type:: list

rd

Reverse mapping from object name to DataFrame row number

Type:: dict

ft

Tokenized function definitions, flattened

Type:: dict

command_cache

Persistent command database

Type:: shelve.DbfilenameShelf

cache_enabled

Enable/disable automatic persistence

Type:: bool

Example

>>> parser = ggb_parser()
>>> parser.df = construction_dataframe
>>> parser.parse()
>>> print(parser.roots)  # Independent objects
>>> print(parser.leaves)  # Terminal constructions
>>> commands = parser.get_known_commands()  # Retrieve cached commands

See:: docs/architecture.md § Dependency Parser Architecture

COLUMNS = ['Type', 'Command', 'Value', 'Caption', 'Layer']

SHAPES = ['point', 'segment', 'vector', 'ray', 'line', 'circle', 'polygon', 'triangle', 'quadrilateral']

fbd(k, recursive=True)[source]

ffd(k, recursive=True)[source]

initialize_dataframe(df=None, file=None)[source]

parse()[source]

Build the full dependency graph (G) from construction protocol.

Analyzes the construction dataframe (self.df) and builds:

- Forward dependencies: Object A depends on B (B → A edge)
- Backward dependencies: Object A is used by B (A → B edge)

The graph nodes are GeoGebra object names; edges represent dependencies.

Attributes set:

self.G: NetworkX DiGraph of dependencies
self.roots: Objects with no dependencies (starting points)
self.leaves: Objects with no dependents (endpoints)
self.rd: Reverse dict (name → DataFrame row index)
self.ft: Tokenized function calls for each object

Also extracts and persists command names if caching is enabled.

Example

>>> parser.df = polars.DataFrame(construction_protocol)
>>> parser.parse()
>>> print(list(parser.G.edges()))  # [(A, B), (B, C), ...]
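The relationship between edges, roots and leaves can be shown on a toy protocol. This sketch deliberately uses plain dictionaries instead of NetworkX and a hypothetical input shape (object name mapped to the names it depends on); it is not how parse() reads self.df.

```python
def build_dependency_graph(protocol: dict):
    """Build B -> A edges for "A depends on B" and find roots and leaves."""
    # dependents[B] holds every object that uses B (the B -> A edges).
    dependents = {name: set() for name in protocol}
    for name, parents in protocol.items():
        for parent in parents:
            dependents.setdefault(parent, set()).add(name)
    has_parent = {name for name, parents in protocol.items() if parents}
    roots = sorted(n for n in dependents if n not in has_parent)  # in-degree 0
    leaves = sorted(n for n, used_by in dependents.items() if not used_by)  # out-degree 0
    return dependents, roots, leaves


# Two free points, a segment through them, and its midpoint.
toy_protocol = {"A": [], "B": [], "c": ["A", "B"], "M": ["c"]}
deps, roots, leaves = build_dependency_graph(toy_protocol)
print(roots, leaves)
```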

parse_subgraph()[source]

Extract a simplified dependency subgraph (G2) from the full graph (G).

WARNING: This implementation has significant performance limitations and should be replaced in v1.0. See ARCHITECTURE.md for details.

Algorithm:

- Enumerates all combinations of root objects (O(2^n) combinations)
- For each combination, identifies dependent objects that exclusively depend on that combination
- Adds edges to G2 when dependencies are uniquely determined

KNOWN LIMITATIONS (Critical):

1. Combinatorial Explosion: O(2^n) time complexity where n = number of root objects.
   With 15 roots: ~32,000 paths (manageable)
   With 20 roots: ~1,000,000 paths (slow)
   With 25+ roots: computation becomes intractable
2. Infinite Loop Risk: The while loop may not terminate under certain graph topologies where _nodes1 is not updated in each iteration.
3. Limited N-ary Dependency Support: Only handles 1-2 parents. Constructions where 3+ objects jointly create one output (e.g., polygon from 3+ points) have incomplete representation in G2 (these edges are silently skipped).
4. Redundant Computation: Neighbor lists are recomputed on every iteration of inner loops, causing O(n) redundant work.
5. Debug Output: Contains print() statements that should be removed for production.

WORKAROUND:

- Use with constructions having <15 independent root objects
- For larger constructions, consider implementing the optimized algorithm described in ARCHITECTURE.md § Dependency Parser Architecture

FUTURE: Replace with topological sort + reachability pruning in v1.0 for O(n(n+m)) complexity.

See: https://github.com/[repo]/ARCHITECTURE.md#dependency-parser-architecture
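One way to avoid enumerating every subset of roots is to compute, in a single pass over a topological order, which roots each object ultimately depends on. The sketch below is one possible reading of "topological sort + reachability", not the algorithm from ARCHITECTURE.md (which is not reproduced here); `root_ancestors` and the input shape (node mapped to its direct parents) are assumptions.

```python
from graphlib import TopologicalSorter


def root_ancestors(parents: dict) -> dict:
    """Map every node to the set of root objects it ultimately depends on."""
    result = {}
    # static_order() yields each node only after all of its parents.
    for node in TopologicalSorter(parents).static_order():
        direct = parents.get(node, ())
        if not direct:
            result[node] = frozenset({node})  # a root depends only on itself
        else:
            result[node] = frozenset().union(*(result[p] for p in direct))
    return result


# Three free points, a segment on two of them, a triangle on all three.
toy_parents = {"A": set(), "B": set(), "C": set(), "c": {"A", "B"}, "t": {"c", "C"}}
ancestors = root_ancestors(toy_parents)
print(sorted(ancestors["t"]))

# For comparison, the number of root subsets the current approach may visit.
print(2 ** 15, 2 ** 20)
```

Each node is visited once here, so the work grows with the size of the graph rather than with the number of root combinations.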

reconstruct_from_tokens(parsed_tokens)[source]

Reconstruct the original command string from tokenized structured list.

Takes a nested list structure produced by tokenize_with_commas() and reconstructs the original command string with proper parentheses, commas, and spacing.

Parameters:: parsed_tokens (list or str) – Tokenized structured list, or a single token as a string.
Returns:: Reconstructed command string matching the original input structure.
Return type:: str
Raises:: ValueError – If parsed_tokens contains unexpected types.

Examples

>>> parser.reconstruct_from_tokens(['Circle', ['A', ',', '2']])
'Circle(A, 2)'

>>> parser.reconstruct_from_tokens(['Distance', [['Point', ['1', ',', '2']], ',', 'B']])
'Distance(Point(1, 2), B)'

Note

This function is the inverse of tokenize_with_commas(). It handles proper spacing around operators and parentheses.

The ‘register_expr’ parameter (commented out) was intended for register expressions, where applet-assigned labels could be replaced with construction-order-based abstract expressions like ‘${n}’, since GeoGebra may reassign object labels but construction order remains stable.
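The nested-list shape in the examples above suggests a simple recursive rebuild: a list that directly follows a name is a parenthesized argument group, and a comma token becomes ", ". The sketch below is inferred only from the two documented examples; `reconstruct_sketch` is hypothetical, ignores brackets and operator spacing, and should not be taken as the behavior of reconstruct_from_tokens().

```python
def reconstruct_sketch(tokens) -> str:
    """Rebuild a command string from the documented nested-list shape."""
    if isinstance(tokens, str):
        return tokens
    if not isinstance(tokens, list):
        raise ValueError(f"unexpected token type: {type(tokens).__name__}")
    parts = []
    previous_is_name = False
    for token in tokens:
        if isinstance(token, list):
            inner = reconstruct_sketch(token)
            # A list right after a name holds that call's arguments.
            parts.append(f"({inner})" if previous_is_name else inner)
            previous_is_name = False
        elif token == ",":
            parts.append(", ")
            previous_is_name = False
        elif isinstance(token, str):
            parts.append(token)
            previous_is_name = True
        else:
            raise ValueError(f"unexpected token type: {type(token).__name__}")
    return "".join(parts)


print(reconstruct_sketch(["Circle", ["A", ",", "2"]]))
```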

tokenize_with_commas(cmd_string, extract_commands=False)[source]

Tokenize a GeoGebra command string into a structured list representation.

Parses a mathematical or GeoGebra-like command string and converts it into a nested list structure that preserves parentheses, brackets, and commas. This is useful for analyzing GeoGebra command syntax and extracting object dependencies.

=== COMMA PRESERVATION AND GEOGEBRA’S IMPLICIT MULTIPLICATION ===

This tokenizer preserves commas as explicit tokens for a critical reason: GeoGebra outputs commands with implicit multiplication operators omitted.

Example

Internal representation: Circle(2 * a, b)
GeoGebra output: Circle(2a, b) <- Information loss!

The ‘*’ operator is completely omitted, destroying information. This is a one-way transformation: we can’t reliably reconstruct “2*a” from “2a” without external context (is it “2 times a” or “variable named 2a”?).

BUT: GeoGebra ALWAYS uses comma-separation for parameter lists. We exploit this invariant. By preserving commas in the token stream, we can:

1. Identify parameter boundaries (comma = separator)
2. Use whitespace/context to infer where implicit multiplication occurred

This is a workaround for a limitation of GeoGebra’s output format. Dropping the ‘*’ makes the encoding one-way, and the comma-separation invariant is what keeps parameter boundaries recoverable despite it.

Parameters:

cmd_string (str) – Input command string (e.g., “Circle(A, Distance(A, B))”).
extract_commands (bool, optional) – If True, also extract command name candidates (tokens preceding ‘(’ or ‘[’) and return a dict with ‘tokens’ and ‘commands’ keys. If False (default), return only the token list for backward compatibility.
register_expr (not yet implemented) – Future feature: if True, replace object references with abstract labels like ${0}, ${1}, etc., based on generation order in the construction protocol. This is useful because GeoGebra applets may rename objects at runtime, but the generation order remains stable within a construction.

Returns:

If extract_commands=False (default): Nested list structure with tokens. Parentheses/brackets create nested lists; commas are preserved as ‘,’.
If extract_commands=True: Dict with keys:

- ‘tokens’: Nested list structure (as above)
- ‘commands’: Set of command name candidates (tokens preceding ‘(’ or ‘[’)

Return type:

list or dict

Raises:

ValueError – If parentheses/brackets are mismatched.

Examples

>>> tokenize_with_commas("Circle(A, 2)")
['Circle', ['A', ',', '2']]

>>> tokenize_with_commas("Circle(A, 2)", extract_commands=True)
{'tokens': ['Circle', ['A', ',', '2']], 'commands': {'Circle'}}

>>> tokenize_with_commas("Distance(Point(1, 2), B)")
['Distance', [['Point', ['1', ',', '2']], ',', 'B']]

>>> tokenize_with_commas("Distance(Point(1, 2), B)", extract_commands=True)
{'tokens': ['Distance', [['Point', ['1', ',', '2']], ',', 'B']], 'commands': {'Distance', 'Point'}}

Note

Empty or non-string input returns an empty list (or empty dict if extract_commands=True) without raising an error.

Commas are INTENTIONALLY preserved as tokens to work around GeoGebra’s implicit multiplication. This is not a quirk; it’s the core design decision.

Future (register_expr parameter): When implemented, would enable stable object references by using construction order indices instead of runtime labels. Example output: [‘Circle’, [‘${0}’, ‘,’, ‘${1}’]] if register_expr=True and the objects were the 0th and 1st in the protocol.
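The "command name candidates" rule (a token directly preceding ‘(’ or ‘[’) is easy to try in isolation. The sketch below uses a single regular expression and a hypothetical `command_candidates` helper; it works on the raw string rather than on the token list, so it only approximates what extract_commands=True reports.

```python
import re

# An identifier immediately followed by "(" or "[" (optional whitespace between).
CALL_NAME = re.compile(r"([A-Za-z_]\w*)\s*[\(\[]")


def command_candidates(cmd_string) -> set:
    """Collect names that look like command calls in a command string."""
    if not isinstance(cmd_string, str):
        return set()  # mirror the documented lenient handling of bad input
    return set(CALL_NAME.findall(cmd_string))


print(sorted(command_candidates("Distance(Point(1, 2), B)")))
```

An assignment such as "A=(0,0)" yields no candidates here, because the character before the parenthesis is ‘=’ rather than a name.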

vertex_on_regular_polygon(v)[source]

write_parquet(file=None)[source]

Parser Utilities

ggblab.parser.flatten(items)[source]

Recursively flatten nested iterables.

Converts nested structures like [[1, [2, 3]], 4] into [1, 2, 3, 4]. Strings and bytes are treated as atomic elements (not iterated).

Note: The Python standard library has no general-purpose flatten function. Strings and bytes must be checked explicitly because they are themselves iterable.

Parameters:: items – Any iterable that may contain nested iterables.
Yields:: Flattened items from the nested structure.

Examples

>>> list(flatten([1, [2, 3], [[4], 5]]))
[1, 2, 3, 4, 5]

>>> list(flatten(['a', ['b', 'c'], 'd']))
['a', 'b', 'c', 'd']

>>> list(flatten([1, [2, [3, [4]]]]))
[1, 2, 3, 4]

>>> # Without the str check, the result would be ['h', 'e', 'l', 'l', 'o', 'w', 'o', 'r', 'l', 'd']
>>> list(flatten(['hello', 'world']))
['hello', 'world']
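The behavior described above fits in a few lines as a recursive generator. This is a minimal sketch written for illustration under the name `flatten_sketch`; it is not copied from ggblab.parser.flatten and may differ in detail.

```python
from collections.abc import Iterable


def flatten_sketch(items):
    """Yield the leaves of arbitrarily nested iterables, left to right."""
    for item in items:
        # str and bytes are iterable, but are treated as atomic values here.
        if isinstance(item, Iterable) and not isinstance(item, (str, bytes)):
            yield from flatten_sketch(item)
        else:
            yield item


print(list(flatten_sketch([1, [2, 3], [[4], 5]])))
print(list(flatten_sketch(["hello", ["world"]])))
```

The string check does more than preserve whole words: a one-character string iterates to itself, so recursing into strings would never terminate.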

Schema Loader

class ggblab.schema.ggb_schema[source]

Bases: object

GeoGebra XML schema loader and validator.

Manages the GeoGebra XML schema (XSD) for validating and parsing .ggb construction files. The schema is automatically downloaded from the official GeoGebra site and cached locally for offline use.

The schema enables:

- XML validation of GeoGebra constructions
- Conversion between XML and Python dictionaries
- Type-safe parsing of construction elements

url

URL of the GeoGebra common.xsd schema file

Type:: str

local_path

Local cache path for the downloaded schema

Type:: str

schema_content

Raw XSD content as string

Type:: str

schema

Compiled schema object for validation

Type:: xmlschema.XMLSchema

Example

>>> schema = ggb_schema()
>>> # Schema is loaded and ready for use
>>> data_dict = schema.schema.to_dict(xml_string)

Note

The schema is downloaded once and cached in xsd/common.xsd. Delete the cache to force re-download on next instantiation.

local_path = 'xsd/common.xsd'

url = 'http://www.geogebra.org/apps/xsd/common.xsd'

ggblab.schema.cache_schema_locally(schema_url, local_file_path)[source]

Download and cache a schema file from URL.

Downloads an XML schema from the specified URL and saves it to a local file for offline use. If the file already exists, uses the cached version instead of re-downloading.

Parameters:

schema_url (str) – URL of the schema file to download.
local_file_path (str) – Path where the schema should be cached.

Returns:

Content of the schema file, or None if download fails.

Return type:

str or None

Examples

>>> content = cache_schema_locally(
...     'http://example.com/schema.xsd',
...     'cache/schema.xsd'
... )
Using local cached file: cache/schema.xsd

Note

Future enhancement: Add logic to check file age or Last-Modified header to refresh stale cached schemas.
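The cache-then-download behavior, together with the file-age check mentioned as a future enhancement, could look roughly like the sketch below. `load_schema_cached` and its `max_age_seconds` parameter are hypothetical and exist only in this example; the demonstration uses a file:// URL so that it runs without network access.

```python
import os
import tempfile
import time
import urllib.request
from pathlib import Path


def load_schema_cached(schema_url, local_file_path, max_age_seconds=None):
    """Return schema text from the cache, downloading it when missing or stale."""
    if os.path.exists(local_file_path):
        age = time.time() - os.path.getmtime(local_file_path)
        if max_age_seconds is None or age <= max_age_seconds:
            with open(local_file_path, encoding="utf-8") as handle:
                return handle.read()
    try:
        with urllib.request.urlopen(schema_url, timeout=10) as response:
            content = response.read().decode("utf-8")
    except (OSError, ValueError):  # URLError is a subclass of OSError
        return None                # mirrors the documented "None if download fails"
    os.makedirs(os.path.dirname(local_file_path) or ".", exist_ok=True)
    with open(local_file_path, "w", encoding="utf-8") as handle:
        handle.write(content)
    return content


with tempfile.TemporaryDirectory() as tmp:
    source = Path(tmp) / "source.xsd"
    source.write_text("<xs:schema/>", encoding="utf-8")
    cache = os.path.join(tmp, "cache", "common.xsd")
    downloaded = load_schema_cached(source.as_uri(), cache)  # fetched, then cached
    source.write_text("<changed/>", encoding="utf-8")
    cached = load_schema_cached(source.as_uri(), cache)      # served from the cache
    refreshed = load_schema_cached(source.as_uri(), cache, max_age_seconds=-1)  # forced refresh
```

A Last-Modified comparison would need an HTTP request on every call, whereas the file-age check stays fully offline until the cache expires.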