LoginRegisterCommercial SupportContact Us


POD documentation > Data Handling > Tree.pm

Tree.pm

Handling tree data structures
posted on 3:29 PM, July 12, 2009


ExSite::Tree

ExSite::Tree is a generic tool for managing heirarchical tree structures such as are found in ExSite. It assumes that each tree node is essentially an ExSite datahash, and each node has a single parent node.

Each tree node is a hash containing the following keys:

    data - reference to a hash containing the node data
    id_key - the key in %$data that looks up the node ID
    parent_key - the key in %$data that looks up the node parent ID
    parent - the value in %$data that is the node parent ID
    child - an array of child nodes under this node

The first value, data, is a datahash containing arbitrary node data formatted as key/value pairs. The remainder are used for indexing and navigating the tree nodes.

Typically data is an ExSite datahash, which contains foreign key values pointing to other ExSite datahashes. The primary key of the datahash is used as id_key, and the foreign key is used as parent_key. The foreign key value is parent, which presumably is the id of another node in the tree.

Nodes retain the order in which they were inserted into the tree. That means that if several nodes are inserted as children of a particular node X, then when the children of X are fetched, they will be returned in the same order they were added to the tree.

Tree Contruction

Each tree node is a datahash. One of the hash keys is used as a node ID, and another is used as a parent ID (collectively, these are the index keys). The index keys can be specifed uniquely for each node, or for a group of nodes. You can also set default index keys to be used for all nodes if nothing specific is given.

When building a tree, consider whether the tree is properly rooted, meaning that it starts from nodes that have no parents. If so, then the tree will automatically detect these root nodes and make them accessible using the get_topnodes methods.

Sometimes, however, it is necessary to create trees from a partial data set which is actually just a branch of a larger tree. In this case, the topmost node(s) will still have a parent, but the parent will not be included in the tree data. In this case, the tree will not have a natural root, and may not be able to automatically identify where to begin. This may not be a problem if you do not need to query the tree for the topnodes (eg. if you only query by specific node IDs). If you do need to query for topnodes, then you should advise the tree where your partial tree is rooted, either by adding the topnodes manually (addtopnode()) or by specifying which parent IDs can be ignored (treated as zero) because they are not included in your data set.

ExSite Trees are designed to be self-constructing with typical ExSite data structures. See the example at the end for more info.

Create a new tree object
    my $tree = new ExSite::Tree("id_key", "parent_key", @data);

``id_key'' and ``parent_key'' are the index keys for the datahashes in @data. They are also the default index keys for all other nodes added to the tree. @data is optional.

Add nodes to a tree
    $tree->addnode($data);

    $tree->addnode($data,$id_key,$parent_key,$ignore_parent);

Adds one node whose data is in %$data. If the index keys are different from the default for the tree, they can be specified as extra parameters.

The optional $ignore_parent parameter can be used to specify a tree root other than the natural root. The natural root would occur where the parent ID is zero. However, there are cases where you want to extract a subset of the natural tree, starting at some set of branches. In these cases you want to ignore the parents of those starting branches, so that the branches are treated as root nodes. To do this, you can pass an arrayref of parent IDs to ignore.

    $tree->addnodes($data);
    $tree->addnodes($data,$id_key,$parent_key,$ignore_parent);

Same as above, except $data is taken to be a reference to an array of datahashes.

    $tree->addtopnode($data);
    $tree->addtopnode($data,$id_key,$parent_key);

Same as addnode(), except that the node is explictly understood to be a top-level node with no parent, despite what that parent ID suggests.

Remove nodes from a tree
    $tree->delnode($id);

This removes a node (along with its descendants) from the tree.

$tree->splice($id);

This removes a node from the tree, but remaps its children to take its place.

    $tree->replacenode($id,$data);

    $tree->replacenode($id,$data,$id_key,$parent_key);

This replaces the data in the node indexed under $id, with the contents of %$data. The replaced node will be indexed under the original key as well as under a new key determined from the replacement data.

Querying the tree

Tree nodes can be fetched as nodes (including the tree indexing parameters) or as data (ie. the original datahash). Fetching as nodes is typical for internal use; fetching as data is typical for calls from other packages. If you fetch the node, the data can be accessed as $node->{data}{...}.

Fetch a node
    $tree->getnode($id,$create_flag);

    $tree->getnode_data($id,$create_flag);

Fetches the node indexed under $id. If $create_flag is true, a dummy version of the node will be created if it is not found.

Fetch the parent of a node
    $tree->get_parent_node($id);
    $tree->get_parent_data($id);

Fetches the parent of the node indexed under $id.

Fetch the children of a node
    $tree->get_child_nodes($id);

    $tree->get_child_data($id);
    $tree->get_child($id);  # synonym for get_child_data()

Fetches the children nodes under node $id. Returns an array, or array ref, depending on the list context.

Fetch the top-level of the tree
    $tree->get_topnodes();
    $tree->get_topnodes_data();

Fetches the nodes that have no parents. Returns an array, or array ref, depending on the list context.

Fetch the top-level ancestor of a node
    $tree->get_ancestor_node($id);

$tree->get_ancestor_data($id);

Search up through the parent links for a node without a parent.

Dump the tree
    print $tree->dump();    # dump IDs

    print $tree->dump($id); # dump IDs, starting from $id
    print $tree->dump($id,"key1","key2"); # dump certain keys, starting from $id

Returns a string that illustrates the tree as a nested list of text strings. Each tree node is shown as its ID, by default, but if alternate keys are given, the values under those keys in the node's data will be listed instead.

See if the tree contains a particular node
    $tree->exists($id);

This returns true (1) if the node $id exists in the tree.

Count nodes
    $tree->count();             # count all nodes
    $tree->count($pattern);     # count nodes that match $pattern

    $tree->count($pattern,$id); # count nodes matching $pattern starting at $id

$pattern is a match hash that limts counted nodes to those whose data contains the same key/value pairs as the match hash. $id sets a starting node to count from; only this and descendant nodes are included in the count.

Tree Transforms

You can convert the tree to other data structures:

Collapse the tree to a list
    my @list = $tree->collapse($id);

The tree is compressed to a flat list, ordered depth-first. If $id is given, only that branch of the tree is collapsed.

Alternatively, you can collapse the tree to a list of leaf nodes only:

    my @list = $tree->get_leaves($id);

Branch nodes (those with child nodes) are excluded from this list.

Extract a sub-tree
    my $subtree = $tree->subtree($id,$match);

The subtree is another tree object, comprising a branch of the current tree starting at node $id.

If the match hash $match is provided, the nodes of the subtree will be filtered to include only those whose data match the key/value pairs in $match. WARNING: if the tree contains closed loops in its node relationships, it is possible for an infinite loop to result when copying to the subtree. For this reason, we prune the subtree at $config{max_tree_nodes}. Legitimate subtrees larger than this size will get truncated.

Example

This example creates a site map in a tree structure:

    # fetch a list of pages, ordered by rank and id
    my @pages = $db->fetch_match("page", {section_id=>$my_section, type=>"page"}, ["rank","page_id"]);

# pass the page data to the Tree class, using page.page_id and page.parent_id as the index keys my $map = new ExSite::Tree("page_id","parent_id",@data);

That is sufficient to create the tree. This tree consists of one node for each page, with the nodes nested exactly as they do in site menus and site maps. At each level of the tree, the pages are ordered correctly according to their page rank.

To fetch the top-level pages from the site, use:

    my @toppages = $map->get_topnodes_data();

This retrieves an ordered array of datahashes, representing the top-level pages of the site. For each of these pages, you can fetch the child pages (submenus) beneath them with:

    my @subpages = $map->get_child_data($page->{page_id});

...and so on, to any depth.

To get a quick view of the site map, use:

    print $map->dump(0,"filename");