beancount.tools

Standalone tools that aren't linked to Beancount but that are useful with it.

The beancount.scripts package contains the implementation of scripts that invoke the Beancount library code. This beancount.tools package implements other tools that do not directly invoke Beancount library code and that could, in theory, be copied and used independently. However, they are distributed with Beancount, and to keep all the source code together they live in this package and are invoked from stubs under beancount/bin/, just like the other scripts.

beancount.tools.treeify

Identify a column of text that contains hierarchical ids and treeify that column.

This script will inspect a text file and attempt to find a vertically left-aligned column of text that contains identifiers with multiple components, such as "Assets:US:Bank:Checking", and replace those by a tree-like structure rendered in ASCII, inserting new empty lines where necessary to create the tree.

Note: If your paths have spaces in them, this will not work. Space is used as a delimiter to detect the end of a column. You can customize the delimiter with an option.

beancount.tools.treeify.Node (list)

A node with a name attribute, a list of line numbers and a list of children (from its parent class).

beancount.tools.treeify.Node.__repr__(self) special

Return str(self).

Source code in beancount/tools/treeify.py
def __str__(self):
    return '<Node {} {}>'.format(self.name, [node.name for node in self])
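
The description above can be sketched as a small list subclass. This is a minimal stand-in written for illustration, not the library's definition: the constructor signature and the `nos` attribute name are assumptions inferred from how the other listings in this section use Node.

```python
class Node(list):
    """Hypothetical stand-in for beancount.tools.treeify.Node."""

    def __init__(self, name):
        super().__init__()
        self.name = name  # Component name, e.g. "Assets".
        self.nos = []     # Input line numbers attached to this node (assumed name).

    def __str__(self):
        return '<Node {} {}>'.format(self.name, [node.name for node in self])

    # Assumption: repr() is aliased to str(), as the heading above suggests.
    __repr__ = __str__


root = Node('Assets')
root.append(Node('US'))
root.append(Node('CA'))
print(repr(root))  # <Node Assets ['US', 'CA']>
```

Because the children are stored in the list itself, an empty Node is falsy, which is what the `if node` checks in the listings below rely on.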

beancount.tools.treeify.create_tree(column_matches, regexp_split)

Build up a tree from a list of matches.

Parameters:
  • column_matches – A list of (line-number, name) pairs.

  • regexp_split – A regular expression string, to use for splitting the names of components.

Returns:
  • An instance of Node, the root node of the created tree.

Source code in beancount/tools/treeify.py
def create_tree(column_matches, regexp_split):
    """Build up a tree from a list of matches.

    Args:
      column_matches: A list of (line-number, name) pairs.
      regexp_split: A regular expression string, to use for splitting the names
        of components.
    Returns:
      An instance of Node, the root node of the created tree.
    """
    root = Node('')
    for no, name in column_matches:
        parts = re.split(regexp_split, name)
        node = root
        for part in parts:
            last_node = node[-1] if node else None
            if last_node is None or last_node.name != part:
                last_node = Node(part)
                node.append(last_node)
            node = last_node
        node.nos.append(no)
    return root
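
A short usage sketch may help here. It re-declares a compact copy of create_tree and a hypothetical Node stand-in so that it runs without Beancount installed; the sample names and the ':' splitter are assumptions chosen for the example. It also shows that each part is compared only with the last child, so equal names that are not on adjacent lines produce separate nodes.

```python
import re


class Node(list):
    """Hypothetical stand-in for the Node class described above."""

    def __init__(self, name):
        super().__init__()
        self.name = name
        self.nos = []


def create_tree(column_matches, regexp_split):
    # Compact copy of the listing above, without the docstring.
    root = Node('')
    for no, name in column_matches:
        node = root
        for part in re.split(regexp_split, name):
            last_node = node[-1] if node else None
            if last_node is None or last_node.name != part:
                last_node = Node(part)
                node.append(last_node)
            node = last_node
        node.nos.append(no)
    return root


matches = [
    (0, 'Assets:US:Checking'),
    (1, 'Assets:US:Savings'),
    (3, 'Expenses:Food'),
    (4, 'Assets:US:Checking'),  # Not adjacent to the first 'Assets' group.
]
root = create_tree(matches, ':')
top_level = [child.name for child in root]
print(top_level)  # ['Assets', 'Expenses', 'Assets']
```

In this sketch the first 'Assets' subtree holds lines 0 and 1, and line 4 ends up under a second 'Assets' node, so input that is grouped by name gives the most compact tree.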

beancount.tools.treeify.dump_tree(node, file=sys.stdout, prefix='')

Render a tree as indented text, one node per line, with each level of nesting prefixed by '... '.

Parameters:
  • node – An instance of Node.

  • file – A file object to write to.

  • prefix – A prefix string for each of the lines of the children.

Source code in beancount/tools/treeify.py
def dump_tree(node, file=sys.stdout, prefix=''):
    """Render a tree as a tree.

    Args:
      node: An instance of Node.
      file: A file object to write to.
      prefix: A prefix string for each of the lines of the children.
    """
    file.write(prefix)
    file.write(node.name)
    file.write('\n')
    for child in node:
        dump_tree(child, file, prefix + '... ')
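
As an illustration, the sketch below writes a small tree to an in-memory buffer. The Node stand-in, including its `children` constructor argument, is hypothetical and exists only to keep the snippet self-contained; dump_tree is a compact copy of the listing above with `file` made a required argument.

```python
import io


class Node(list):
    """Hypothetical stand-in; the 'children' argument is for brevity only."""

    def __init__(self, name, children=()):
        super().__init__(children)
        self.name = name


def dump_tree(node, file, prefix=''):
    # Compact copy of the listing above.
    file.write(prefix)
    file.write(node.name)
    file.write('\n')
    for child in node:
        dump_tree(child, file, prefix + '... ')


tree = Node('', [Node('Assets', [Node('US'), Node('CA')]), Node('Expenses')])
buffer = io.StringIO()
dump_tree(tree, file=buffer)
output = buffer.getvalue()
print(output, end='')
```

The root node has an empty name, so the first line of the output is blank; every child line carries one '... ' per level of depth.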

beancount.tools.treeify.enum_tree_by_input_line_num(tree_lines)

Accumulate the lines of a tree until a line number is found.

Parameters:
  • tree_lines – A list of lines as returned by render_tree.

Yields: Pairs of (line number, list of (line, node)).

Source code in beancount/tools/treeify.py
def enum_tree_by_input_line_num(tree_lines):
    """Accumulate the lines of a tree until a line number is found.

    Args:
      tree_lines: A list of lines as returned by render_tree.
    Yields:
      Pairs of (line number, list of (line, node)).
    """
    pending = []
    for first_line, cont_line, node in tree_lines:
        if not node.nos:
            pending.append((first_line, node))
        else:
            line = first_line
            for no in node.nos:
                pending.append((line, node))
                line = cont_line
                yield (no, pending)
                pending = []
    if pending:
        yield (None, pending)
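
The grouping behavior is easier to see on a tiny input. In the sketch below, `types.SimpleNamespace` objects stand in for Node instances and the rendered strings are made up for the example; only the `nos` attribute matters to the generator, which is a compact copy of the listing above.

```python
from types import SimpleNamespace


def enum_tree_by_input_line_num(tree_lines):
    # Compact copy of the listing above.
    pending = []
    for first_line, cont_line, node in tree_lines:
        if not node.nos:
            pending.append((first_line, node))
        else:
            line = first_line
            for no in node.nos:
                pending.append((line, node))
                line = cont_line
                yield (no, pending)
                pending = []
    if pending:
        yield (None, pending)


# Hypothetical nodes: a parent without input lines, a leaf that appears on
# input lines 0 and 1, and a trailing node without input lines.
assets = SimpleNamespace(name='Assets', nos=[])
checking = SimpleNamespace(name='Checking', nos=[0, 1])
expenses = SimpleNamespace(name='Expenses', nos=[])
tree_lines = [
    ('Assets', '|', assets),
    ('  Checking', '  |', checking),
    ('Expenses', '', expenses),
]
groups = [(no, [text for text, _ in pending])
          for no, pending in enum_tree_by_input_line_num(tree_lines)]
print(groups)
# [(0, ['Assets', '  Checking']), (1, ['  |']), (None, ['Expenses'])]
```

Lines for nodes without input line numbers accumulate until the next numbered node, a repeated node uses its continuation line from the second occurrence on, and leftover lines are yielded with None as the line number.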

beancount.tools.treeify.find_column(lines, pattern, delimiter)

Find a valid column with hierarchical data in the text lines.

Parameters:
  • lines – A list of strings, the contents of the input.

  • pattern – A regular expression for the hierarchical entries.

  • delimiter – A regular expression that dictates how we detect the end of a column. Normally this is a single space. If the patterns contain spaces, you will need to increase this.

Returns:
  • A tuple of (matches, left, right) – matches: A list of (line-number, name) tuples, where 'name' is the hierarchical string to treeify and line-number is an integer, the line number where this applies. left: An integer, the leftmost column. right: An integer, the rightmost column. Not all line numbers may be present, so you may need to skip some; however, they are guaranteed to be in sorted order. If no valid column is found, the function falls through and implicitly returns None.

Source code in beancount/tools/treeify.py
def find_column(lines, pattern, delimiter):
    """Find a valid column with hierarchical data in the text lines.

    Args:
      lines: A list of strings, the contents of the input.
      pattern: A regular expression for the hierarchical entries.
      delimiter: A regular expression that dictates how we detect the
        end of a column. Normally this is a single space. If the patterns
        contain spaces, you will need to increase this.
    Returns:
      A tuple of
        matches: A list of (line-number, name) tuples where 'name' is the
          hierarchical string to treeify and line-number is an integer, the
          line number where this applies.
        left: An integer, the leftmost column.
        right: An integer, the rightmost column.
      Note that not all line numbers may be present, so you may need to
      skip some. However, they are guaranteed to be in sorted order.
    """
    # A mapping of the line beginning position to its match object.
    beginnings = collections.defaultdict(list)
    pattern_and_whitespace = "({})(?P<ws>{}.|$)".format(pattern, delimiter)
    for no, line in enumerate(lines):
        for match in re.finditer(pattern_and_whitespace, line):
            beginnings[match.start()].append((no, line, match))

    # For each potential column found, verify that it is valid. A valid column
    # will have the maximum of its content text not overlap with any of the
    # following text. We assume that a column will have been formatted to full
    # width and that no text following the line overlaps with the column, even in
    # its trailing whitespace.
    #
    # In other words, the following example is a violation because "10,990.74"
    # overlaps with the end of "Insurance" and so this would not be recognized
    # as a valid column:
    #
    # Expenses:Food:Restaurant     10,990.74 USD
    # Expenses:Health:Dental:Insurance   208.80 USD
    #
    for leftmost_column, column_matches in sorted(beginnings.items()):

        # Compute the location of the rightmost column of text.
        rightmost_column = max(match.end(1) for _, _, match in column_matches)

        # Compute the leftmost location of the content following the column text
        # and past its whitespace.
        following_column = min(match.end() if match.group('ws') else 10000
                               for _, _, match in column_matches)

        if rightmost_column < following_column:
            # We process only the very first match.
            return_matches = [(no, match.group(1).rstrip())
                              for no, _, match in column_matches]
            return return_matches, leftmost_column, rightmost_column
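
The validity check can be tried on the example from the comments. The sketch below uses a compact copy of the listing above; the account pattern and the delimiter are assumptions chosen for this example and are not necessarily the defaults used by the tool.

```python
import collections
import re


def find_column(lines, pattern, delimiter):
    # Compact copy of the listing above, without comments.
    beginnings = collections.defaultdict(list)
    pattern_and_whitespace = "({})(?P<ws>{}.|$)".format(pattern, delimiter)
    for no, line in enumerate(lines):
        for match in re.finditer(pattern_and_whitespace, line):
            beginnings[match.start()].append((no, line, match))
    for leftmost_column, column_matches in sorted(beginnings.items()):
        rightmost_column = max(match.end(1) for _, _, match in column_matches)
        following_column = min(match.end() if match.group('ws') else 10000
                               for _, _, match in column_matches)
        if rightmost_column < following_column:
            return ([(no, match.group(1).rstrip())
                     for no, _, match in column_matches],
                    leftmost_column, rightmost_column)


# Assumed values for illustration only.
PATTERN = r"[A-Z][A-Za-z]*(?::[A-Z][A-Za-z]*)+"
DELIMITER = r"[ \t]+"

short_name = 'Expenses:Food:Restaurant'
long_name = 'Expenses:Health:Dental:Insurance'

# Both amounts start at column 34, past the end of the longest name.
aligned = ['{:<34}{}'.format(short_name, '10,990.74 USD'),
           '{:<34}{}'.format(long_name, '208.80 USD')]
matches, left, right = find_column(aligned, PATTERN, DELIMITER)

# The first amount starts at column 29, inside the span of the longer name.
overlapping = [short_name + ' ' * 5 + '10,990.74 USD',
               long_name + ' ' * 3 + '208.80 USD']
no_column = find_column(overlapping, PATTERN, DELIMITER)
print(matches, left, right, no_column)
```

With the aligned input the function reports a column spanning positions 0 to the length of the longest name; with the overlapping input no candidate passes the check and the result is None, so callers should test for that before unpacking.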

beancount.tools.treeify.render_tree(root)

Render a tree of nodes.

Returns:
  • A pair of (lines, width) – lines: A list of (first_line, continuation_line, node) tuples, where first_line is a string, the first line to render, which includes the account name; continuation_line is a string, a further line to render if necessary; and node is the Node instance which corresponds to this line. width: An integer, the width of the new column. If there is nothing to render, only the empty list is returned, without a width.

Source code in beancount/tools/treeify.py
def render_tree(root):
    """Render a tree of nodes.

    Returns:
      A list of tuples of (first_line, continuation_line, node) where
        first_line: A string, the first line to render, which includes the
          account name.
        continuation_line: A string, further line to render if necessary.
        node: The Node instance which corresponds to this line.
      and an integer, the width of the new columns.
    """
    # Compute all the lines ahead of time in order to calculate the width.
    lines = []

    # Start with the root node. We push the constant prefix before this node,
    # the account name, and the Node instance. We will maintain a stack
    # of children nodes to render.
    stack = [('', root.name, root, True)]
    while stack:
        prefix, name, node, is_last = stack.pop(-1)

        if node is root:
            # For the root node, we don't want to render any prefix.
            first = cont = ''
        else:
            # Compute the string that precedes the name directly and the one below
            # that for the continuation lines.
            #  |
            #  @@@ Bank1    <----------------
            #  @@@ |
            #  |   |-- Checking
            if is_last:
                first = prefix + PREFIX_LEAF_1
                cont = prefix + PREFIX_LEAF_C
            else:
                first = prefix + PREFIX_CHILD_1
                cont = prefix + PREFIX_CHILD_C

        # Compute the name to render for continuation lines.
        #  |
        #  |-- Bank1
        #  |   @@@       <----------------
        #  |   |-- Checking
        if len(node) > 0:
            cont_name = PREFIX_CHILD_C
        else:
            cont_name = PREFIX_LEAF_C

        # Add a line for this account.
        if not (node is root and not name):
            lines.append((first + name,
                          cont + cont_name,
                          node))

        # Push the children onto the stack, being careful with ordering and
        # marking the last node as such.
        if node:
            child_items = reversed(node)
            child_iter = iter(child_items)
            child_node = next(child_iter)
            stack.append((cont, child_node.name, child_node, True))
            for child_node in child_iter:
                stack.append((cont, child_node.name, child_node, False))

    if not lines:
        return lines

    # Compute the maximum width of the lines and convert all of them to the same
    # maximal width. This makes it easy on the client.
    max_width = max(len(first_line) for first_line, _, __ in lines)
    line_format = '{{:{width}}}'.format(width=max_width)
    return [(line_format.format(first_line),
             line_format.format(cont_line),
             node)
            for (first_line, cont_line, node) in lines], max_width
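
The final padding step builds a format string whose field width is itself computed. A minimal sketch of that technique in isolation, using made-up lines rather than real rendered output:

```python
# Made-up lines of different lengths, standing in for rendered first lines.
lines = ['Assets', '  US', '    Checking']

# The doubled braces survive the first format() call as literal braces, so
# the result is a format string with a fixed width, '{:12}' for this input.
max_width = max(len(line) for line in lines)
line_format = '{{:{width}}}'.format(width=max_width)

# Strings are left-aligned by default, so each line is padded on the right.
padded = [line_format.format(line) for line in lines]
print([len(line) for line in padded])  # [12, 12, 12]
```

Padding every line to the same width lets the caller splice the rendered column back into the surrounding text without recomputing alignment.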