Left, Middle, Right Iterator

Posted in youfolio-interview-questions by Christopher R. Wirz on Fri Jul 20 2012


Note: This post actually went live after my NDA with YouFolio expired and has been modified after the original authorship date. I doubt these interview questions will still be used as the architecture has probably changed a lot.

YouFolio is an online portfolio (e-portfolio) company working with universities and students to showcase their skills and help them land their dream jobs by making their content search engine optimized and verifiable. As an early stage company, it has a strong focus on problem solving and therefore bases its interview questions on real-life YouFolio solutions. If an interviewee solves a problem more efficiently than YouFolio's current implementation, they are often invited to join the team.

Note: YouFolio has had a strong focus on proving concepts work and getting Search Engine Optimized (no database hits and minimal disk I/O for all SEO-d pages!) over algorithmic efficiency. For this reason, YouFolio is seeking talent that can help the company (and its code base) scale.

YouFolio uses a lot of data streams and never wants to go above O(n) time complexity if possible. Iterators are a great way of doing that. Because our data is very sparse, we had two options:

  • Have an array with many empty values and take up more space than needed
  • Use tree structures to efficiently access that data
We picked the later.

Specifically, we can now insert items ad-hoc and they are automatically sorted because our tree structure can be sparse. For example, users sign up in any order, but we want to have them sorted by name without constantly re-making the index. Basically, what we do is take a node on the tree, and

  • If the value is higher, it goes to the right
  • If the value is lower, it goes to the left
  • Every so often, rebalnace such that the median is at the top of the tree

This approach is even more applicable for portfolio content, in which the title often changes and we want to easily add and remove without re-working the index too much. Now, the challenge is accessing this collection in O(n) time complexity as if it were a list. This is done using an iterator.

An iterator is an object that enables traversal of a collection, in order. Here is the API we currently use. This API is in Python because all the big tech companies conduct their interviews in Python.


class Node:    
    left: Node
	right: Node

    
class Iterator: 
    "the iterator will iterate left, middle, right"
	
    def __init__(self, node: Node):
        "constructor to initiate an Iterator, usually taking the top of the tree"
		
    def next(self) -> bool:
        "moves to the next item in the iteration"

    def exists(self) -> bool:
        "checks if there is a value"

    def get_value(self) -> int:
        "gets the current value (an integer)    
            

The goal is to implement this API in a manner that balances insert penalty and read speed. For any given tree following the rules described above, the iterator should list every element in order.

Now let's implement this API.


#!/usr/bin/python2.7

class Node:    
    "holds left and right values, forming a tree"
    def __init__(self, data: int):
        "constructor to initiate a node"
        
        # store data
        self.data = data
        
        # store reference (next item)
        self.left = None
        self.right = None
        return

    def to_string(self) -> str:
        "returns the node as a string"
        s = ""
        s += str(self.data)
        if self.left != None or self.right != None:
            s += " ["
            if self.left != None:
                s += self.left.to_string()
            else:
                s += "null"
            s += " "
            if self.right != None:
                s += self.right.to_string()
            else:
                s += "null"
            s += "]"
        return s

# Define an enum
LEFT = 0
MIDDLE = 1
RIGHT = 2
    
class Iterator: 
    "the iterator will iterate left, middle, right"
    def __init__(self, node: Node):
        "constructor to initiate an Iterator"
        self.node = node
        self.current = None
        self.checking = LEFT
        self.next_called = False
        
    def next(self) -> bool:
        "moves to the next item in the iteration"
        self.next_called = True
        if self.checking == LEFT:
            if self.node.left == None:
                self.checking = MIDDLE
                return True
            else:
                if self.current == None:
                    self.current = Iterator(self.node.left)
                    return True
                else:
                    if self.current.next():
                        return True
                    else:
                        self.current = None
                        self.checking = MIDDLE
                        return True
            
        if self.checking == MIDDLE:
            self.checking = RIGHT
            if self.node.right == None:
                return False
            self.current = Iterator(self.node.right)
            return True

        # LEFT and MIDDLE have been checked, we're on the RIGHT now
        if self.current == None:
            return False
        
        # if we can go next, return true
        if self.current.next():
            return True

        # otherwise, we're done here
        self.current = None
        return False

    def exists(self) -> bool:
        "checks if there is a value"
        if self.next_called == False:
            self.next()
        if self.checking == RIGHT or self.checking == LEFT:
            if self.current == None:
                return False      
        return True

    def get_value(self) -> int:
        "gets the value (an integer)"
        if self.next_called == False:
            self.next()
        if self.checking == RIGHT or self.checking == LEFT:
            if self.current != None:
                return self.current.get_value()
            return

        return self.node.data     
            

Time to invoke the code...


# make the tree
c = Node(4)
c.left = Node(2)
c.left.left = Node(1)
c.left.right = Node(3)
c.right = Node(6)
c.right.left = Node(5)
c.right.right = Node(7)

# make an iterator
i = Iterator(c)

while i.exists():
    print(i.get_value())
    i.next()

print("The value is ", c.to_string())

Now let's see the output...

1
2
3
4
5
6
7
The value is  4 [2 [1 3] 6 [5 7]]

Success! This traverses the collection in O(n) complexity and O(n) space complexity (one node for each element). This is about as fast as you can get for a collection, but it is ordered and the insert penalty is only as bad as the depth of the tree. In other words, this scales horizontally well.

Want to try for a more efficient algorithm? Here's a link to get started.