# On Determination of Balance Ratio for Some Tree Structures

- 641 Downloads

## Abstract

In this paper, we studies the problem to find the maximal number of red nodes of a kind of balanced binary search tree. We have presented a dynamic programming formula for computing *r*(*n*), the maximal number of red nodes of a red-black tree with *n* keys. The first dynamic programming algorithm uses \(O(n^2\log n)\) time and uses \(O(n\log n)\) space. The basic algorithm is then improved to a more efficient *O*(*n*) time algorithm. The time complexity of the new algorithm is finally reduced to *O*(*n*) and the space is reduced to only \(O(\log n)\).

## Keywords

Balanced Ratio Dynamic Programming Formula Complete Binary Search Tree Time Complexity Black Height## 1 Introduction

This paper studies the worst case balance ratio of the red-black tree structure. A red-black tree is a kind of self-balancing binary search tree. Each node of the binary tree has an extra bit, and that bit is often interpreted as the color (red or black) of the node. These color bits are used to ensure the tree remains approximately balanced during insertions and deletions. The data structure was originally presented by Rudolf Bayer in 1972 with its name ’symmetric binary B-tree’ [2]. Guibas and Sedgewick named the data structure red-black tree in 1978, [4]. They introduced the red/black color convention and the properties of a red-black tree at length. A simpler-to-code variant of red-black trees was presented in [1, 5]. This variant of red-black trees was called the variant AA-trees [8]. The left-leaning red-black tree [6] was introduced in 2008 by Sedgewick. It is a new version of red-black tree which eliminated a previously unspecified degree of freedom. Either 2–3 trees or 2–4 trees can also be made isometric to red-black trees for any sequence of operations [6].

## 2 The Basic Properties and Algorithms

A red-black tree of *n* keys is denoted by *T* in this paper. In a red-black tree *T* of *n* keys, *r*(*n*) and *s*(*n*), are defined as the maximal and the minimal number of red internal nodes respectively. It is readily seen that in this case of \(n=2^k-1\), the number of red nodes of *T* achieves its maximum, if the node from the bottom to the top are colored alternately red and black. the number of red nodes of *T* achieves its minimum, if the node from the bottom to the top are all colored black.

*b*(

*n*) can then be,

*n*keys can be denoted by \(\gamma (n,0)\) if root is red, and by \(\gamma (n,1)\) if root is black. We then have,

*j*and size

*i*can be denoted by

*a*(

*i*,

*j*, 0) if root is red and by

*a*(

*i*,

*j*, 1) if root is black. It follows from \(\frac{1}{2}\log n\le j\le 2\log n\) that

### Theorem 1

*a*(

*i*,

*j*, 0) and

*a*(

*i*,

*j*, 1) can be formulated for each \(1\le i\le n, \frac{1}{2}\log (i+1)\le j\le 2\log (i+1)\), as follows.

### Proof

Let *i*, *j* be two indices such that \(1\le i\le n, \frac{1}{2}\log (i+1)\le j\le 2\log (i+1)\). *T*(*i*, *j*, 0) is defined to be a red-black tree of *i* keys and black-height *j*, and its root red. *T*(*i*, *j*, 1) is defined similarly if its root black. The number of red nodes in *T*(*i*, *j*, 0) and *T*(*i*, *j*, 1) can be denoted respectively by *a*(*i*, *j*, 0) and *a*(*i*, *j*, 1).

*T*(

*i*,

*j*, 0) first. The two children of

*T*(

*i*,

*j*, 0) must be black, since its root red. The two subtrees

*L*and

*R*must both have a black-height of

*j*. The subtrees

*T*(

*t*,

*j*, 1) and \(T(i-t-1,j,1)\) which connected to a red node must be a red-black tree of black-height

*j*and

*i*keys. The number of red nodes is thus \(1+a(t,j,1)+a(i-t-1,j,1)\).

*T*(

*i*,

*j*, 0) have a maximal number of red nodes, and thus,

*L*and

*R*are denoted by

*r*(

*L*) and

*r*(

*R*), and the sizes of

*L*and

*R*are

*t*and \(i-t-1\), then we can conclude that \(r(L)\le a(t,j,1)\) and \(r(R)\le a(i-t-1,j,1)\). Therefore we have,

*T*(

*i*,

*j*, 1). The two subtrees

*L*and

*R*must both have a black-heights \(j-1\).

*L*and

*R*have a black root, then \(T(t,j-1,1)\) and \(T(i-t-1,j-1,1)\) must have black-height

*j*and

*i*keys. The number of red nodes must be \(a(t,j-1,1)+a(i-t-1,j-1,1)\). It follows that

*L*and

*R*are denoted by

*r*(

*L*) and

*r*(

*R*), and the sizes of

*L*and

*R*are

*t*and \(i-t-1\), then we can conclude that \(r(L)\le a(t,j-1,1)\) and \(r(R)\le a(i-t-1,j-1,1)\). Therefore we have,

## 3 The Improvement of Time Complexity

Some special pictures of the maximal red-black trees are listed in Fig. 2. It can be observed from these pictures of the maximal red-black trees that some properties of *r*(*n*) are useful.

(1) The maximal red-black tree with *r*(*n*) red nodes of *n* keys as shown above can be realized by a complete binary search tree.

(2) In the maximal red-black tree of *n* keys, the nodes along the left spine are colored red, black, \(\cdots \), alternatively from the bottom to the top. In such a red-black tree, its black-height must be \(\frac{1}{2}\log n\).

The dynamic programming formula in Theorem 1 we can be improved further from above observations. The second loop for *j* can be reduced to \(j=\frac{1}{2}\log i\) to \(1+\frac{1}{2}\log i\), since the black-height of *i* keys must be \(1+\frac{1}{2}\log i\). The time complexity of the algorithm can thus be reduced immediately to \(O(n^2)\).

*T*is a complete binary tree. If

*T*has a size

*n*, then its left subtree must have a size of

*O*(

*n*).

Another efficient algorithm for *r*(*n*) need only \(O(\log n)\) space can be built as follows.

### Theorem 2

*T*of

*n*keys, let

*r*(

*n*) be the maximal number of red nodes in

*T*. The values of \(d(1)=r(n)\) can be formulated as follows.

### Proof

We can label the nodes of a maximal red-black tree like a heap. The root of the tree is labeled 1. The left child of a node *i* is labeled 2*i* and the right child is labeled \(2i+1\). Let *d*(*i*) denote the maximal number of red nodes of *T* and *h*(*i*), denote the height at node *i*. Then we have \(r(n)=d(1)\). It is easy to verify that if \(\frac{i}{2^{\lfloor \log n\rfloor -\lfloor \log i\rfloor }}>n\), then we have \(h(i)=\lfloor \log n\rfloor -\lfloor \log i\rfloor \), otherwise, \(h(i)=1+\lfloor \log n\rfloor -\lfloor \log i\rfloor \).

It is obvious that if \(h(i)\le 1\), then \(d(i)=h(i)\).

*h*(

*i*) is even then the left subtree rooted at node 2

*i*and the right subtree rooted at node \(2i+1\) are both black. If

*h*(

*i*) odd, the four nodes rooted at nodes 4

*i*, \(4i+1\), \(4i+2\) and \(4i+3\) can be all maximal red-black trees. Therefore, if \(h(i)>1\), we have

*r*(

*n*) can be build as the following algorithm.

It can be seen that the time complexity of the algorithm is *O*(*n*), since each node is visited at most once in the algorithm. The space complexity of the algorithm is the stack space used in recursive calls. The depth of recursive is at most \(\log n\), and the space complexity of the algorithm is thus \(O(\log n)\).

*r*(

*n*), the algorithm can be reformulated in a non-recursive form as follows.

It is clear that the time complexities of above two algorithms are both \(O\log n\).

*n*can fit into a computer word, then

*a*(

*n*) and

*o*(

*n*) can be computed in

*O*(1) time.

## 4 Concluding Remarks

We have presented a dynamic programming formula for computing *r*(*n*), the maximal number of red nodes of a red-black tree with *n* keys. The time complexity of the new algorithm is finally reduced to *O*(*n*) and the space is reduced to only \(O(\log n)\).

## Notes

### Acknowledgement

This work was supported by Intelligent Computing and Information Processing of Fujian University Laboratory and Data-Intensive Computing of Fujian Provincial Key Laboratory.

## References

- 1.Andersson, A.: Balanced search trees made simple. In: Dehne, F., Sack, J.-R., Santoro, N., Whitesides, S. (eds.) WADS 1993. LNCS, vol. 709, pp. 60–71. Springer, Heidelberg (1993). doi: 10.1007/3-540-57155-8_236 CrossRefGoogle Scholar
- 2.Bayer, R.: Symmetric binary B-trees, Data structure and maintenance algorithms. Acta Informatica
**1**(4), 290–306 (1972)MathSciNetCrossRefzbMATHGoogle Scholar - 3.Goodrich, M.T., Tamassia, R.: Algorithm Design and Applications. Wiley, Hoboken (2015)zbMATHGoogle Scholar
- 4.Guibas, L.J., Sedgewick, R.: A dichromatic framework for balanced trees. In: 19th FOCS, pp. 8–21 (1978)Google Scholar
- 5.Heejin, P., Kunsoo, P.: Parallel algorithms for redCblack trees. Theor. Comput. Sci.
**262**(1–2), 415–435 (2001)Google Scholar - 6.Sedgewick, R.: Left-leaning RedCBlack Trees. http://www.cs.princeton.edu/rs/talks/LLRB/LLRB.pdf
- 7.Warren, H.S.: Hackers Delight, 2nd edn. Addison-Wesley, New York (2002)Google Scholar
- 8.Weiss, M.A.: Data Structures and Problem Solving Using C++, 2nd edn. Addison-Wesley, New York (2000)Google Scholar