Abstract
This paper proposes a cost evaluation model for the existing TB-tree insertion algorithm, called BASEINSERT, and analyzes in detail the factors that may affect its performance. The analysis shows that the number of nodes accessed while searching for the latest node is particularly large. Based on this observation, the paper presents an improved algorithm, IMPRINSERT, which adds cache structures for the root-to-latest-node path and the right-most path; these reduce the number of nodes accessed while searching for the latest node and the right-most path to nearly zero. With bulk data, inserting segment by segment is inefficient, so the paper also introduces a bulk insertion algorithm, BULKINSERT, together with its cost evaluation model. Experiments on a large number of synthetic datasets show that the cost models are reliable and that both IMPRINSERT and BULKINSERT are effective and scale nearly linearly.
