CMU 15-445: BusHub整体分析

bushub是一个非常好的单机数据库存储引擎部分的学习仓库, 阅读他的源码很有价值, 所以这篇文章用来记录对他的整体性的分析.

整体文件结构分析

首先它并不是一个完整的数据库, 它并不包含sql部分. src下主要是以下部分

buffer是buffer pool的实现, 包括lru替换算法.

catalog相当于数据库的目录, 它包含数据库内表的元数据, 因此这个文件夹内的几个文件就是元数据相关的

common的话非常简单, 表示一些通用结构

concurrency表示并发, 主要包含锁管理器和事务管理器; 这部分看起来很重要但是代码很少, 估计是要在project3自己实现

container中只有一个hash文件夹, 里面实现了两种hash表数据结构, 例如扩展hashtable, 这是用于磁盘存储的, 也是2021版本project2中实现索引的数据结构. 这里没有B+树的实现. 该结构在storage/index中会被使用.

execution则是具体的算子, 如scan, insert, 聚合等等. bushub使用的是火山模型, 每个算子都进行了统一的抽象, 这部分可以深入学习一下结构的设计.

include包含了其他每个文件夹内的头文件. 头文件统一放在这个文件夹内

include中内容最多的是这个executoin, 在非include的execuation文件夹内主要包含的是算子的实现, 然而这个include内除了executors还有其他几个如expressions和plans, 以及最重要的表示执行引擎的execution_engine.h和表示执行上下文的executor_context.h. plans文件夹内是对执行计划的抽象, 而expressions文件夹内则是对表达式的抽象. 这部分需要进一步分析.

recovery是负责容错和恢复的, 包含日志管理, 日志恢复, checkpoint管理等内容.

storage则看起来比较复杂

它包含了存储引擎的存储模块, 主要有:

disk管理器. 一个全局的disk管理
index索引. 索引的实现, 有b+树和hash的版本. b+树的这个看起来也只实现了接口没有具体实现.
page. page在磁盘上组织的数据结构, 有各种类型的page, hash_table在这里有几种页面, 是索引实现需要.
table内是和表相关的结构. 有个table_heap文件, 应该表示数据的存储是表堆的方式, 也就是说和mysql不同, mysql是聚簇索引的方式, 所有数据存储在B+树叶子节点上. 而在这里, b+树以及hash_table都是只作为索引来使用

type是最后一个, 实现了各种数据的类型.

这个对类型系统的抽象值得去学习了解一下, 看type.h, 它提供了丰富的接口.

executor执行器

正如前面介绍的, 它包括了两大部分: include内以及算子的具体实现

executor_engine.h是执行器引擎, 他的Execute方法是启动执行一个执行计划的入口; 传递一个抽象plan节点, 将结果返回到result_set.

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23


  bool Execute(const AbstractPlanNode *plan, std::vector<Tuple> *result_set, Transaction *txn,
               ExecutorContext *exec_ctx) {
    // Construct and executor for the plan
    auto executor = ExecutorFactory::CreateExecutor(exec_ctx, plan);

    // Prepare the root executor
    executor->Init();

    // Execute the query plan
    try {
      Tuple tuple;
      RID rid;
      while (executor->Next(&tuple, &rid)) {
        if (result_set != nullptr) {
          result_set->push_back(tuple);
        }
      }
    } catch (Exception &e) {
      // TODO(student): handle exceptions 所以这里要做什么?
    }

    return true;
  }

这个方法中, 有一个while循环, 不断地从executor中调用Next方法获取下一行数据并添加到result_set. 这就是火山模型典型的用法.

而传递进来的plan参数的作用就是为了调用ExecutorFactory来创建一个Executor算子.

在

1
2


std::unique_ptr<AbstractExecutor> ExecutorFactory::CreateExecutor(ExecutorContext *exec_ctx,
                                                                  const AbstractPlanNode *plan)

中, 它会根据plan的不同type来创建不同的Executor, 对于简单的plan如SeqScan, 他会直接创建

1
2
3


    case PlanType::SeqScan: {
      return std::make_unique<SeqScanExecutor>(exec_ctx, dynamic_cast<const SeqScanPlanNode *>(plan));
    }

而对于其他非底层的如Limit或Insert等, 则会生成child_executor

1
2
3
4
5


    case PlanType::Limit: {
      auto limit_plan = dynamic_cast<const LimitPlanNode *>(plan);
      auto child_executor = ExecutorFactory::CreateExecutor(exec_ctx, limit_plan->GetChildPlan());
      return std::make_unique<LimitExecutor>(exec_ctx, limit_plan, std::move(child_executor));
    }

因为Limit下面是一定有子节点的, Insert的数据分为两种, 一种是直接插入raw数据一种是根据子查询的结构, 因此也是可能有child_executor的.

plan和executor算子的关系

从源码上看, plan和executor都有child结构, 通过上面的代码分析, 发现plan和executor其实是一一对应的关系, executor算子是通过传递进来的plan生成的, 而当算子有child算子时, 也是通过传递plan的child plans进行生成的.

有的plan没有子节点, 如SeqScan, 有的只有一个子节点如Insert, Limit, 有的有两个子节点, 如NestedLoopJoin

1
2
3
4
5
6
7


    case PlanType::NestedLoopJoin: {
      auto nested_loop_join_plan = dynamic_cast<const NestedLoopJoinPlanNode *>(plan);
      auto left = ExecutorFactory::CreateExecutor(exec_ctx, nested_loop_join_plan->GetLeftPlan());
      auto right = ExecutorFactory::CreateExecutor(exec_ctx, nested_loop_join_plan->GetRightPlan());
      return std::make_unique<NestedLoopJoinExecutor>(exec_ctx, nested_loop_join_plan, std::move(left),
                                                      std::move(right));
    }

这里需要注意: abstract plan是有子节点的, 而abstract executor则没有, 所以实现具体的executor时, 需要根据该算子的特点去创建对应的child.

expressions

这个文件夹内表示表达式, 和plan类似, 同样是一个abstract类, 然后一些具体的子类实现.

CMU 15-445: BusHub整体分析

Contents

整体文件结构分析

executor执行器

plan和executor算子的关系

expressions

storage存储

concurrency并发

recovery恢复