Monday, January 11, 2016

[Quick Note] V8 JavaScript Engine's First-Stage Compiler

V8 is probably, in my opinion, the fastest JavaScript runtime at the time of writing (no offense, SpiderMonkey; benchmarks differ from one to another, but V8 gets the best grade on average). There are some well-known properties that make it run so fast:

  • No interpreter; a baseline compiler is used instead. (This characteristic has just been subverted on the bleeding edge of the development tree, though.)
  • No intermediate representation (IR) in the baseline compiler. The optimizing (second-stage) compiler does use its own IR, however.
  • A method-based JIT strategy. This is fairly controversial, but as we will see later, using the function as the compilation scope makes lots of work easier compared with a trace-based strategy, which spends lots of effort on determining trace boundaries.
  • Hidden classes. In short, when a variable modifies its object layout, by adding a new field for example, V8 creates a new layout (class) instead of mutating the original one (e.g. using a linked list to "chain" object fields, as in many implementations of early-era JavaScript runtimes).
  • Code patching and inline caches. These two features are less novel, since they have already been used in lots of dynamically typed runtimes for over twenty years.
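To make the hidden-class idea concrete, here is a toy model in plain JavaScript. Everything in it (the `Shape` class, `transitionTo`) is invented for illustration; V8's real implementation lives in C++ and calls these structures "maps". The key point is that two objects built by taking the same sequence of property additions end up sharing the same hidden class, which is what makes inline caching pay off.

```javascript
// Toy model of hidden classes: every object points to a shape, and adding
// a property transitions to a new *shared* shape instead of mutating the
// old one. All names here are made up for illustration, not V8 APIs.
class Shape {
  constructor(parent, propName) {
    this.parent = parent;
    this.propName = propName;
    this.transitions = new Map(); // propName -> next Shape (shared)
  }
  transitionTo(propName) {
    if (!this.transitions.has(propName)) {
      this.transitions.set(propName, new Shape(this, propName));
    }
    return this.transitions.get(propName);
  }
}

const emptyShape = new Shape(null, null);

function makePoint(x, y) {
  const obj = { shape: emptyShape, fields: [] };
  obj.shape = obj.shape.transitionTo("x"); obj.fields.push(x);
  obj.shape = obj.shape.transitionTo("y"); obj.fields.push(y);
  return obj;
}

const p = makePoint(1, 2);
const q = makePoint(3, 4);
console.log(p.shape === q.shape); // → true: same transition chain, same shape
```

Because `p` and `q` share a shape, a property access site that has seen `p` can cache "field index 1 for `y`" and reuse it for `q` without another lookup; that is the essence of an inline cache.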

I'm going to talk about the baseline compiler. The code journey starts from the Script::Compile API which, as the name suggests, compiles the script. V8 has a fabulous characteristic: it can be used standalone, which is the very key that bore Node.js. You can check out V8's embedder guide for related resources.

After several assertion checks, the main compilation flow comes to Compiler::CompileScript (compiler.cc:1427). One of the important things done here is checking the code cache (compiler.cc:1465):

    // First check per-isolate compilation cache.
    maybe_result = compilation_cache->LookupScript(
        source, script_name, line_offset, column_offset, resource_options,
        context, language_mode);
    if (maybe_result.is_null() && FLAG_serialize_toplevel &&
        compile_options == ScriptCompiler::kConsumeCodeCache &&
        !isolate->debug()->is_loaded()) {
      // Then check cached code provided by embedder.
      HistogramTimerScope timer(isolate->counters()->compile_deserialize());
      Handle<SharedFunctionInfo> result;
      if (CodeSerializer::Deserialize(isolate, *cached_data, source)
              .ToHandle(&result)) {
        // Promote to per-isolate compilation cache.
        compilation_cache->PutScript(source, context, language_mode, result);
        return result;
      }
      // Deserializer failed. Fall through to compile.
    }
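The flow above is the classic two-level lookup-or-compute pattern: check the fast per-isolate cache, then the slower serialized cache (promoting hits into the fast one), and only compile as a last resort. A minimal sketch of that pattern, with all names invented for illustration:

```javascript
// Two-level cache lookup mirroring V8's flow: in-memory cache first, then a
// "serialized" cache (promoting hits), falling through to a real compile.
// All names here are invented for illustration.
function getCompiled(source, memCache, diskCache, compile) {
  if (memCache.has(source)) return memCache.get(source);
  if (diskCache.has(source)) {
    const result = diskCache.get(source); // stands in for deserialization
    memCache.set(source, result);         // promote to the faster cache
    return result;
  }
  const result = compile(source);
  memCache.set(source, result);
  return result;
}

const mem = new Map();
const disk = new Map([["1+1", "code(1+1)"]]);
let compiles = 0;
const compile = (s) => { compiles++; return `code(${s})`; };
getCompiled("1+1", mem, disk, compile); // served from the "disk" cache
getCompiled("1+1", mem, disk, compile); // now served from memory
console.log(compiles); // → 0: the compiler was never invoked
```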

If there is previously compiled code (a snapshot, in V8's terms), it is deserialized from disk. Otherwise, the real compilation procedure is triggered, namely CompileToplevel (compiler.cc:1242). In this function, parsing information and compilation information are set up and passed to CompileBaselineCode (compiler.cc:817). Let's ignore Compiler::Analyze for now and jump to GenerateBaselineCode.

Here come some interesting things (compiler.cc:808):
    static bool GenerateBaselineCode(CompilationInfo* info) {
      if (FLAG_ignition && UseIgnition(info)) {
        return interpreter::Interpreter::MakeBytecode(info);
      } else {
        return FullCodeGenerator::MakeCode(info);
      }
    }

I said earlier that one of V8's advantages is entering the compilation process directly, without interpreting first. So why is there code related to an interpreter? It turns out that V8 is also getting an interpreter! Here is a brief introduction to Ignition, the new interpreter engine. But let's stay on the compilation part first.

So now we come to FullCodeGenerator::MakeCode (full-codegen/full-codegen.cc:26). There are a few important things here that we'll come back to; for now, let's focus on FullCodeGenerator::Generate.
Now here's the exciting part. FullCodeGenerator::Generate is an architecture-specific function, which resides in source files under separate folders like arm, x64, etc. Let's pick a few lines of code from the x64 version:
      if (locals_count >= 128) {
        Label ok;
        __ movp(rcx, rsp);
        __ subp(rcx, Immediate(locals_count * kPointerSize));
        __ CompareRoot(rcx, Heap::kRealStackLimitRootIndex);
        __ j(above_equal, &ok, Label::kNear);
        __ CallRuntime(Runtime::kThrowStackOverflow);
        __ bind(&ok);
      }

Pretty familiar, right? It directly generates the assembly code!
Normally this sort of approach appears in educational projects, like homework in a college compiler course. Mature compiler frameworks usually run many passes before emitting native code. Generating native assembly directly is thus one of the most important factors that speed up V8's baseline compilation.
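To get a feel for that emission style, here is a toy "macro assembler" sketched in JavaScript. It emits textual x64-ish instructions the way full-codegen's MacroAssembler emits machine code, and it replays the stack-overflow check from the snippet above. Every name here is invented for illustration (and the real MacroAssembler emits bytes, not strings):

```javascript
// Toy macro assembler: each method appends one instruction to a buffer,
// mirroring how full-codegen emits code directly with no IR in between.
// All names are invented for illustration.
class ToyMasm {
  constructor() { this.code = []; }
  movp(dst, src) { this.code.push(`movp ${dst}, ${src}`); }
  subp(dst, imm) { this.code.push(`subp ${dst}, ${imm}`); }
  j(cond, label) { this.code.push(`j${cond} ${label}`); }
  callRuntime(fn)  { this.code.push(`call ${fn}`); }
  bind(label)    { this.code.push(`${label}:`); }
}

const masm = new ToyMasm();
// The same stack check as above, emitted as text (assuming 128 locals and
// an 8-byte pointer size on x64):
masm.movp("rcx", "rsp");
masm.subp("rcx", 128 * 8);
masm.j("ae", "ok");
masm.callRuntime("ThrowStackOverflow");
masm.bind("ok");
console.log(masm.code.join("\n"));
```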
Another thing worth mentioning is the "__" prefix on several of the lines above. It's a macro defined in full-codegen/full-codegen.cc:

    #define __ ACCESS_MASM(masm())

Here MASM does not mean Microsoft's MASM, whose first 'M' stands for Microsoft. Our MASM stands for macro assembler, which is responsible for generating the assembly code.

Back to FullCodeGenerator::Generate. VisitStatements seems to be the code emitter for the JS function body. Let's stop the code tracing here and step back to look at the declaration of FullCodeGenerator (full-codegen/full-codegen.h):

    class FullCodeGenerator: public AstVisitor
It turns out that the code generator itself is a huge AST visitor! And VisitStatements is the top-level visitor function that dispatches different parts of the code to visitors like FullCodeGenerator::VisitSwitchStatement, FullCodeGenerator::VisitForInStatement, or FullCodeGenerator::VisitAssignment, to name a few.
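The visitor-based emitter can be sketched in a few lines: one `visit` method per AST node type, dispatched on the node's kind, each returning the instructions for its subtree. The node shapes and method names below are invented for illustration; full-codegen does the same thing in C++ over V8's real AST classes.

```javascript
// Minimal sketch of a visitor-style code generator: dispatch on node type,
// recurse into children, emit instructions bottom-up. All names invented.
class CodeGen {
  visit(node) { return this[`visit${node.type}`](node); }
  visitLiteral(node) { return [`push ${node.value}`]; }
  visitBinaryOp(node) {
    return [...this.visit(node.left), ...this.visit(node.right), node.op];
  }
  visitStatements(nodes) { return nodes.flatMap((n) => this.visit(n)); }
}

// A tiny AST for the expression 1 + 2:
const ast = { type: "BinaryOp", op: "add",
              left:  { type: "Literal", value: 1 },
              right: { type: "Literal", value: 2 } };

console.log(new CodeGen().visitStatements([ast]));
// → [ 'push 1', 'push 2', 'add' ]
```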

Last but not least, some people say Ignition, the interpreter mentioned before, is not like the ones in SpiderMonkey or JavaScriptCore (Safari). I'm really curious about that; maybe I'll write another article about it :)

