Python source reading (4): Python bytecode

PyCodeObject (what the virtual machine needs at run time: constants, environment, bytecode)

/* code.h */
/* Bytecode object */
typedef struct {
    PyObject_HEAD
    int co_argcount;        /* #arguments, except *args */
    int co_nlocals;         /* #local variables */
    int co_stacksize;       /* #entries needed for evaluation stack */
    int co_flags;           /* CO_..., see below */
    PyObject *co_code;      /* instruction opcodes */
    PyObject *co_consts;    /* list (constants used) */
    PyObject *co_names;     /* list of strings (names used) */
    PyObject *co_varnames;  /* tuple of strings (local variable names) */
    PyObject *co_freevars;  /* tuple of strings (free variable names) */
    PyObject *co_cellvars;  /* tuple of strings (cell variable names) */
    /* The rest doesn't count for hash/cmp */
    PyObject *co_filename;  /* string (where it was loaded from) */
    PyObject *co_name;      /* string (name, for reference) */
    int co_firstlineno;     /* first source line number */
    PyObject *co_lnotab;    /* string (encoding addr<->lineno mapping)
                               See Objects/lnotab_notes.txt for details. */
    void *co_zombieframe;   /* for optimization only (see frameobject.c) */
    PyObject *co_weakreflist; /* to support weakrefs to code objects */
} PyCodeObject;
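
Most of these fields can be read from Python through a function's `__code__` attribute. The following is a minimal sketch rather than anything taken from the CPython sources; the function `add` is a made-up example, and it was written against a current CPython 3, where some fields of the struct above have since been renamed or removed.

```python
def add(a, b=2, *args):
    total = a + b
    return total


code = add.__code__  # the code object behind the function

print(code.co_argcount)    # positional parameters, not counting *args
print(code.co_nlocals)     # local variables, parameters included
print(code.co_varnames)    # names of those local variables
print(code.co_consts)      # constants referenced by the bytecode
print(code.co_name, code.co_filename, code.co_firstlineno)
print(type(code.co_code))  # the raw bytecode
```

Although the header comments describe `co_consts` and `co_names` as lists, at the Python level both are exposed as tuples.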

pyc files (a persisted PyCodeObject)

/* import.c (abridged) */
static void write_compiled_module(PyCodeObject *co, char *cpathname, struct stat *srcstat, time_t mtime)
{
    /* ... opening cpathname as FILE *fp is omitted here ... */
    PyMarshal_WriteLongToFile(pyc_magic, fp, Py_MARSHAL_VERSION);
    /* First write a 0 for mtime */
    PyMarshal_WriteLongToFile(0L, fp, Py_MARSHAL_VERSION);
    PyMarshal_WriteObjectToFile((PyObject *)co, fp, Py_MARSHAL_VERSION);
    /* ... */
}
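
The same layout (magic number, modification time, marshalled code object) can be imitated from Python with the `marshal` module. This is a simplified sketch of the old 2.x-style layout shown above, built in memory instead of on disk; the source string and the `demo.py` file name are placeholders, and current CPython versions store additional header fields, so the result is not a valid pyc for a modern interpreter.

```python
import importlib.util
import marshal
import struct
import time

source = "answer = 6 * 7\n"
code = compile(source, "demo.py", "exec")

# Simplified 2.x-style layout: 4-byte magic, 4-byte mtime, marshalled code.
magic = importlib.util.MAGIC_NUMBER
mtime = int(time.time()) & 0xFFFFFFFF
blob = magic + struct.pack("<I", mtime) + marshal.dumps(code)

# Reading it back: check the magic, read the mtime, unmarshal the rest.
assert blob[:4] == magic
(stored_mtime,) = struct.unpack("<I", blob[4:8])
loaded = marshal.loads(blob[8:])

# The unmarshalled object is a code object again and can be executed.
namespace = {}
exec(loaded, namespace)
print(stored_mtime == mtime, namespace["answer"])
```

The magic number lets the interpreter reject bytecode compiled by an incompatible version, and the stored mtime lets it detect that the source file has changed since the pyc was written.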

Bytecode

  1           0 LOAD_CONST               0 ('A')
              3 LOAD_NAME                0 (object)
              6 BUILD_TUPLE              1
              9 LOAD_CONST               1 (<code object A at 0x7f993a73adb0, file "demo.py", line 1>)
             12 MAKE_FUNCTION            0
             15 CALL_FUNCTION            0
             18 BUILD_CLASS
             19 STORE_NAME               1 (A)

  4          22 LOAD_CONST               2 (<code object func at 0x7f993a662a30, file "demo.py", line 4>)
             25 MAKE_FUNCTION            0
             28 STORE_NAME               2 (func)

  9          31 LOAD_NAME                1 (A)
             34 CALL_FUNCTION            0
             37 STORE_NAME               3 (a)

 10          40 LOAD_NAME                2 (func)
             43 CALL_FUNCTION            0
             46 POP_TOP
             47 LOAD_CONST               3 (None)
             50 RETURN_VALUE

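Note: the first column on the left is the source line number the instruction belongs to. The second column is the offset of the instruction within co_code, the third is the bytecode instruction itself, and the last column is the argument of that instruction.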
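
A listing like this can be produced with the standard `dis` module. The sketch below compiles a two-line placeholder source (not the `demo.py` disassembled above) and prints the offset, instruction and argument columns through `dis.get_instructions`. Opcode names and offsets depend on the interpreter version, so the output will not match the 2.x listing byte for byte.

```python
import dis

source = "a = 1\nb = a\n"
code = compile(source, "demo.py", "exec")

# One row per instruction: offset in co_code, opcode name, raw argument,
# and the human-readable form of that argument.
rows = [
    (ins.offset, ins.opname, ins.arg, ins.argrepr)
    for ins in dis.get_instructions(code)
]
for row in rows:
    print(row)
```

Calling `dis.dis(code)` instead prints the familiar column layout directly, including the source line numbers.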

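  1. A pyc file is a PyCodeObject persisted to disk. Besides the bytecode itself it contains the other fields of the code object (constants, names and so on). The PyCodeObject is persisted only for modules that are imported: the script started with python demo.py is not written out itself, so if demo.py contains no import, nothing is persisted.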

PyFrameObject

[frameobject.h]
typedef struct _frame {
    PyObject_VAR_HEAD
    struct _frame *f_back;      // previous frame in the chain of execution environments
    PyCodeObject *f_code;       // the PyCodeObject being executed
    PyObject *f_builtins;       // builtin namespace
    PyObject *f_globals;        // global namespace
    PyObject *f_locals;         // local namespace
    PyObject **f_valuestack;    // bottom of the evaluation stack
    PyObject **f_stacktop;      // top of the evaluation stack
    PyObject *f_trace;          // trace function, if one is set
    ...

    int f_lasti;                // offset within f_code of the last executed bytecode instruction
    int f_lineno;               // source line of the current bytecode
    int f_iblock;               // current index into f_blockstack
    ...
    PyObject *f_localsplus[1];  // locals + stack, dynamically sized: the space needed for
                                // the local variables and the evaluation stack
} PyFrameObject;
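
Several of these fields are visible from Python on frame objects. The sketch below uses `sys._getframe`, which is a CPython-specific helper; the function names `caller_info` and `outer` are made up for the illustration.

```python
import sys


def caller_info():
    frame = sys._getframe()  # the frame executing caller_info itself
    back = frame.f_back      # the frame that called us (f_back)
    # f_code is the code object run by that frame, f_lineno its current
    # source line, f_locals its local namespace.
    return back.f_code.co_name, back.f_lineno, sorted(back.f_locals)


def outer():
    x = 1
    return caller_info()


name, lineno, local_names = outer()
print(name, lineno, local_names)
```

Following `f_back` repeatedly walks the whole chain of frames, which is how tracebacks are assembled.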

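  1. Python has a single interpreter, which maintains one or more PyThreadState objects. The threads behind these objects take turns using the one bytecode execution engine. To synchronize them, Python uses a global interpreter lock (GIL).
  2. A thread executes a stack of frames (PyFrameObject). PyFrameObject is itself a variable-size object, so the size of each newly created frame may differ. Every frame holds one PyCodeObject, and f_builtins, f_globals and f_locals are three separate namespaces.
  3. LOAD_NAME searches the local, global and builtin namespaces in that order. If the name is found in none of them it is undefined: an exception (NameError) is raised, which ends the run of the Python virtual machine if nothing handles it. This search order is known as the LGB rule.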
    // abridged and modified
    TARGET(LOAD_NAME) {
        PyObject *name = GETITEM(names, oparg);
        PyObject *locals = f->f_locals;
        PyObject *v;

        // 1. local namespace
        v = PyDict_GetItem(locals, name);
        Py_XINCREF(v);

        if (v == NULL) {
            // 2. global namespace
            v = PyDict_GetItem(f->f_globals, name);
            Py_XINCREF(v);
            if (v == NULL) {
                // 3. builtin namespace
                v = PyDict_GetItem(f->f_builtins, name);
                if (v == NULL) {
                    // not found anywhere: NameError
                    format_exc_check_arg(
                        PyExc_NameError,
                        NAME_ERROR_MSG, name);
                    goto error;
                }
                Py_INCREF(v);
            }
        }
        PUSH(v);
        DISPATCH();
    }
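
The LGB order can be observed from Python by running one piece of module-level code, which is compiled to LOAD_NAME, against different namespaces. This is only a sketch: the helper `lookup` and the strings stored under the name `len` are invented for the illustration.

```python
import builtins

# A module-level name reference compiles to LOAD_NAME.
code = compile("result = len", "<demo>", "exec")


def lookup(globals_ns, locals_ns):
    # exec with separate globals and locals exercises all three namespaces.
    exec(code, globals_ns, locals_ns)
    return locals_ns["result"]


# Name present only in builtins.
found_builtin = lookup({"__builtins__": builtins}, {})
# A global with the same name shadows the builtin.
found_global = lookup({"__builtins__": builtins, "len": "from globals"}, {})
# A local with the same name shadows both.
found_local = lookup(
    {"__builtins__": builtins, "len": "from globals"},
    {"len": "from locals"},
)

# A name missing from all three namespaces raises NameError.
try:
    exec(compile("missing_name", "<demo>", "exec"), {"__builtins__": builtins}, {})
    raised = False
except NameError:
    raised = True

print(found_builtin, found_global, found_local, raised)
```

Inside a function body the compiler usually emits other opcodes for local and global names, so this sketch deliberately uses module-level code.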
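  4. In Python's exception handling, the two most important pieces are the virtual machine state represented by why and the PyTryBlock objects stored in f_blockstack of the PyFrameObject. The variable why indicates whether an exception has occurred in the virtual machine, while the PyTryBlock objects indicate whether the programmer has set up except and finally blocks for it. Exception handling in the virtual machine is carried out by why and PyTryBlock working together.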
    except