Writing a VM capable of executing expression trees is different from writing a VM for executing assembly instructions. Here I’ll cover several issues stemming from this difference.
The first group of issues involve generality. Supporting a specific instruction set is a straightforward job, even if a hard one. Vial has to support multiple platforms and that’s a little tricky. These are just a few of the problems:
- Different registers for each instruction set. This one is easy enough to solve. Just have a node in the expression tree with some value to indicate the register.
- Register overlaps. A change to one register implies a change to its parents and its children. Consider RAX->EAX->AX->AH,AL. Changing EAX will affect all of the registers in the hierarchy.
To handle this, we wrote a CPU class to keep all the info about the platform, including register overlaps. - Delayed branches. Some platforms have branch delay slots. This means that after any branch instruction, right before the branch is taken, instructions in the delayed slots are executed anyway. For instance, SPARC has three delay slots, while MIPS has just one. This isn’t an easy issue to solve, and for now we didn’t tackle it. We’ve got a few ideas though.
To make sure that our implementation is generic enough, we decided to write a skeleton disassembler implementation for MIPS as well.
The second group of issues involve the nature of expression trees versus instructions:
- Stepping over statements or instructions? Each expression tree for an instruction usually holds more than one statement. For example, dec eax changes eax as well as the zero flag. Since some instructions like rep stosd may contain a long loop, being able to step over statements instead of expressions is preferable.
The problem is, executing expression trees is done with a DFS-like walk. If implemented with recursion it makes pausing the walk for each step a bit complicated. However, recursion is the clearest way to write the execution code, and I’d rather not give it up.
My solution was to use Python generators. Each time a statement was executed, the function would yield, thus returning control to the calling function, while keeping its state intact.
- Instruction Pointer changes. The lower-level disassembler returns expression trees for each instruction. However, it does not return the jump to the next instruction as well. While this is the required behavior, it means that the VM should change the instruction pointer after executing each instruction.
This is easier said than done: What should we do after jumps? Several possible solutions emerged.
The first was to append an IP change to each instruction’s expression. Those changes will have to be appended only where needed.
A second solution was to check if the IP was changed, and if it was not, change it. This solution however will not support instructions that jump to themselves.
The last and our preferred solution was to check if the IP was touched. This solution is simple and very straightforward.
There are many more issues I didn’t write about, for example, self modifying code. I’ll leave those issues for future articles.
Regarding the 2nd 2:
Appending a change to the IP should work for all instructions, if you follow the IP to figure out what statement to execute next.
The real solution however (and you should know it), for Intel at least, is *prepending* the change to the IP for all instructions. It works for absolute jumps, and especially relative ones: jumps/calls are relative to the end of the instruction.
You are right in regards to the real solution, this is a very good point. However, it might apply differently to other processors. It might still be workable there by changing the instructions’ expression trees. In any case I’ll probably have to work it in. Thanks!
As to appending the change to the IP:
This solution was workable, but more work, and it meant modifying the expressions. The third solution is more general, and it seems to me to be more elegant.
Ahh Eres nice idea. But if we care about performance, I’m not sure whether inserting an expression all the times, where sometimes they won’t be used, is good enough. Though on the other hand, it depends on how quick you know if there was a change.
The easiest, ok not the easiest, but the best solution would be to have an IP register and a shadow IP register. The shadow register will contain the current IP, and will be used for rvalues only. The IP register will be reset to undefined every instruction. Then if the real IP register was set, you will know to use it. Otherwise continue to next instruction, and set the shadow one again.
This wasy even if an instruction jumps to itself it will keep on working…
Ah I forgot to mention that you will have to change the expression trees so that all IP references as rvalues will use the shadowed IP..
My solution was just checking if the IP register was *touched*.
The CPU class knows what is the IP register, and the VM can ask it about it.
Each time a register is modified, the VM-state holder checks if it’s the eip register, and if it is, it sets a flag to True. Before each instruction, the VM checks to see if the flag is false. If it is, it modifies the IP. If it isn’t it keeps the IP value, and resets the flag back to false.
In effect, this is a hook by the VM state on register changes. I know it special-cases the IP register, but I think that’s OK.
Also, Arkon: the change to eip Erez suggested may be done by the VM manually. It doesn’t necessarily involve changing the expressions.
I’ve just discovered your blog. Fantastic! Keep on writing.