Add ability to query direction of an operand. #35
Comments
hello! thanks for the issue. the main reason yaxpeax-x86 still doesn't have an "operand direction" kind of feature is that ways i've thought about building it generally seem.. insufficient. a few examples:
the number of special cases in x86 is not that bad, and off the top of my head i don't know other "false positive" cases than those, so looking at operands and matching on the opcode for the rest definitely makes sense for where you're at - conversely i've seen tools regularly misunderstand instructions and report questionable results because everyone's special cases end up missing something 😅
as one example, i don't see a provision in capstone for behavior implied purely from the opcode, like "
so, blending some of my earlier thoughts with what you've described needing... please tell me if this sounds like a good approach to you:
i think there are some nice ways to make that kind of information available, and not terribly hard. do you think you'd find the |
Hi Ixi -- Thanks for the response! Indeed -- I see your point that there are many different ways you can approach this, and all of them have their tradeoffs. I really like the ideas you proposed; they seem like they would give us sort-of a "toolbox" of low-level information about an instruction that we could use to derive the higher-level information we need. That seems preferable to any kind of solution where Yaxpeax tries to imagine what higher-level information its clients might want synthesized together (which may change over time, or be used in new and unexpected ways). Coincidentally, just yesterday @BrianShTsoi and I were testing to see what would happen if we got a crash from using I could definitely see your suggestion of listing the potential faults for an instruction being useful as well -- It might be nice to be able to detect an impossible fault caused by a CPU bug. FWIW I think we would view that as a bit of a lower priority for our Minidump Processor, since Brian will be jammed shortly and unable to make forward progress without the ability to detect whether an instruction is reading or writing to memory (through either implicit or explicit operands).
I can't think of any other instruction information we'd want added right now -- You've been very thorough in thinking through this :) Please advise if there is anything we can do to help out -- It won't be long until Brian is blocked on this, so it might make sense for him to help push this along. Thanks! 😃
Oh, I also wanted to specifically draw attention to this point, because I think it's important. I like the idea of you leaving implicit and explicit operands separate, but definitely there should be documentation explaining that each API only provides a piece of the whole story. FWIW I was corrected by the maintainers of |
i've been thinking about this over the weekend and i'll try to start sketching something out this afternoon or tomorrow - in terms of public surface area this is a very rough draft of what i'd like users to be able to do..
let inst = InstDecoder::default().decode_slice(&[0x33, 0xc1]);
// i'd like to measure the overall amount of added code/data bytes from this,
// `.behavior()` may end up being a cargo flag'd feature much like `serde`, `fmt`, or capstone's "diet" mode
let behavior = inst.behavior();
match behavior.privilege_level() {
    PrivilegeLevel::Only0 => {
        // what happens if a requires-ring-0 instruction is executed in ring 3 varies! monitor, mwait, and
        // one leaf of pconfig will #UD here, otherwise typically it is #GP(0)..
    }
    PrivilegeLevel::ConfigurableRing0 => {
        // RDPMC can be allowed outside ring 0 if a bit in CR4 is set
    }
    PrivilegeLevel::Any => {
        // most instructions can run in any privilege level
    }
}
match behavior.implicit() {
    Special => {
        // the instruction is one which really needs application care to handle its implicit behaviors.
        // yaxpeax-x86 cannot precisely report all architectural effects of the instruction.
        //
        // such instructions include:
        // * `rep {cmps/scas/movs/lods/stos/ins/outs}`: the memory to be accessed is a variable number of potentially-many bytes
        // * `pusha/popa`: these operate on all GPRs - applications likely want the all-read/all-write information
        //   much more than a register-by-register description of the accesses...
        // * `xsave/xrstor/xsaveopt/fxsave/fxrstor`: like above but these have register sets that vary by extension!!
        // * scatter/gather instructions access two or four locations in memory depending on indices the second register
        // probably worth a `.handle_special()` which is documented to best-effort the `Special` cases below in a lossy manner:
        //   `rep movs` etc report the same as without rep
        //   `pusha/popa`, write/read memory through rsp, modify rsp. ignore gprs maybe?
        //   `xsave/xrstor/xsaveopt`: read/write through the memory operand.
    }
    Normal(implicit_behavior) => {
        // it would be nice to have a visitor-like function here, similar to
        // https://github.com/iximeow/yaxpeax-x86/blob/f4ae2ed/src/long_mode/mod.rs#L797-L835
        // so use like "does this read memory" could codegen to something very succinct. but for the mean time..
        for (_op, access) in implicit_behavior.operands() {
            // access is effectively a struct OperandAccess { direction: {Read, Write, ReadWrite}, conditional: bool }.
            // but practically can be stored in three bits with private members..
            if access.write() { // direction is Write or ReadWrite
                ...
            }
        }
    }
}
// and for behavior that comes from the operands of an instruction..
// for instructions like `int 0x13` the operand will look uninteresting
for (op, access) in behavior.operands().accesses() {
    if op.is_memory() {
        if access.write() {
            ..
        } else if access.read() {
            ..
        }
    }
}
// in your case this might actually be the most useful high-level tool!
//
// this would be something like:
// > instruction can inherently #GP or an operand is used in a way that can
// > #GP (e.g. memory operand that actually causes permissions checks)
if !behavior.exceptions().general_protection() {
    .. do something knowing a fault did not come from this instruction ..
}
the implementation behind
Hey @iximeow -- Sorry for the slow reply; I was on vacation for a bit and then it's been non-stop since I've gotten back, so I've finally got around to plowing through my GH inbox. FWIW I love your example there. I think that all of that would be very useful to us on the crash reporter, and I can already see how @BrianShTsoi could immediately make good use of any of that. Have you by any chance been able to find time to work on this? Would it be helpful for you if we were able to get Brian some hours to help you implement some of your ideas? Thanks!
hello! sorry i didn't get back to you sooner, myself. i haven't really gotten a more solid draft than the above together, though i still think it seems like a good way to go; i'm happy to collab with @BrianShTsoi to make it more real! most of the time is probably going to be in filling out tables of instruction info, not likely super thrilling either way :) i do find that using an interface alongside building it helps flush out any obvious errors, so first order of business for me would probably be some standalone similar-to-
I would be happy to help. I don't mind filling out tables of instruction info :)
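For anyone following along, the draft's note that an operand access is "effectively a struct OperandAccess { direction, conditional }" that fits in three bits could look roughly like the sketch below. This is only an illustration: `Direction`, `OperandAccess`, the bit layout, and every method name here are assumptions made for this example, and none of them exist in yaxpeax-x86 today.

```rust
// Hypothetical sketch only: these types are NOT part of yaxpeax-x86.

/// Direction of an operand access, as named in the draft comment.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Direction {
    Read,
    Write,
    ReadWrite,
}

/// Direction plus a "conditional" flag, packed into the low three bits of a byte.
/// Assumed layout: bit 0 = reads, bit 1 = writes, bit 2 = the access may not
/// happen on every execution of the instruction.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct OperandAccess(u8);

impl OperandAccess {
    const READ: u8 = 0b001;
    const WRITE: u8 = 0b010;
    const CONDITIONAL: u8 = 0b100;

    pub fn new(direction: Direction, conditional: bool) -> Self {
        let mut bits = match direction {
            Direction::Read => Self::READ,
            Direction::Write => Self::WRITE,
            Direction::ReadWrite => Self::READ | Self::WRITE,
        };
        if conditional {
            bits |= Self::CONDITIONAL;
        }
        OperandAccess(bits)
    }

    /// True when the direction is `Read` or `ReadWrite`.
    pub fn read(self) -> bool {
        (self.0 & Self::READ) != 0
    }

    /// True when the direction is `Write` or `ReadWrite`.
    pub fn write(self) -> bool {
        (self.0 & Self::WRITE) != 0
    }

    /// True when the access is flagged as conditional.
    pub fn conditional(self) -> bool {
        (self.0 & Self::CONDITIONAL) != 0
    }

    pub fn direction(self) -> Direction {
        match (self.read(), self.write()) {
            (true, true) => Direction::ReadWrite,
            (false, true) => Direction::Write,
            // `new` always sets at least one of the two direction bits,
            // so the only remaining reachable case is read-only.
            _ => Direction::Read,
        }
    }
}
```

Keeping the field private, as the draft suggests, would leave room to change the encoding later without touching callers that only use `read()` and `write()`.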
Hi there,
We are currently working on a feature for Mozilla's Crash Reporter and Minidump Processor. We are trying to detect "impossible crashes" that may be caused by malfunctioning hardware.
For example, if the CPU reports a crash due to an invalid write, but the crashing instruction doesn't write anything, that would most likely be due to the CPU malfunctioning.
The issue is that neither of the Rust disassemblers that we could use (yaxpeax-x86 or iced-x86) currently report the "direction" of the operands of an instruction.
For example, it would be great to know that in the instruction
mov [rax], rbx
the first operand is a "WRITE" operand.
Currently, to do this, we would have to match on every x86 opcode that accesses memory ourselves to determine the nature of each opcode's access (see this PR for a partial example). This is not something we generally want to do, so we are kind of stuck with perhaps switching to a non-Rust disassembler like Capstone (which, to be clear, we don't want to do because it has its own headaches).
Here are the docs for Capstone's take on this feature. You can see that RegAccessType tells you whether each operand is ReadOnly, WriteOnly, or ReadWrite (as in the case of an
add [rax], rbx
instruction).
If there is any way we can offer some help, please let us know... It might actually be easier to fix this in a Rust disassembler rather than trying to switch over to Capstone 😂.
Thanks!
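To make the request concrete, here is a minimal sketch of the "impossible crash" check described above, assuming a disassembler could hand back the list of memory accesses an instruction performs. All of the names (`Access`, `MemAccess`, `FaultKind`, `is_impossible_fault`) are hypothetical and written only for this illustration; nothing here is an existing yaxpeax-x86, iced-x86, or Capstone API, and a real crash analysis would have more cases to consider.

```rust
// Hypothetical sketch: none of these types come from a real disassembler.

/// How an instruction touches one memory location.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Access {
    Read,
    Write,
    ReadWrite,
}

impl Access {
    pub fn reads(self) -> bool {
        matches!(self, Access::Read | Access::ReadWrite)
    }

    pub fn writes(self) -> bool {
        matches!(self, Access::Write | Access::ReadWrite)
    }
}

/// One memory access performed by an instruction. `implicit` marks accesses
/// that do not appear as an explicit operand (e.g. the stack write of `push`).
pub struct MemAccess {
    pub access: Access,
    pub implicit: bool,
}

/// The kind of memory fault reported for the crash.
pub enum FaultKind {
    InvalidRead,
    InvalidWrite,
}

/// Returns true when none of the instruction's memory accesses, explicit or
/// implicit, could have produced the reported kind of fault.
pub fn is_impossible_fault(fault: FaultKind, accesses: &[MemAccess]) -> bool {
    match fault {
        FaultKind::InvalidWrite => !accesses.iter().any(|m| m.access.writes()),
        FaultKind::InvalidRead => !accesses.iter().any(|m| m.access.reads()),
    }
}
```

With this shape, `mov [rax], rbx` would be described by one explicit `Write` access and `add [rax], rbx` by one explicit `ReadWrite` access, so an invalid-write crash at either is plausible. An instruction whose only memory access is a `Read` would make a reported invalid write "impossible". The check deliberately looks at implicit accesses too, since an instruction like `push rbx` writes memory without having a memory operand.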