Are there static analysis tools for x86 assembler?

akalenuk · June 03, 2013, 08:06:45 PM

I wrote in MASM32 for a while a couple of years ago, and I remember its neat, minimalistic environment. I didn't want anything better at the time, but now that I have gained some experience with modern IDEs, I feel that the lack of a static analyser or an embedded profiler might be uncomfortable. Well, I don't use a profiler every minute, so it's OK to have it as a separate tool. But what about static analysis?

I googled some heavyweight industrial tools like LDRA Testbed or MALPAS, but they seem to work mostly with embedded software. They use their own IL and a model-checking approach, which is great, but I'm looking more for something to tell me about style issues and pieces of dead code.

So the question is: are there any lightweight tools available for static analysis of assembler code?

Tedd · June 03, 2013, 11:54:01 PM

Tools: pencil + paper + Intel instruction manuals + hours to waste

Intel used to have tools as part of the code optimizer suite, but it seems to have been split into multiple products now. And the focus is on higher-level languages, though since they're compiled to asm anyway, I'd expect they should still be useful.

This list may be worth examining in more detail: http://en.wikipedia.org/wiki/List_of_tools_for_static_code_analysis

akalenuk · June 04, 2013, 03:32:25 PM

Well, hours to waste are exactly what I'm trying to save here :-)
I am familiar with this list. These tools are indeed mostly for high-level languages, although LDRA Testbed and MALPAS, for instance, do support assembly code and static analysis of binaries. And this one is even free: http://bitblaze.cs.berkeley.edu/vine.html But it's not quite what I had in mind. They are mostly model-checking or invariant-based validation tools for small pieces of embedded code. I want something more like CppCheck or JLint. Something to help me write tons of code, not to prove its correctness afterwards.

anta40 · June 04, 2013, 10:05:29 PM

Something like a bug checker, maybe?

I do most of my work in Java, and I use FindBugs to clean my code.
It would be nice if there were such a tool for assembly language.

akalenuk · June 04, 2013, 10:56:35 PM

Yes, FindBugs is a fine example.

jj2007 · June 05, 2013, 05:05:46 AM

Quote from: anta40 on June 04, 2013, 10:05:29 PM
I use FindBugs to clean my code.
It would be nice if there were such a tool for assembly language.

Assembly code is either buggy, and Masm will bark at you, or it's "correct", and it will assemble without errors. I am usually an optimist, but try to find an example where software could spot a "bug" in assembly... almost impossible at such a low level.

akalenuk · June 05, 2013, 05:00:48 PM

Agner Fog lists 20 common pitfalls in assembly code. Some of the checks, such as matching push/pop or ensuring a ret before endp, can be done automatically. Of course there could be situations where one would want to leave a push/pop unmatched, but false positives are quite common in high-level static analysis as well. It is cheaper to suppress every false positive than to let a silly bug such as these into the code.
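A minimal sketch of what such an automatic check might look like, written in Python rather than assembly. It is not an existing tool: the function name `check_procs` is made up for illustration, the push/pop count is purely linear (it ignores branches and `invoke`), and a `;` inside a string literal would confuse the comment stripping.

```python
def check_procs(source: str) -> list[str]:
    """Report push/pop imbalance and a missing ret before endp, per proc."""
    warnings = []
    name, depth, last_op = None, 0, ""
    for lineno, raw in enumerate(source.splitlines(), start=1):
        words = raw.split(";", 1)[0].lower().split()   # drop ';' comments
        while words and words[0].endswith(":"):
            words = words[1:]                          # skip leading labels
        if not words:
            continue
        if len(words) >= 2 and words[1] == "proc":
            name, depth, last_op = words[0], 0, ""     # a new proc starts
        elif len(words) >= 2 and words[1] == "endp" and name is not None:
            if depth != 0:
                warnings.append(f"{name}: push/pop imbalance of {depth:+d}")
            if not last_op.startswith("ret"):
                warnings.append(f"{name}: no ret before endp (line {lineno})")
            name = None
        elif name is not None:
            last_op = words[0]                         # remember the last mnemonic
            if last_op == "push":
                depth += 1
            elif last_op == "pop":
                depth -= 1
    return warnings
```

Fed the `cheat` proc posted further down in this thread, it returns `['cheat: push/pop imbalance of +1']`. Registers saved by `uses` are not counted, since the assembler generates those pushes and pops itself.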

Also, modern assembly implies using a lot of user-defined entities, such as procedures, variables, labels and so on. It would be nice to put them under some kind of supervision. Say I want to calculate the variables 'very_importrant_x' and 'very_importrant_y'. As I am quite lazy, I do the first one and then copy-paste it for the second, forgetting to change 'very_importrant_x' to 'very_importrant_y'. A static analysis tool might find that 'very_importrant_y' is declared but never initialized. That is not a bug from the machine's point of view, but it sure is one logically.
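The copy-paste slip above could be caught by something as crude as the following Python sketch. Everything here is an assumption made for illustration: the name `never_written`, the set of mnemonics treated as writes, and the rule that only `name dd ?` style declarations count as uninitialized data. Indirect writes through a pointer or via `invoke` are not seen at all.

```python
import re

# Uninitialized data definitions such as "very_importrant_y dd ?"
DATA_DEF = re.compile(r"^\s*(\w+)\s+(?:db|dw|dd|dq)\s+\?\s*$", re.IGNORECASE)
# Optional label, mnemonic, then the first (destination) operand
INSTR = re.compile(r"^\s*(?:\w+:\s*)?(\w+)\s+([^,]+)")
# Assumed, incomplete set of mnemonics that write their first operand
WRITES = {"mov", "add", "sub", "inc", "dec", "and", "or", "xor", "pop"}

def never_written(source: str) -> list[str]:
    """Return declared-but-never-written variable names, in declaration order."""
    declared, written = [], set()
    for raw in source.splitlines():
        code = raw.split(";", 1)[0]                 # drop ';' comments
        m = DATA_DEF.match(code)
        if m:
            declared.append(m.group(1))
            continue
        m = INSTR.match(code)
        if m and m.group(1).lower() in WRITES:
            # every identifier in the destination operand counts as written
            written.update(re.findall(r"\w+", m.group(2)))
    return [name for name in declared if name not in written]

sample = """
.data?
very_importrant_x dd ?
very_importrant_y dd ?
.code
mov eax, 10
mov very_importrant_x, eax
mov eax, 20
mov very_importrant_x, eax   ; pasted line, never changed to _y
"""
print(never_written(sample))
```

For the sample it prints `['very_importrant_y']`, which is exactly the warning one would want to see before the value is used.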

Of course this type of analysis would be neither comprehensive nor reliable but, well, none of them are.

jj2007 · June 05, 2013, 06:54:54 PM

Quote from: akalenuk on June 05, 2013, 05:00:48 PM
checking push/pop matching, or ensuring ret before endp ... 'very_importrant_y' being instanced but never initialized.

Points taken. Shouldn't be too difficult, actually. Here is a little test, creating a list of procs identified by proc/endp:
line 14152 proc 143 SetParaFormat
line 14185 proc 144 GetParaFormat
line 14194 proc 145 InsDate
line 14217 proc 146 SetFontSizeP
line 14230 proc 147 SetFontFace
line 14249 proc 148 SetTabs
line 14279 proc 149 a <<<<<<<<< that was a COMMENT @ ... a proc ...@
line 14357 proc 150 SetSelFormat
line 14419 proc 151 ShowWinErr
line 14442 proc 152 UpdateRE
line 14468 proc 153 ClearLocVars
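The false hit at line 14279 comes from a `proc` keyword inside a COMMENT block. A hedged Python sketch of a proc lister that skips such blocks follows; it is not the code that produced the list above, and `list_procs` is a made-up name. It relies on MASM's COMMENT directive taking the first non-blank character after the keyword as the delimiter and ignoring everything up to and including the line that holds the second delimiter.

```python
def list_procs(source: str) -> list[tuple[int, str]]:
    """Return (line number, name) for each proc, ignoring commented-out text."""
    procs = []
    delimiter = None                        # active COMMENT delimiter, if any
    for lineno, raw in enumerate(source.splitlines(), start=1):
        if delimiter is not None:
            if delimiter in raw:
                delimiter = None            # the closing line is still comment
            continue
        words = raw.split(";", 1)[0].split()    # drop ';' comments
        if not words:
            continue
        if words[0].lower() == "comment" and len(words) > 1:
            delimiter = words[1][0]
            rest = raw.split(delimiter, 1)[1]
            if delimiter in rest:
                delimiter = None            # opened and closed on one line
            continue
        if len(words) >= 2 and words[1].lower() == "proc":
            procs.append((lineno, words[0]))
    return procs

sample = """SetTabs proc
ret
SetTabs endp
COMMENT @ ... a proc ...@
comment #
old proc
#
SetSelFormat proc uses esi
ret
SetSelFormat endp"""
print(list_procs(sample))
```

On the sample this prints `[(1, 'SetTabs'), (8, 'SetSelFormat')]`, so both the one-line and the multi-line comment are skipped.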

hutch-- · June 05, 2013, 07:08:32 PM

There is a certain discipline for most languages: in C, for every opening brace "{" you need a closing brace "}", and if you exercise that discipline, you avoid a problem that can be hard to debug later. Assembler code is a very rabid version of the same problem: write it the right way the first time and test each part to make sure it works correctly. What you don't do is a particularly sloppy form of high-level language work, where you slop any old junk out and then try to fix it with toys after it is written. This accounts for a lot of rubbish code that is never very reliable.

With only a rare exception, for every PROC you write an ENDP preceded by a RET (some folks like to live dangerously and jump out of a proc, but they mess up the CALL/RET pairing). You can check whether you have balanced the stack by testing ESP before and after a procedure call, but the simple discipline is to write matching push/pop sequences BEFORE you write the code between them.

JJ is correct in that if you mess up assembler code it goes BANG which is still the best way to find out if you have done something wrong.

jj2007 · June 05, 2013, 09:03:17 PM

Quote from: hutch-- on June 05, 2013, 07:08:32 PM
JJ is correct in that if you mess up assembler code it goes BANG which is still the best way to find out if you have done something wrong.

The BANG can come a bit delayed, though:

include \masm32\include\masm32rt.inc

.code
cheat proc uses esi edi ebx arg1
; LOCAL abc - no locals, please
   push eax   ; deliberately unbalanced: no matching pop
   ret
cheat endp

start:   mov esi, 11111111
   mov edi, 22222222
   mov ebx, 33333333
   invoke cheat, 123
   print str$(esi), 9, "esi", 13, 10
   print str$(edi), 9, "edi", 13, 10
   print str$(ebx), 9, "ebx", 13, 10
   exit
end start

No BANG. But try to build & run it with JWasm ;)

anta40 · June 05, 2013, 09:30:01 PM

Quote from: jj2007 on June 05, 2013, 09:03:17 PM
No BANG. But try to build & run it with JWasm ;)

ML 11:

22222222        esi
33333333        edi
1990847834      ebx

jwasm 2.10:
BANG!

In this case, a tool that could hint at why such a crash occurs would be nice.

Antariy · June 07, 2013, 01:07:18 AM

JWasm obviously does not create an ebp-based stack frame if there are no locals and no formal procedure parameters are used in the proc's code (I did not check this in practice, though). No ebp-based frame => no esp restoration at "ret" => the return goes to wherever eax points. MASM creates the frame regardless of whether the params are used, so esp is restored at the moment of return, but the buggy side effect of the imbalanced stack shows up in any pops, in this case the pops of the USES statement (uses esi edi ebx: the prologue's push ebp \ mov ebp, esp \ push esi \ push edi \ push ebx => imbalance from push eax => the epilogue's pop ebx (ebx trashed with the eax value) \ pop edi (edi trashed with the ebx value) \ pop esi (esi trashed with the edi value) \ leave (the same as mov esp, ebp \ pop ebp) \ ret (x)).
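The MASM half of this explanation can be replayed on paper, or with a small Python model of the stack. This is only a sketch: the function name `run_cheat_masm` and the placeholder values for the saved ebp and the return address are invented, and the stack is a plain list whose end is the top.

```python
def run_cheat_masm(esi, edi, ebx, eax, ebp=0xBBBB, ret_addr=0xAAAA, arg=123):
    """Replay prologue, the stray push and the LEAVE-based epilogue."""
    stack = [arg, ret_addr]           # state right after 'invoke cheat, 123'
    regs = {"esi": esi, "edi": edi, "ebx": ebx, "eax": eax, "ebp": ebp}

    # prologue: push ebp / mov ebp, esp / push esi / push edi / push ebx
    stack.append(regs["ebp"])
    frame = len(stack)                # plays the role of 'mov ebp, esp'
    for r in ("esi", "edi", "ebx"):
        stack.append(regs[r])

    stack.append(regs["eax"])         # body: the unmatched 'push eax'

    # epilogue: pop ebx / pop edi / pop esi / leave / ret
    for r in ("ebx", "edi", "esi"):
        regs[r] = stack.pop()         # each pop is off by one slot
    del stack[frame:]                 # leave, part 1: mov esp, ebp
    regs["ebp"] = stack.pop()         # leave, part 2: pop ebp
    returned_to = stack.pop()         # ret
    return regs, returned_to

regs, target = run_cheat_masm(11111111, 22222222, 33333333, eax=0x1234)
print(regs["esi"], regs["edi"], hex(regs["ebx"]), hex(target))
```

With the values from the test program it prints `22222222 33333333 0x1234 0xaaaa`: the return address is still correct because LEAVE resets esp, while esi and edi have shifted by one register and ebx holds whatever eax contained. That matches the ML output posted above.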

Good example, Jochen, thanks :t Interesting detail about the MASM<->JWasm differences (if JWasm really does not create a frame when no params / locals are used; if so, this is actually a better optimization than MASM does).

jj2007 · June 07, 2013, 01:23:20 AM

Hi Alex,

Yes, that's the point, thanks for the detailed explanation :t

In the meantime, I removed the test in reply #7 and instead posted a new one here.

hutch-- · June 07, 2013, 09:56:46 AM

Must be a difference in brain: if I want a stack frame I write it that way, and if I don't I also write it that way. When I hear "optimisation" mentioned with an assembler I think "YUK"; if I wanted that I would be using a compiler.

MASM is at least consistent here, if you use PROC / ENDP you get a stack frame, if you use the OPTION notation you turn the stack frame off.

japheth · June 07, 2013, 04:50:06 PM

I had to read a few misconceptions here concerning the behavior of some assemblers that I'll try to correct:

Both masm and jwasm generate no stack frame if a proc has neither parameters nor locals. You can force the assemblers to generate a frame anyway with the user parameter <forceframe>.
Both masm and jwasm will generate a stack frame if either parameters or locals are detected.
jwasm will, on certain occasions, generate a simple "POP EBP" instead of "LEAVE" as epilogue code. This is not to reduce "cycle counts" or save space but to make debugging easier: it's usually easier to fix a crash than to have some register contents being silently destroyed. If you don't want this feature, it can be turned off with option -Zg.

Hope this helps, boys!
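Why a plain POP EBP turns the silent corruption into a crash can be seen from a short Python model of the stack. It is a sketch under an assumption the thread does not confirm, namely that JWasm picked the POP EBP epilogue for the `cheat` proc; the function name and the placeholder addresses are invented.

```python
def ret_target_with_pop_ebp(esi, edi, ebx, eax, saved_ebp=0xBBBB, ret_addr=0xAAAA):
    """Return the value that 'ret' consumes when the epilogue ends in POP EBP."""
    # Stack just before the epilogue, top of stack last: return address,
    # saved ebp, the three USES registers, then the stray 'push eax'.
    stack = [ret_addr, saved_ebp, esi, edi, ebx, eax]
    for _ in ("ebx", "edi", "esi", "ebp"):   # pop ebx / pop edi / pop esi / pop ebp
        stack.pop()                          # esp is never reset, so every pop is one slot off
    return stack.pop()                       # ret takes whatever is on top now

print(hex(ret_target_with_pop_ebp(11111111, 22222222, 33333333, eax=0x1234)))
```

It prints `0xbbbb`: in this model `ret` picks up the caller's saved ebp instead of the return address, so execution jumps into the stack and the program goes BANG at once rather than carrying on with trashed registers.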

The MASM Forum
