Are there static analysis tools for x86 assembler?

Started by akalenuk, June 03, 2013, 08:06:45 PM

akalenuk

I wrote in MASM32 for a while a couple of years ago, and I remember its neat, minimalistic environment. I didn't want anything better at the time, but now that I have gained some experience with modern IDEs, I feel that the lack of a static analyser or an embedded profiler might be uncomfortable. Well, I don't use a profiler every minute, so it's OK to have it as a separate tool. But what about static analysis?

I googled some heavyweight industrial tools like LDRA Testbed or MALPAS, but they seem to work mostly with embedded software. They use their own IL and a model-checking approach, which is great, but I'm looking more for something that tells me about style issues and pieces of dead code.

So the question is: are there any lightweight tools for static analysis of assembler code available?

Tedd

Tools: pencil + paper + Intel instruction manuals + hours to waste :badgrin:

Intel used to have tools as part of the code optimizer suite, but it seems to have been split into multiple products now. And the focus is on higher-level languages, though since they're compiled to asm anyway, I'd expect they should still be useful.

This list may be worth examining in more detail: http://en.wikipedia.org/wiki/List_of_tools_for_static_code_analysis
Potato2

akalenuk

Well, hours to waste are exactly what I'm trying to save here :-)
I am familiar with this list. These tools are indeed mostly for high-level languages, although LDRA Testbed and MALPAS, for instance, do support assembly code and binary static code analysis. And this one is even free: http://bitblaze.cs.berkeley.edu/vine.html But it's not quite what I had in mind. They are mostly model-checking or invariant-based validation tools for small pieces of embedded code. I want something more like CppCheck or JLint: something to help me write tons of code, not to prove its correctness afterwards.

anta40

Something like a bug checker, maybe?

I do most of my work in Java, and I use FindBugs to clean my code.
It would be nice if there were such a tool for assembly language.


jj2007

Quote from: anta40 on June 04, 2013, 10:05:29 PM
I use FindBugs to clean my code.
It would be nice if there were such a tool for assembly language.

Assembly code is either buggy, and Masm will bark at you, or it's "correct", and it will assemble without errors. I am usually an optimist, but try to find an example where software could spot a "bug" in assembly... almost impossible at such a low level.

akalenuk

Agner Fog lists 20 common pitfalls in assembly code. Some of them, such as checking push/pop matching or ensuring there is a ret before endp, can be checked automatically. Of course there could be situations where one wants to leave a push/pop unmatched, but false positives are quite common in high-level static analysis as well. It is cheaper to suppress every false positive than to let a silly bug like these into the code.
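A minimal sketch of what such an automatic check could look like, written in Python as a stand-alone script. It is not an existing tool: the function name lint_procs and the warning texts are made up for illustration. It assumes MASM-style "name proc" / "name endp" lines, counts pushes and pops linearly without following branches, and does not know about COMMENT blocks or macros.

```python
import re

PROC_RE = re.compile(r"^\s*(\w+)\s+proc\b", re.IGNORECASE)
ENDP_RE = re.compile(r"^\s*(\w+)\s+endp\b", re.IGNORECASE)
# Instructions accepted as the last one before endp (assumption: a jmp
# out of the proc is tolerated, as some people do that on purpose).
TERMINATORS = ("ret", "retn", "retf", "iret", "jmp")


def lint_procs(source):
    """Return (proc_name, message) warnings for MASM-style source text."""
    warnings = []
    name = None   # proc currently being scanned, if any
    depth = 0     # pushes minus pops seen so far in this proc
    last = ""     # mnemonic of the last instruction seen in this proc
    for raw in source.splitlines():
        line = raw.split(";", 1)[0].strip()   # drop ';' comments
        if not line:
            continue
        m = PROC_RE.match(line)
        if m:
            name, depth, last = m.group(1), 0, ""
            continue
        if name is None:
            continue                           # code outside any proc
        if ENDP_RE.match(line):
            if depth != 0:
                warnings.append((name, f"push/pop imbalance of {depth:+d}"))
            if last not in TERMINATORS:
                warnings.append((name, "no ret before endp"))
            name = None
            continue
        tokens = line.split()
        if tokens[0].endswith(":"):            # skip a leading label
            tokens = tokens[1:]
        if not tokens:
            continue
        last = tokens[0].lower()
        if last == "push":
            depth += 1
        elif last == "pop":
            depth -= 1
    return warnings


SAMPLE = """\
cheat proc uses esi edi ebx arg1
  push eax
  ret
cheat endp
nop_proc proc
  push ebx
  pop ebx
nop_proc endp
"""

for proc, message in lint_procs(SAMPLE):
    print(f"{proc}: {message}")
```

Because the count is linear, a proc that pushes on one branch and pops on another would be reported wrongly, so a real tool would need a way to suppress individual warnings.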

Also, modern assembly implies using a lot of user-defined entities, such as procedures, variables, labels and so on. It would be nice to put them under some kind of supervision. Say I want to calculate the variables 'very_importrant_x' and 'very_importrant_y'. As I am quite lazy, I do the first one and then copy-paste it for the second, forgetting to change 'very_importrant_x' to 'very_importrant_y'. A static analysis tool might find that 'very_importrant_y' is declared but never initialized. That is not a bug from the machine's point of view, but it sure is one logically.
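A rough sketch of that check in Python, under loud assumptions: only "name db/dw/dd/dq ?" declarations are tracked, only the first operand of an instruction counts as a write, and writes through pointers, invoke, or macros are not seen at all. The function name never_written is hypothetical.

```python
import re

DECL_RE = re.compile(r"^\s*(\w+)\s+(?:db|dw|dd|dq)\s+\?", re.IGNORECASE)
# Mnemonics whose first operand is only read (deliberately incomplete).
READ_ONLY = {"cmp", "test", "push"}


def never_written(source):
    """Return uninitialized data symbols that no instruction writes to."""
    declared = []
    written = set()
    for raw in source.splitlines():
        line = raw.split(";", 1)[0].strip()   # drop ';' comments
        if not line:
            continue
        m = DECL_RE.match(line)
        if m:
            declared.append(m.group(1))
            continue
        parts = line.split(None, 1)
        if len(parts) < 2 or parts[0].lower() in READ_ONLY:
            continue
        dest = parts[1].split(",", 1)[0]      # text before the first comma
        written.update(re.findall(r"\w+", dest))
    return [name for name in declared if name not in written]


SAMPLE = """\
.data?
very_importrant_x dd ?
very_importrant_y dd ?
.code
start:
    mov eax, 2
    add eax, 3
    mov very_importrant_x, eax
    mov eax, 5
    add eax, 7
    mov very_importrant_x, eax   ; copy-paste slip: should be _y
end start
"""

print(never_written(SAMPLE))
```

The comparison is case-sensitive, which matches sources assembled with option casemap:none; for case-insensitive sources the names would have to be lowercased first.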

Of course this type of analysis would be neither comprehensive nor reliable but, well, none of them are.

jj2007

#7
Quote from: akalenuk on June 05, 2013, 05:00:48 PM
checking push/pop matching, or ensuring ret before endp ... 'very_importrant_y' being instanced but never initialized.

Points taken. Shouldn't be too difficult, actually. Here is a little test, creating a list of procs identified by proc/endp:
line 14152      proc 143        SetParaFormat
line 14185      proc 144        GetParaFormat
line 14194      proc 145        InsDate
line 14217      proc 146        SetFontSizeP
line 14230      proc 147        SetFontFace
line 14249      proc 148        SetTabs
line 14279      proc 149        a  <<<<<<<<< that was a COMMENT @ ... a proc ...@
line 14357      proc 150        SetSelFormat
line 14419      proc 151        ShowWinErr
line 14442      proc 152        UpdateRE
line 14468      proc 153        ClearLocVars
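The false hit for proc 149 comes from text inside a COMMENT block. A minimal Python sketch of a proc lister that skips such blocks follows; it is not the test program used for the list above, and list_procs is a made-up name. It assumes MASM's rule that a COMMENT block starts with "COMMENT" plus a delimiter character and ends with the next line containing that delimiter, that line included.

```python
import re

PROC_RE = re.compile(r"^\s*(\w+)\s+proc\b", re.IGNORECASE)
COMMENT_RE = re.compile(r"^\s*comment\s+(\S)(.*)$", re.IGNORECASE)


def list_procs(source):
    """Return (line_number, proc_name) pairs, ignoring commented-out text."""
    procs = []
    delimiter = None   # set while inside a COMMENT block
    for number, raw in enumerate(source.splitlines(), start=1):
        if delimiter is not None:
            # The line holding the closing delimiter is still comment text.
            if delimiter in raw:
                delimiter = None
            continue
        m = COMMENT_RE.match(raw)
        if m:
            # A delimiter repeated on the same line closes the block at once.
            if m.group(1) not in m.group(2):
                delimiter = m.group(1)
            continue
        m = PROC_RE.match(raw.split(";", 1)[0])   # ignore ';' comments too
        if m:
            procs.append((number, m.group(1)))
    return procs


SAMPLE = """\
SetTabs proc
  ret
SetTabs endp
COMMENT @ once there was
a proc that did nothing
@
SetSelFormat proc
  ret
SetSelFormat endp
"""

for number, name in list_procs(SAMPLE):
    print(f"line {number}\t{name}")
```

Without the COMMENT handling, the line "a proc that did nothing" would be listed as a proc named "a", which is the kind of false hit marked in the list.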


hutch--

There is a certain discipline for most languages: in C, for every opening brace "{" you need a closing brace "}", and if you exercise that discipline, you avoid a problem that can be hard to debug later. Assembler code is a very rabid version of the same problem: write it the right way the first time and test each part to make sure it works correctly. What you don't do is the particularly sloppy form of high-level programming where you slop any old junk out and then try to fix it with toys after it is written. This accounts for a lot of rubbish code that is never very reliable.

With only rare exceptions, for every PROC you write an ENDP preceded by a RET (some folks like to live dangerously and jump out of a proc, but they mess up the CALL/RET pairing). You can check whether you have balanced the stack by testing ESP before and after a procedure call, but the simple discipline is to write matching push/pop sequences BEFORE you write the code between them.

JJ is correct in that if you mess up assembler code it goes BANG which is still the best way to find out if you have done something wrong.

jj2007

Quote from: hutch-- on June 05, 2013, 07:08:32 PM
JJ is correct in that if you mess up assembler code it goes BANG which is still the best way to find out if you have done something wrong.

The BANG can come a bit delayed, though:

include \masm32\include\masm32rt.inc

.code
cheat proc uses esi edi ebx arg1
; LOCAL abc - no locals, please
  push eax
  ret
cheat endp

start:   mov esi, 11111111
   mov edi, 22222222
   mov ebx, 33333333
   invoke cheat, 123
   print str$(esi), 9, "esi", 13, 10
   print str$(edi), 9, "edi", 13, 10
   print str$(ebx), 9, "ebx", 13, 10
   exit
end start


No BANG. But try to build & run it with JWasm ;)

anta40

Quote from: jj2007 on June 05, 2013, 09:03:17 PM
No BANG. But try to build & run it with JWasm ;)

ML 11:

22222222        esi
33333333        edi
1990847834      ebx


jwasm 2.10:
BANG!

In this case, a tool that could hint at why such a crash occurs would be nice  :biggrin:

Antariy

JWasm obviously does not create an ebp-based stack frame if there are no locals and none of the formal procedure parameters are used in the proc's code (I did not check this in practice, though). No ebp-based frame => no esp restoration at "ret" => the return goes to wherever eax points. MASM creates the frame regardless of whether the params are used, so esp is restored at the moment of return, but the buggy side effect of the imbalanced stack hits any pops, in this case the pops of the USES statement. With "uses esi edi ebx" the prologue is push ebp \ mov ebp, esp \ push esi \ push edi \ push ebx; the push eax then imbalances the stack; and the epilogue is pop ebx (ebx trashed with the eax value) \ pop edi (edi trashed with the ebx value) \ pop esi (esi trashed with the edi value) \ leave (the same as mov esp, ebp \ pop ebp) \ ret (x).
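The register shuffle described above can be replayed with a small Python simulation. This is only a sketch: it models the stack as a list and assumes the MASM-style prologue and LEAVE epilogue exactly as listed in this post. The ebp and return-address defaults are arbitrary illustrative values, and run_cheat is a hypothetical name.

```python
def run_cheat(eax, esi, edi, ebx, ebp=0x12FF00, ret_addr=0x401000):
    """Replay the code generated for 'cheat' with a LEAVE epilogue."""
    regs = {"eax": eax, "esi": esi, "edi": edi, "ebx": ebx, "ebp": ebp}
    stack = [123, ret_addr]   # invoke pushes the argument, call the return address

    def push(reg):
        stack.append(regs[reg])

    def pop(reg):
        regs[reg] = stack.pop()

    # Prologue for "proc uses esi edi ebx arg1"
    push("ebp")
    frame = len(stack)        # stands in for "mov ebp, esp"
    push("esi")
    push("edi")
    push("ebx")

    push("eax")               # the unbalanced push in the proc body

    # Epilogue
    pop("ebx")                # receives the pushed eax
    pop("edi")                # receives the saved ebx
    pop("esi")                # receives the saved edi
    del stack[frame:]         # leave, part 1: mov esp, ebp
    pop("ebp")                # leave, part 2: pop ebp
    returned_to = stack.pop() # ret pops the return address
    return regs, returned_to


regs, returned_to = run_cheat(eax=1990847834, esi=11111111,
                              edi=22222222, ebx=33333333)
print(regs["esi"], regs["edi"], regs["ebx"], hex(returned_to))
```

With the register values from the test program and the eax value taken from the ML 11 output, the simulated esi, edi and ebx come out as 22222222, 33333333 and 1990847834, the same pattern ML 11 printed, while the return address is still intact because LEAVE discards the leftover stack slot.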

Good example, Jochen, thanks :t Interesting detail about the MASM<->JWasm differences (if JWasm really does not create a frame when no params / locals are used; if so, this is actually a better optimization than MASM does).

jj2007

Hi Alex,

Yes, that's the point, thanks for the detailed explanation :t

In the meantime, I removed the test in reply #7 and instead posted a new one here.

hutch--

Must be a difference in brain: if I want a stack frame I write it that way, and if I don't I also write it that way. When I hear "optimisation" mentioned with an assembler I think "YUK"; if I wanted that I would be using a compiler.

MASM is at least consistent here, if you use PROC / ENDP you get a stack frame, if you use the OPTION notation you turn the stack frame off.

japheth


I had to read a few misconceptions here concerning the behavior of some assemblers that I'll try to correct:


  • both masm and jwasm won't generate a stack frame if a proc has neither parameters nor locals. You can force the assemblers to generate a frame anyway by the user parameter <forceframe>
  • both masm and jwasm will generate a stack frame if either parameters or locals are detected
  • jwasm will, on certain occasions, generate a simple "POP EBP" instead of "LEAVE" as epilogue code. This is not to reduce "cycle counts" or save space but to make debugging easier - it's usually easier to fix a crash than to have some register contents being silently destroyed. If you don't want this feature, it can be turned off by option -Zg
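To illustrate the last point, here is a small Python sketch comparing what "ret" finds on the stack after a LEAVE epilogue and after a plain POP EBP epilogue. It assumes the frame layout of a "proc uses esi edi ebx arg1" prologue (push ebp, then the three USES registers); whether JWasm actually picks POP EBP for a given proc is not something this sketch can tell. The function name return_target and the slot labels are illustrative only.

```python
def return_target(epilogue, extra_pushes=1):
    """Return the stack slot that 'ret' pops as its return address."""
    # Stack right after the prologue, bottom first.
    stack = ["arg1", "return address", "saved ebp",
             "saved esi", "saved edi", "saved ebx"]
    frame = 3                            # stack depth when "mov ebp, esp" ran
    stack += ["junk"] * extra_pushes     # unbalanced pushes in the proc body
    for _ in ("ebx", "edi", "esi"):      # pops generated for USES
        stack.pop()
    if epilogue == "leave":
        del stack[frame:]                # mov esp, ebp discards leftovers
    elif epilogue != "pop ebp":
        raise ValueError(f"unknown epilogue: {epilogue}")
    stack.pop()                          # pop ebp
    return stack.pop()                   # ret


for style in ("leave", "pop ebp"):
    print(f"{style}: ret jumps to the value in slot '{return_target(style)}'")
```

With one stray push, the LEAVE variant still returns to the caller with silently shuffled registers, while the POP EBP variant tries to return to the saved ebp value and crashes right away, which is the debugging advantage described above.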

Hope this helps, boys!  :bgrin: