Mangled names

Started by Relvinian, February 25, 2017, 03:42:58 AM


Relvinian

Hey All,

While updating my library, I am putting sections of it into C++ namespaces.  Since I only have Visual Studio 2015 installed, I only know how VC++ creates the mangled names.  So, my question is this:

Do all compilers generate the same mangled name for a function?

I have the following example in a .h file:

// Generic defines for IN, OUT and OPTIONAL parameters. These do-nothing defines
// are an aid to developers so they can quickly tell how the parameters are used.
#if !defined(IN)
#define IN
#endif

#if !defined(OUT)
#define OUT
#endif

#if !defined(OPTIONAL)
#define OPTIONAL
#endif


namespace ASMRTL {
namespace Strings {
#if defined(_UNICODE) || defined(UNICODE)
   wchar_t* __stdcall copy(OUT wchar_t* out, IN const wchar_t* in);
   size_t   __stdcall length(IN const wchar_t* in);
#else
   char*    __stdcall copy(OUT char* out, IN const char* in);
   size_t   __stdcall length(IN const char* in);
#endif
}
}


I am working on my ASM libraries and want to put them into a namespace (using the mangled-name convention) to help avoid conflicts with system calls and other libraries (like MASM32 & MASM64, GoAsm, etc.). So, if different compilers generate different mangled names, I will have to figure something else out.

Anyway, here is what VC++ 2015 generates for the above:

// ANSI mangled names
?copy@Strings@ASMRTL@@YGPADPADPBD@Z
?length@Strings@ASMRTL@@YGIPBD@Z

// UNICODE mangled names
?copy@Strings@ASMRTL@@YGPA_WPA_WPB_W@Z
?length@Strings@ASMRTL@@YGIPB_W@Z
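For readers who want to see how these names are put together, here is a minimal sketch, assuming only the simple Microsoft layout shown above: a leading '?', the function name, the enclosing scopes from innermost to outermost separated by '@', a closing "@@", then 'Y' for a free function followed by a calling-convention letter ('A' for __cdecl, 'G' for __stdcall, 'I' for __fastcall). The helper names qualified_name and calling_convention are made up for this illustration and are not part of any SDK. Templates, operators and back-references are not handled; the sketch returns an empty string for them.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: turn "?copy@Strings@ASMRTL@@YGPADPADPBD@Z" into
// "ASMRTL::Strings::copy". Only plain names are handled; anything that looks
// like a template ("?$"), a special name ("??") or a numeric back-reference
// yields an empty string.
std::string qualified_name(const std::string& decorated)
{
    if (decorated.size() < 2 || decorated[0] != '?')
        return {};
    const std::size_t end = decorated.find("@@");   // "@@" closes the scope list
    if (end == std::string::npos)
        return {};

    std::vector<std::string> parts;
    std::string current;
    for (std::size_t i = 1; i < end; ++i) {
        const char c = decorated[i];
        if (c == '?' || c == '$')
            return {};                               // template or special name
        if (current.empty() && c >= '0' && c <= '9')
            return {};                               // back-reference
        if (c == '@') {
            if (current.empty())
                return {};
            parts.push_back(current);
            current.clear();
        } else {
            current += c;
        }
    }
    if (current.empty())
        return {};
    parts.push_back(current);
    assert(!parts.empty());

    // Scopes are stored innermost first, so join them in reverse order.
    std::string result;
    for (auto it = parts.rbegin(); it != parts.rend(); ++it) {
        if (!result.empty())
            result += "::";
        result += *it;
    }
    return result;
}

// Hypothetical helper: read the calling-convention letter of a free function.
std::string calling_convention(const std::string& decorated)
{
    const std::size_t end = decorated.find("@@");
    if (end == std::string::npos || end + 3 >= decorated.size() ||
        decorated[end + 2] != 'Y')
        return {};
    switch (decorated[end + 3]) {
        case 'A': return "__cdecl";
        case 'G': return "__stdcall";
        case 'I': return "__fastcall";
        default:  return {};
    }
}
```

Passing the ANSI copy name above to qualified_name should give back ASMRTL::Strings::copy, and calling_convention should report __stdcall for all four names.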


And then in ASM code, I have defines that will simplify the name as follows to make calling a little easier.

ASMRTL_Strings_copy      equ <?copy@Strings@ASMRTL@@YGPADPADPBD@Z>
ASMRTL_Strings_length    equ <?length@Strings@ASMRTL@@YGIPBD@Z>


Could someone check a different compiler and see what they get?

Thanks, Relvinian

Vortex

Hi Relvinian,

Agner Fog wrote a very good manual about calling conventions including mangling schemes :

Quote: Calling conventions for different C++ compilers and operating systems

    This document contains details about data representation, function calling conventions, register usage conventions, name mangling schemes, etc. for many different C++ compilers and operating systems. Discusses compatibilities and incompatibilities between different C++ compilers. Includes information that is not covered by the official Application Binary Interface standards (ABI's). The information provided here is based on my own research and therefore descriptive rather than normative. Intended as a source of reference for programmers who want to make function libraries compatible with multiple compilers or operating systems and for makers of compilers and other development tools who want their tools to be compatible with existing tools.
     
    File name: calling_conventions.pdf, size: 1025214, last modified: 2016-Nov-26.

http://agner.org/optimize/

On the same page, you will find the Object file converter capable of modifying symbols in object modules :

Quote: Object file converter

This utility can be used for converting object files between COFF/PE, OMF, ELF and Mach-O formats for all 32-bit and 64-bit x86 platforms. Can modify symbol names in object files. Can build, modify and convert function libraries across platforms. Can dump object files and executable files. Also includes a very good disassembler supporting the SSE4, AVX, AVX2, AVX512, FMA3, FMA4, XOP and Knights Corner instruction sets. Source code included (GPL). Manual.

File name: objconv.zip, size: 1017681, last modified: 2016-Nov-27.

Relvinian

Thanks, Vortex.

Looks like fun reading over the next few days and then to rethink my "design".   :dazzled:

Relvinian

hutch--

Tell me this, what is the problem with a normal C/C++ compiler in simply prototyping an external assembler procedure with the normal "extern" notation ?

Relvinian

Hutch,

For normal C++ programs, simple C-style declarations and externs go against why C++ was created.  For example, think of the standard library and writing to the console: everything is wrapped in a namespace, like std::cout.  So, to use 'cout', you either type it like I did or put the following at the top of your file:  using namespace std;    Of course, you still need to include the header file, <iostream>.

Now, for straight C, since there was no such thing as classes, encapsulation, and virtual tables, it was not a problem.  It is just that the single global namespace of all externs could get very crowded, and you had to be careful because you could easily conflict with an existing function.

That is why C++ relies on namespaces to help give the developer more freedom with declaring function names.  That is why compilers also prepare a "mangled" name for those functions using namespaces.

A final note.  It is much easier to understand that a function named 'copy' resides in a particular namespace or object than to work with what old-style C and, of course, ASM use. Of course, if a person never programs in more than a single language, not much worry is needed.  :biggrin:

Perry

hutch--

That's a shame, it sounds like a compatibility issue and from memory every compiler has its own name mangling scheme. I wonder if there is at least a convention for each specific compiler in terms of its name mangling rules ? Microsoft still supply MASM in both 32 and 64 bit and particularly in 64 bit there is much that can be done with the extra registers. I gather that some form of extern "C" works, which will be specific to each compiler, but from what you have said it does not look like they offer anything much better.
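On the extern "C" route, there is a simple convention for 32-bit Microsoft builds: an extern "C" __stdcall function is emitted as an underscore, the plain name, an '@', and the number of bytes of arguments. The sketch below just builds that string so the rule is easy to see. The helper name stdcall_c_name and the function name ASMRTL_Strings_copy used in the comment are hypothetical, and the rule is assumed to apply to 32-bit x86 only (64-bit builds use the plain name).

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: the linker-level name that 32-bit MSVC gives a function
// declared as, for example,
//     extern "C" char* __stdcall ASMRTL_Strings_copy(char* out, const char* in);
// The rule is '_' + name + '@' + total bytes of arguments.
std::string stdcall_c_name(const std::string& name, std::size_t arg_bytes)
{
    // Every argument occupies whole 4-byte stack slots on 32-bit x86.
    assert(arg_bytes % 4 == 0);
    return "_" + name + "@" + std::to_string(arg_bytes);
}
```

Two pointer arguments make 8 bytes, so the declaration in the comment would be seen by the assembler as _ASMRTL_Strings_copy@8, which is the same form MASM produces for a two-DWORD stdcall PROC.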

Has Microsoft published anything useful in interfacing MASM modules into a C++ application ?

jj2007

Quote from: Relvinian on February 25, 2017, 01:32:56 PM
That is why C++ relies on namespaces to help give the developer more freedom with declaring function names.  That is why compilers also prepare a "mangled" name for those functions using namespaces.

Sometimes they exaggerate a little bit; try GetProcAddress with this from wet.dll:?AddToSelectionXml@CSelectionHelper@Mig@@IAEXPAV?$CTreeNode@VCSelectionTreeData@Mig@@@2@PAVString@UnBCL@@PAV?$ArrayList@PAV?$DictionaryEntry@PAVString@UnBCL@@PAV?$Hashtable@PAVString@UnBCL@@H@2@@UnBCL@@@5@PAV?$Hashtable@PAVString@UnBCL@@PAV?$ArrayList@PAV?$DictionaryEntry@PAVString@UnBCL@@H@UnBCL@@@2@@5@1HPAV?$Hashtable@PAVString@UnBCL@@PAV?$DictionaryEntry@PAV?$ArrayList@PAVString@UnBCL@@@UnBCL@@PAV?$ArrayList@PAV?$DictionaryEntry@PAVString@UnBCL@@H@UnBCL@@@2@@2@@5@@Z

Note that only about 2% of all DLLs in Sys32 and subdirectories use name mangling. It may look common to C++ compilers, but the great majority of developers prefer different methods to avoid namespace conflicts. For example, you will probably never see a conflict with gsl_printwhatever. Nobody except the developers of the GSL library would randomly choose a gsl_ prefix for their functions. A bit more typing, but nothing compared to the 473 chars of the mangled gibberish in wet.dll 8)
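The prefix approach and the namespace approach can be combined. The following is a rough sketch under stated assumptions, not a finished design: the prefixed names asmrtl_str_copy and asmrtl_str_length are invented for the example, and their C++ bodies are stand-ins for routines that would really live in the assembly module. __stdcall is left out so the sketch builds with any compiler.

```cpp
#include <cassert>
#include <cstddef>

// Prefixed, unmangled entry points (hypothetical names). In a real build these
// would only be declared here and implemented in the ASM library; the bodies
// below are stand-ins so the sketch links on its own.
extern "C" std::size_t asmrtl_str_length(const char* in)
{
    assert(in != nullptr);
    std::size_t n = 0;
    while (in[n] != '\0')
        ++n;
    return n;
}

extern "C" char* asmrtl_str_copy(char* out, const char* in)
{
    assert(out != nullptr && in != nullptr);
    char* p = out;
    while ((*p++ = *in++) != '\0') {
        // copy up to and including the terminating zero
    }
    return out;
}

// Thin inline wrappers give C++ callers the namespace syntax, while the
// symbols the linker sees stay the same for every compiler.
namespace ASMRTL {
namespace Strings {
    inline char* copy(char* out, const char* in) { return asmrtl_str_copy(out, in); }
    inline std::size_t length(const char* in) { return asmrtl_str_length(in); }
}
}
```

C++ code calls ASMRTL::Strings::copy(buf, "text") as before, while assembly code and other languages reference the prefixed name directly, so no mangling scheme is involved.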

Vortex

Calling C++ decorated functions from Masm is more tricky. Here is an example :

#include <windows.h>

int __stdcall mbox(char *msg,char *title)
{
    return(MessageBox(0,msg,title,MB_OK));
}


Compiling with Microsoft Visual Studio 2010 Express :

cl /c /Zl /Fa mbox.cpp

Examining the assembly module, the decoration of the mbox function :

PUBLIC ?mbox@@YGHPAD0@Z

Format of a C++ Decorated Name :

https://msdn.microsoft.com/en-us/library/2ax8kbk1.aspx

The difficulty when calling such a function from Masm is that the assembler will add its own decoration to the name. With some macro tricks, it's possible to avoid the extra decoration.


include     ..\CallCppFunc.inc

EXTERN SYSCALL ?mbox@@YGHPAD0@Z:PROC

mbox EQU <pr2 PTR ?mbox@@YGHPAD0@Z>

.data

msg     db 'Calling C++ decorated function',0
capt    db 'Hello',0

.code

start:

    invoke  mbox,ADDR msg,ADDR capt

    invoke  ExitProcess,0

END start


The pr2 macro declaring a function taking two parameters is defined in windows.inc as follows :

ArgCount MACRO number
      LOCAL txt
      txt equ <typedef PROTO :DWORD>
        REPEAT number - 1
          txt CATSTR txt,<,:DWORD>
        ENDM
      EXITM <txt>
    ENDM
.
.
    pr2  ArgCount(2)
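
To make the expansion easier to follow, here is the same text-building loop written as a small C++ sketch. The function name arg_count_text is made up for the illustration; it only mirrors what the ArgCount macro above concatenates and is not part of windows.inc.

```cpp
#include <cassert>
#include <string>

// Hypothetical mirror of the ArgCount macro: start with one :DWORD parameter
// and append ",:DWORD" for each further parameter (number - 1 times).
std::string arg_count_text(unsigned number)
{
    assert(number >= 1);
    std::string txt = "typedef PROTO :DWORD";
    for (unsigned i = 1; i < number; ++i)
        txt += ",:DWORD";
    return txt;
}
```

For number = 2 the loop runs once, so the line pr2 ArgCount(2) becomes pr2 typedef PROTO :DWORD,:DWORD.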


Normally, the invoke macro will balance the stack after calling a SYSCALL function :


?mbox@@YGHPAD0@Z PROTO SYSCALL :DWORD,:DWORD

mbox EQU <?mbox@@YGHPAD0@Z>
.
.
invoke  mbox,ADDR msg,ADDR capt


Disassembling the object module :

push    offset capt
push    offset msg
call    ?mbox@@YGHPAD0@Z
add     esp, 8


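Naturally, there is no need to balance the stack here, as the mbox function is declared as __stdcall and removes its own arguments on return, so the add esp, 8 emitted for the SYSCALL prototype is unwanted.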

Another method is to use a custom invoke macro :

include     ..\CallCppFunc.inc
include     invoke.inc

EXTERN SYSCALL ?mbox@@YGHPAD0@Z:PROC

mbox EQU <?mbox@@YGHPAD0@Z>

.data

msg     db 'Calling C++ decorated function',0
capt    db 'Hello',0

.code

start:

   _invoke  mbox,ADDR msg,ADDR capt

    invoke  ExitProcess,0

END start


The _invoke macro does not require any function prototype. Declaring the external function with EXTERN before calling it is sufficient.