Mysteries of C++ revealed

Started by jj2007, December 14, 2013, 08:16:10 PM

jj2007

After reading twice through a convoluted essay titled "C++ References" which seems to be the top Google hit for C++ by reference by pointer, I still had not the faintest idea what the guy was talking about, so I decided to look under the hood...

Enjoy ;)

#include <stdio.h> //if is found in the first 1k, F5 will build all as C code

void    TheFunc(int ThePassedNumber) {
  ThePassedNumber = 0x11111111; // manipulates the stack content, i.e. [the arg]
}

void    TheFunc_ptr(int *ThePassedNumber) {
  *ThePassedNumber = 0x22222222; // gets a pointer and writes there
}

void    TheFunc_ref(int &ThePassedNumber) {
  ThePassedNumber = 0x33333333; // gets a pointer and writes there (identical to TheFunc_ptr)
}

int  main(void) {
    //__asm int 3; // for Olly
    int  TheNumber;
    int& TheRef = TheNumber;
/*
00401079                │·  8D45 F8             lea eax, [ebp-8]
0040107C                │.  8945 F0             mov [ebp-10], eax  ; TheRef: save ptr TheNumber
*/
    TheNumber = 0x12345678; // writes to local memory
/*
0040107F                │·  C745 F8 78563412    mov dword ptr [ebp-8], 12345678
*/
    printf("Original number=\t%xh\n", TheNumber);

    //__asm int 3;
    TheFunc(TheNumber);
/*
004010A2                │.  8B55 F8             mov edx, [ebp-8]
004010A5                │·  52                  push edx
004010A6                │·  E8 6EFFFFFF         call 00401019
...
TheFunc                 ┌$  55                  push ebp      ; CppConsoleJJ.TheFunc(ThePassedNumber)
00401031                │.  8BEC                mov ebp, esp
00401033                │.  C745 08 11111111    mov dword ptr [ebp+8], 11111111  ; WRITE TO STACK
0040103A                │.  5D                  pop ebp
0040103B                └.  C3                  retn
*/
    printf("My number func'ed=\t%xh (i.e. not changed)\n", TheNumber);

    TheFunc_ptr(&TheNumber);
/*
004010C9                │.  8D4D F8             lea ecx, [ebp-8]   ; ptr TheNumber
004010CC                │·  51                  push ecx
004010CD                │·  E8 4CFFFFFF         call 0040101E
...
TheFunc_ptr             ┌$  55                  push ebp           ; CppConsoleJJ.TheFunc_ptr(ThePassedNumber)
00401041                │.  8BEC                mov ebp, esp
00401043                │.  8B45 08             mov eax, [ebp+8]
00401046                │.  C700 22222222       mov dword ptr [eax], 22222222
0040104C                │.  5D                  pop ebp
0040104D                └.  C3                  retn
*/
    printf("My number ptr'ed=\t%xh\n", TheNumber);

    TheFunc_ref(TheRef);
/*
004010F0                │.  8B45 F0             mov eax, [ebp-10]  ; value of TheRef
004010F3                │·  50                  push eax
004010F4                │·  E8 2AFFFFFF         call 00401023
...
TheFunc_ref             ┌$  55                  push ebp           ; CppConsoleJJ.TheFunc_ref(ThePassedNumber)
00401051                │.  8BEC                mov ebp, esp
00401053                │.  8B45 08             mov eax, [ebp+8]
00401056                │.  C700 33333333       mov dword ptr [eax], 33333333
0040105C                │.  5D                  pop ebp
0040105D                └.  C3                  retn
*/
    printf("My real number=  \t%xh\n", TheNumber);
    printf("My ref'ed number=\t%xh\n", TheRef);
    return (0);
}


dedndave

looking at C++ stuff is bad for mental health   :lol:

as far as asm goes, i understand passing an address of SomeThing, rather than passing the value of SomeThing

TWell

C++ supports C, so the first function compiles as C, while the second compiles only as C++
...
void TheFunc_ptr(int *ThePassedNumber) {
*ThePassedNumber = 0x22222222; // gets a pointer and writes there
}
#ifdef __cplusplus
void TheFunc_ref(int &ThePassedNumber) {
ThePassedNumber = 0x33333333; // gets a pointer and writes there (identical to TheFunc_ptr)
}
#endif
...

jj2007

Here is another snippet with mysterious output ;-)

#include <stdio.h>
#include <malloc.h>

int  main(void) {
   printf("%s\n\n", "References are cute!");
   int* MyArray;
   MyArray = (int*) malloc (100*sizeof(int));
   int& ElementFive = MyArray[5];
   MyArray[5]=1005;
   printf("MyArray[5]=%i, ElementFive=%i\n", MyArray[5], ElementFive);

   MyArray[5]=2*MyArray[5];
   printf("MyArray[5]=%i, ElementFive=%i\n", MyArray[5], ElementFive);

   MyArray = (int*) realloc (MyArray,100*sizeof(int));
   MyArray[5]=2*MyArray[5];
   printf("MyArray[5]=%i, ElementFive=%i\n", MyArray[5], ElementFive);

   MyArray = (int*) realloc (MyArray,1000*sizeof(int));
   MyArray[5]=2*MyArray[5];
   ElementFive=2*ElementFive;
   printf("MyArray[5]=%i, ElementFive=%i <<<<<<< great, isn't it?\n", MyArray[5], ElementFive);
   printf("%s", "\nHit Enter to end this disaster");
   getchar();
   return (0);
}

Output:
References are cute!

MyArray[5]=1005, ElementFive=1005
MyArray[5]=2010, ElementFive=2010
MyArray[5]=4020, ElementFive=4020
MyArray[5]=8040, ElementFive=-35783204 <<<<<<< great, isn't it?


Of course, seasoned assembler programmers might have an idea what happens under the hood ;)

TWell

With msvcpp 2010 sp1:
References are cute!

MyArray[5]=1005, ElementFive=1005
MyArray[5]=2010, ElementFive=2010
MyArray[5]=4020, ElementFive=4020
MyArray[5]=8040, ElementFive=8040 <<<<<<< great, isn't it?

Hit Enter to end this disaster

qWord

The reference becomes invalid when realloc() moves the memory block ... what a mystery  ::)
MREAL macros - when you need floating point arithmetic while assembling!
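
What qWord describes can be observed directly. The following is a minimal sketch, not code from the thread: the names ReallocReport and realloc_and_report are made up for illustration. It stores the address of element 5 as an integer before the realloc() call and compares it with the address afterwards. Whether the block actually moves depends on the allocator and on the sizes involved, so the sketch only reports it.

```cpp
#include <cstdint>
#include <cstdlib>

// Hypothetical result type for this illustration.
struct ReallocReport {
    bool ok;      // both allocations succeeded
    bool moved;   // element 5 lives at a different address after realloc
    int  value;   // element 5, read through a freshly derived pointer
};

ReallocReport realloc_and_report() {
    ReallocReport report = {false, false, 0};

    int* arr = static_cast<int*>(std::malloc(100 * sizeof(int)));
    if (arr == nullptr) return report;
    arr[5] = 1005;

    // Keep the old address as a plain integer, so that no stale pointer
    // is dereferenced or compared once the old block may have been freed.
    const std::uintptr_t old_addr = reinterpret_cast<std::uintptr_t>(&arr[5]);

    int* bigger = static_cast<int*>(std::realloc(arr, 1000 * sizeof(int)));
    if (bigger == nullptr) {      // on failure the old block is still ours
        std::free(arr);
        return report;
    }

    report.ok    = true;
    report.moved = reinterpret_cast<std::uintptr_t>(&bigger[5]) != old_addr;
    report.value = bigger[5];     // realloc preserves the contents
    std::free(bigger);
    return report;
}
```

A reference such as ElementFive holds the equivalent of old_addr. If moved comes back true, every later access through that reference touches freed memory, which is undefined behaviour even when the printed value happens to look right.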

Manos

References are dangerous and become invalid when the object no longer exists.
I use pointers.
When I free memory, I set the pointer to NULL.
That way, before using a pointer, I first check whether it is NULL.
Sometimes HeapReAlloc fails, but the original pointer is still valid.
I had problems using HeapReAlloc in the past.
If HeapReAlloc fails, I allocate a new memory block and copy the old memory into it.

Manos.
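
The habits Manos lists could be sketched roughly as follows. This is an illustration under assumptions, not his code: it uses the portable std::realloc instead of the Win32 HeapReAlloc he mentions, and the helper names resize_ints, release_ints and double_element_five are invented here.

```cpp
#include <cstddef>
#include <cstdlib>

// Hypothetical helper: resizes a malloc'ed int array.
// Returns false and leaves 'block' untouched when realloc fails.
bool resize_ints(int*& block, std::size_t new_count) {
    void* resized = std::realloc(block, new_count * sizeof(int));
    if (resized == nullptr) return false;   // the old block is still valid
    block = static_cast<int*>(resized);     // may or may not be a new address
    return true;
}

// Hypothetical helper: frees the block and nulls the pointer,
// so that a later NULL check can catch accidental reuse.
void release_ints(int*& block) {
    std::free(block);
    block = nullptr;
}

int double_element_five() {
    int* arr = nullptr;
    if (!resize_ints(arr, 100)) return -1;  // realloc(nullptr, n) acts like malloc
    arr[5] = 1005;

    if (!resize_ints(arr, 1000)) {          // failure: arr still owns the old block
        release_ints(arr);
        return -1;
    }

    int* element_five = &arr[5];            // derived only after the last resize
    *element_five *= 2;

    const int result = arr[5];
    release_ints(arr);
    return (arr == nullptr) ? result : -1;
}
```

The temporary 'resized' is the important detail: assigning the result of realloc straight back to the only copy of the pointer would leak the old block whenever the call fails.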

jj2007

Quote from: TWell on December 16, 2013, 06:27:40 AM
MyArray[5]=8040, ElementFive=8040 <<<<<<< great, isn't it?

Yes, it gives you the correct value if you run it with Ctrl F5 (WinXP SP3); but it chokes with F5. Same if you run it from the console, in normal mode (works) or through Olly (chokes).

Quote from: qWord on December 16, 2013, 08:01:10 AM
The reference becomes invalid when realloc() moves the memory block ... what a mystery  ::)

The explanation is simple, of course. But all that is in sharp contrast to the enthusiastic article quoted above. As Manos rightly writes, references are apparently dangerous nonsense. If they are invalidated by something as trivial as HeapReAlloc, what is their added value for programming, compared to a pointer? Not to mention the bug potential, given that most of the time HeapReAlloc leaves the memory where it is...

A propos bug potential: This morning I rebooted my machine. And, voilà, Adobe claims that this time, really, honestly, no kidding, the latest security holes of Flash have been fixed. Well, this week's security holes, of course. I am sure Flash is not programmed in assembler because assembler is a fairly safe programming language.

P.S.: A goodie from the code above:

MyArray[5]=2*MyArray[5];

004011BB                │.  8B55 F4           mov edx, [local.3]
004011BE                │.  8B42 14           mov eax, [edx+14]
004011C1                │.  D1E0              shl eax, 1  <<<<<< a clever compiler!
004011C3                │.  8B4D F4           mov ecx, [local.3]
004011C6                │.  8941 14           mov [ecx+14], eax


For comparison: D162 14 shl dword ptr [edx+14], 1

qWord

Quote from: jj2007 on December 16, 2013, 09:11:12 AM
But all that is in sharp contrast to the enthusiastic article quoted above.

No, it isn't: the author explicitly warns about invalid references.

Quote from: jj2007 on December 16, 2013, 09:11:12 AM
As Manos rightly writes, references are apparently dangerous nonsense.

All that has been said also applies to pointers! Just replace the reference with a pointer and see what happens.

Quote from: jj2007 on December 16, 2013, 09:11:12 AM
what is their added value for programming, compared to a pointer?

I would call it "extended" syntactic sugar. However, they are not foolproof, and the programmer is still responsible for correct usage (as with pointers).

jj2007

Quote from: qWord on December 16, 2013, 10:04:34 AM
the author explicitly warns about invalid references

He should get a prize for the foggiest para of the year - and not a word on HeapReAlloc, the only relevant case ;-)

Quote
References and Dynamically Allocated Memory
Finally, beware of references to dynamically allocated memory. One problem is that when you use references, it's not clear that the memory backing the reference needs to be deallocated--it usually doesn't, after all. This can be fine when you're passing data into a function since the function would generally not be responsible for deallocating the memory anyway.

On the other hand, if you return a reference to dynamically allocated memory, then you're asking for trouble since it won't be clear that there is something that needs to be cleaned up by the function caller.

Quote from: qWord on December 16, 2013, 10:04:34 AM
Quote from: jj2007 on December 16, 2013, 09:11:12 AM
As Manos rightly writes, references are apparently dangerous nonsense.
All that has been said also applies to pointers! Just replace the reference with a pointer and see what happens.

What happens is that the pointer is updated, while the reference is not:

Passed as ptr: 8040
Passed as ref: -35783204


void    TheFunc_ptr(int *ThePassedNumber) {printf("Passed as ptr: %i\n", *ThePassedNumber);}
void    TheFunc_ref(int &ThePassedNumber) {printf("Passed as ref: %i\n", ThePassedNumber);}
...
TheFunc_ptr(&MyArray[5]);
TheFunc_ref(ElementFive);


Again, the wrong result happens only for the F5 (debug) run; for Ctrl F5, the output looks correct, which I find slightly ... mysterious :P

qWord

Quote from: jj2007 on December 16, 2013, 10:39:34 AM
Quote from: qWord on December 16, 2013, 10:04:34 AM
the author explicitly warns about invalid references

He should get a prize for the foggiest para of the year - and not a word on HeapReAlloc, the only relevant case ;-)

For a good reason: the C++ standard doesn't know HeapReAlloc, and realloc() exists for compatibility with C!
What should we think now? That C++ is bad because you picked out a not-that-detailed article? Or because you are not aware of it?

jj2007

I think the lesson is simply that references are much trickier than they seem. Pointers are more reliable - see my edit above, posted just a second before your last post.

So realloc is old-fashioned. Would the C++ compiler update the reference to element 5 if you insert an element at position 2 into a vector of integers?

qWord

Quote from: jj2007 on December 16, 2013, 10:58:09 AM
I think the lesson is simply that references are much trickier than they seem.

It's not that hard: when using them, the syntax is like working with the object they refer to, but they have the lifetime of a const pointer (T *const).
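
The comparison with T *const can be made concrete with a small sketch. The function name reseat_demo is invented for this illustration; it is not code from the thread.

```cpp
// Hypothetical demo function, not from the thread.
int reseat_demo() {
    int a = 1;
    int b = 2;

    int&       ref = a;    // bound to 'a' once, at initialization
    int* const ptr = &a;   // const pointer: the stored address cannot change

    ref = b;               // NOT a rebind: copies the value of b into a (a == 2)
    // ptr = &b;           // would not compile: ptr itself is const
    *ptr += 10;            // writes through to a (a == 12)

    return a * 100 + b;    // 12 * 100 + 2
}
```

Neither ref nor ptr can be pointed at a different object after initialization, which is why neither of them can follow a block that realloc() has moved.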

Quote from: jj2007 on December 16, 2013, 10:58:09 AM
Pointers are more reliable - see my edit above, posted just a second before your last post.

I have the impression that you didn't realize the problem with your code above: when I said you should replace the reference with a pointer, I was talking about "ElementFive". A reference simply can't be used here because of the reallocation. When using a pointer, you must detect the memory move and update ElementFive (exactly this is not possible with a reference).

Quote from: jj2007 on December 16, 2013, 10:58:09 AM
So realloc is old-fashioned.

It is C-fashioned.

Quote from: jj2007 on December 16, 2013, 10:58:09 AM
Would the C++ compiler update the reference to element 5 if you insert an element at position 2 into a vector of integers?

References are never "updated" - they always refer to the object they were initialized with. Because std::vector is implemented as an array that is reallocated if required, an insertion invalidates all references, pointers and iterators at or after the insertion point, and all of them if the vector reallocates. This is documented by the standard and by online references like www.cplusplus.com.
std::list (commonly a linked list) would fit your requirements, btw.
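
The difference between the two containers can be sketched as follows. The function names vector_rederive and list_keeps_reference are made up for this illustration, and the element values simply mirror the "insert at position 2" question.

```cpp
#include <iterator>
#include <list>
#include <vector>

// std::vector: take the reference AFTER the insert, never before it.
int vector_rederive() {
    std::vector<int> v = {0, 1, 2, 3, 4, 0x555};
    v.insert(v.begin() + 2, 99);   // shifts the old element 5 to index 6
    int& five = v[6];              // fresh reference, taken after the insert
    five *= 2;
    return v[6];
}

// std::list: nodes never move, so a reference taken before the insert stays valid.
int list_keeps_reference() {
    std::list<int> l = {0, 1, 2, 3, 4, 0x555};
    int& five = *std::next(l.begin(), 5);
    l.insert(std::next(l.begin(), 2), 99);
    five *= 2;                     // still refers to the same node
    return *std::next(l.begin(), 6);
}
```

The price of the stable references in std::list is that reaching element n means walking n nodes, whereas std::vector offers constant-time indexing.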

jj2007

Quote from: qWord on December 17, 2013, 03:14:01 AM
Quote from: jj2007 on December 16, 2013, 10:58:09 AM
Would the C++ compiler update the reference to element 5 if you insert an element at position 2 into a vector of integers?
References are never "updated" - they always refer to the object they have been initialized with. Because std::vector is implemented as an array that is reallocated if required, all references, pointers and iterators are invalidated after insertion.

If I put the value 555h into element five, and obtain a reference to this "object", then I would expect that the reference to element five keeps referring to the "object" that contains 555h. Instead, the reference is apparently simply a fixed pointer to a fixed memory address, and gets invalidated when the compiler decides, for whatever reason, to move the whole object elsewhere. So the programmer must make sure that the "object" is never moved in memory...
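
If the goal is a handle that survives the container moving its storage, one option is to remember the container and an index instead of an address, and to resolve the element on every access. The following is only a sketch under that assumption; the class ElementHandle and the function handle_survives_growth are hypothetical names, not something from the thread or from the standard library.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical handle: stores "which vector, which index" rather than an address.
class ElementHandle {
public:
    ElementHandle(std::vector<int>& owner, std::size_t index)
        : owner_(&owner), index_(index) {}

    // Resolved on every call, so it always sees the vector's current storage.
    // at() throws std::out_of_range if the index is no longer valid.
    int& get() const { return owner_->at(index_); }

private:
    std::vector<int>* owner_;
    std::size_t       index_;
};

int handle_survives_growth() {
    std::vector<int> values(100, 0);
    values[5] = 0x555;

    ElementHandle five(values, 5);
    values.resize(100000);   // growth like this normally reallocates the storage

    five.get() *= 2;         // looked up again, so the new storage is used
    return values[5];
}
```

Such a handle follows the slot, not the value: inserting an element at position 2 would still make index 5 refer to a different number, and the handle must not outlive the vector itself.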

qWord

You might simply stick with BASIC  ;)