Author Topic: Mysteries of C++ revealed  (Read 7673 times)

jj2007

  • Member
  • *****
  • Posts: 7558
  • Assembler is fun ;-)
    • MasmBasic
Mysteries of C++ revealed
« on: December 14, 2013, 08:16:10 PM »
After reading twice through a convoluted essay titled "C++ References", which seems to be the top Google hit for "C++ by reference by pointer", I still hadn't the faintest idea what the guy was talking about, so I decided to look under the hood...

Enjoy ;)

Code:
#include <stdio.h> //if is found in the first 1k, F5 will build all as C code

void    TheFunc(int ThePassedNumber) {
  ThePassedNumber = 0x11111111; // manipulates the stack content, i.e. [the arg]
}

void    TheFunc_ptr(int *ThePassedNumber) {
  *ThePassedNumber = 0x22222222; // gets a pointer and writes there
}

void    TheFunc_ref(int &ThePassedNumber) {
  ThePassedNumber = 0x33333333; // gets a pointer and writes there (identical to TheFunc_ptr)
}

int  main(void) {
//__asm int 3; // for Olly
    int  TheNumber;
int& TheRef = TheNumber;
/*
00401079                │?  8D45 F8             lea eax, [ebp-8]
0040107C                │.  8945 F0             mov [ebp-10], eax  ; save a pointer to ptr TheNumber
*/
    TheNumber = 0x12345678; // writes to local memory
/*
0040107F                │?  C745 F8 78563412    mov dword ptr [ebp-8], 12345678
*/
printf("Original number=\t%xh\n", TheNumber);

//__asm int 3;
TheFunc(TheNumber);
/*
004010A2                │.  8B55 F8             mov edx, [ebp-8]
004010A5                │?  52                  push edx
004010A6                │?  E8 6EFFFFFF         call 00401019
...
TheFunc                 ┌$  55                  push ebp      ; CppConsoleJJ.TheFunc(ThePassedNumber)
00401031                │.  8BEC                mov ebp, esp
00401033                │.  C745 08 11111111    mov dword ptr [ebp+8], 11111111  ; WRITE TO STACK
0040103A                │.  5D                  pop ebp
0040103B                └.  C3                  retn
*/
    printf("My number func'ed=\t%xh (i.e. not changed)\n", TheNumber);

TheFunc_ptr(&TheNumber);
/*
004010C9                │.  8D4D F8             lea ecx, [ebp-8]   ; ptr TheNumber
004010CC                │?  51                  push ecx
004010CD                │?  E8 4CFFFFFF         call 0040101E
...
TheFunc_ptr             ┌$  55                  push ebp           ; CppConsoleJJ.TheFunc_ptr(ThePassedNumber)
00401041                │.  8BEC                mov ebp, esp
00401043                │.  8B45 08             mov eax, [ebp+8]
00401046                │.  C700 22222222       mov dword ptr [eax], 22222222
0040104C                │.  5D                  pop ebp
0040104D                └.  C3                  retn
*/
printf("My number ptr'ed=\t%xh\n", TheNumber);

TheFunc_ref(TheRef);
/*
004010F0                │.  8B45 F0             mov eax, [ebp-10]  ; value of TheRef
004010F3                │?  50                  push eax
004010F4                │?  E8 2AFFFFFF         call 00401023
...
TheFunc_ref             ┌$  55                  push ebp           ; CppConsoleJJ.TheFunc_ref(ThePassedNumber)
00401051                │.  8BEC                mov ebp, esp
00401053                │.  8B45 08             mov eax, [ebp+8]
00401056                │.  C700 33333333       mov dword ptr [eax], 33333333
0040105C                │.  5D                  pop ebp
0040105D                └.  C3                  retn
*/
printf("My real number=  \t%xh\n", TheNumber);
printf("My ref'ed number=\t%xh\n", TheRef);
return (0);
}
« Last Edit: December 14, 2013, 09:40:07 PM by jj2007 »

dedndave

  • Member
  • *****
  • Posts: 8734
  • Still using Abacus 2.0
    • DednDave
Re: Mysteries of C++ revealed
« Reply #1 on: December 14, 2013, 08:36:37 PM »
looking at C++ stuff is bad for mental health   :lol:

as far as asm goes, i understand passing an address of SomeThing, rather than passing the value of SomeThing

TWell

  • Member
  • ****
  • Posts: 748
Re: Mysteries of C++ revealed
« Reply #2 on: December 14, 2013, 08:45:22 PM »
C++ supports C, so the first function compiles as C and the second only as C++:
Code:
...
void TheFunc_ptr(int *ThePassedNumber) {
*ThePassedNumber = 0x22222222; // gets a pointer and writes there
}
#ifdef __cplusplus
void TheFunc_ref(int &ThePassedNumber) {
ThePassedNumber = 0x33333333; // gets a pointer and writes there (identical to TheFunc_ptr)
}
#endif
...

jj2007

Re: Mysteries of C++ revealed
« Reply #3 on: December 16, 2013, 05:33:20 AM »
Here is another snippet with mysterious output ;-)

#include <stdio.h>
#include <stdlib.h>  // malloc and realloc are declared here; <malloc.h> is non-standard

int  main(void) {
   printf("%s\n\n", "References are cute!");
   int* MyArray;
   MyArray = (int*) malloc (100*sizeof(int));
   int& ElementFive = MyArray[5];
   MyArray[5]=1005;
   printf("MyArray[5]=%i, ElementFive=%i\n", MyArray[5], ElementFive);

   MyArray[5]=2*MyArray[5];
   printf("MyArray[5]=%i, ElementFive=%i\n", MyArray[5], ElementFive);

   MyArray = (int*) realloc (MyArray,100*sizeof(int));
   MyArray[5]=2*MyArray[5];
   printf("MyArray[5]=%i, ElementFive=%i\n", MyArray[5], ElementFive);

   MyArray = (int*) realloc (MyArray,1000*sizeof(int));
   MyArray[5]=2*MyArray[5];
   ElementFive=2*ElementFive;
   printf("MyArray[5]=%i, ElementFive=%i <<<<<<< great, isn't it?\n", MyArray[5], ElementFive);
   printf("%s", "\nHit Enter to end this disaster");
   getchar();
   return (0);
}

Output:
References are cute!

MyArray[5]=1005, ElementFive=1005
MyArray[5]=2010, ElementFive=2010
MyArray[5]=4020, ElementFive=4020
MyArray[5]=8040, ElementFive=-35783204 <<<<<<< great, isn't it?


Of course, seasoned assembler programmers might have an idea what happens under the hood ;)

TWell

Re: Mysteries of C++ revealed
« Reply #4 on: December 16, 2013, 06:27:40 AM »
With msvcpp 2010 sp1
Code:
References are cute!

MyArray[5]=1005, ElementFive=1005
MyArray[5]=2010, ElementFive=2010
MyArray[5]=4020, ElementFive=4020
MyArray[5]=8040, ElementFive=8040 <<<<<<< great, isn't it?

Hit Enter to end this disaster

qWord

  • Member
  • *****
  • Posts: 1454
  • The base type of a type is the type itself
    • SmplMath macros
Re: Mysteries of C++ revealed
« Reply #5 on: December 16, 2013, 08:01:10 AM »
The reference becomes invalid when realloc() moves the memory block ... what a mystery  ::)
MREAL macros - when you need floating point arithmetic while assembling!

Manos

  • Guest
Re: Mysteries of C++ revealed
« Reply #6 on: December 16, 2013, 08:22:50 AM »
References are dangerous and become invalid when the object no longer exists.
I use pointers.
When I free memory, I set the pointer to NULL.
That way, before using a pointer, I first check whether it is NULL.
Sometimes HeapReAlloc fails, but the original pointer is still valid.
I had problems with HeapReAlloc in the past.
When HeapReAlloc fails, I allocate a new memory block and then copy the old memory into it.

Manos.

jj2007

Re: Mysteries of C++ revealed
« Reply #7 on: December 16, 2013, 09:11:12 AM »
MyArray[5]=8040, ElementFive=8040 <<<<<<< great, isn't it?

Yes, it gives you the correct value if you run it with Ctrl F5 (WinXP SP3); but it chokes with F5. Same if you run it from the console, in normal mode (works) or through Olly (chokes).

The reference becomes invalid when realloc() moves the memory block ... what a mystery  ::)

The explanation is simple, of course. But all that is in sharp contrast to the enthusiastic article quoted above. As Manos rightly writes, references are apparently dangerous nonsense. If they are invalidated by something as trivial as HeapReAlloc, what is their added value for programming, compared to a pointer? Not to mention the bug potential, given that most of the time HeapReAlloc leaves the memory where it is...

A propos bug potential: This morning I rebooted my machine. And, voilà, Adobe claims that this time, really, honestly, no kidding, the latest security holes of Flash have been fixed. Well, this week's security holes, of course. I am sure Flash is not programmed in assembler because assembler is a fairly safe programming language.

P.S.: A goodie from the code above:

MyArray[5]=2*MyArray[5];

004011BB                │.  8B55 F4           mov edx, [local.3]
004011BE                │.  8B42 14           mov eax, [edx+14]
004011C1                │.  D1E0              shl eax, 1  <<<<<< a clever compiler!
004011C3                │.  8B4D F4           mov ecx, [local.3]
004011C6                │.  8941 14           mov [ecx+14], eax


For comparison: D162 14 shl dword ptr [edx+14], 1

qWord

Re: Mysteries of C++ revealed
« Reply #8 on: December 16, 2013, 10:04:34 AM »
But all that is in sharp contrast to the enthusiastic article quoted above.
No, it isn't - the author explicitly warns about invalid references.

As Manos rightly writes, references are apparently dangerous nonsense.
All that has been said also applies to pointers! Just replace the reference with a pointer and see what happens.

what is their added value for programming, compared to a pointer?
I would call it "extended" syntactic sugar. However, they are not foolproof, and the programmer is still responsible for correct usage (as with pointers).

jj2007

Re: Mysteries of C++ revealed
« Reply #9 on: December 16, 2013, 10:39:34 AM »
the author explicitly warns about invalid references

He should get a prize for the foggiest para of the year - and not a word on HeapReAlloc, the only relevant case ;-)

Quote
References and Dynamically Allocated Memory
Finally, beware of references to dynamically allocated memory. One problem is that when you use references, it's not clear that the memory backing the reference needs to be deallocated--it usually doesn't, after all. This can be fine when you're passing data into a function since the function would generally not be responsible for deallocating the memory anyway.

On the other hand, if you return a reference to dynamically allocated memory, then you're asking for trouble since it won't be clear that there is something that needs to be cleaned up by the function caller.

As Manos rightly writes, references are apparently dangerous nonsense.
All that has been said also applies to pointers! Just replace the reference with a pointer and see what happens.

What happens is that the pointer gets updated, but the reference does not:

Passed as ptr: 8040
Passed as ref: -35783204


Code:
void    TheFunc_ptr(int *ThePassedNumber) {printf("Passed as ptr: %i\n", *ThePassedNumber);}
void    TheFunc_ref(int &ThePassedNumber) {printf("Passed as ref: %i\n", ThePassedNumber);}
...
TheFunc_ptr(&MyArray[5]);
TheFunc_ref(ElementFive);

Again, the wrong result happens only for the F5 (debug) run; for Ctrl F5, the output looks correct, which I find slightly ... mysterious :P

qWord

Re: Mysteries of C++ revealed
« Reply #10 on: December 16, 2013, 10:53:08 AM »
the author explicitly warns about invalid references

He should get a prize for the foggiest para of the year - and not a word on HeapReAlloc, the only relevant case ;-)
For a good reason: the C++ standard doesn't know HeapReAlloc, and realloc() exists for compatibility with C!
What should we think now? That C++ is bad because you picked out a not-that-detailed article? Or because you were not aware of this?

jj2007

Re: Mysteries of C++ revealed
« Reply #11 on: December 16, 2013, 10:58:09 AM »
I think the lesson is simply that references are much trickier than they seem. Pointers are more reliable - see my edit above, posted just a second before your last post.

So realloc is old-fashioned. Would the C++ compiler update the reference to element 5 if you insert an element at position 2 into a vector of integers?

qWord

Re: Mysteries of C++ revealed
« Reply #12 on: December 17, 2013, 03:14:01 AM »
I think the lesson is simply that references are much trickier than they seem.
It's not that hard: when using them, the syntax is like working with the object they refer to, but they behave like a const pointer (T *const): once bound, they cannot be made to refer to anything else.

Pointers are more reliable - see my edit above, posted just a second before your last post.
I have the impression that you didn't realize the problem with your code above: when I said you should replace the reference with a pointer, I was talking about "ElementFive". A reference simply can't be used here because of the reallocation. When using a pointer, you must detect the memory move and update ElementFive (exactly this is not possible with a reference).

So realloc is old-fashioned.
It is C-fashioned.

Would the C++ compiler update the reference to element 5 if you insert an element at position 2 into a vector of integers?
References are never "updated" - they always refer to the object they have been initialized with. Because std::vector is implemented as an array that is reallocated if required, all references, pointers and iterators are invalidated after insertion. (Strictly speaking, everything is invalidated if the insertion causes a reallocation; otherwise only what sits at or after the insertion point.) This is documented by the standard and by online references like www.cplusplus.com.
std::list (commonly a linked list) would fit your requirements btw.

jj2007

Re: Mysteries of C++ revealed
« Reply #13 on: December 17, 2013, 06:36:51 AM »
Would the C++ compiler update the reference to element 5 if you insert an element at position 2 into a vector of integers?
References are never "updated" - they always refer to the object they have been initialized with. Because std::vector is implemented as an array that is reallocated if required, all references, pointers and iterators are invalidated after insertion.

If I put the value 555h into element five, and obtain a reference to this "object", then I would expect that the reference to element five keeps referring to the "object" that contains 555h. Instead, the reference is apparently simply a fixed pointer to a fixed memory address, and gets invalidated when the compiler decides, for whatever reason, to move the whole object elsewhere. So the programmer must make sure that the "object" is never moved in memory...

qWord

Re: Mysteries of C++ revealed
« Reply #14 on: December 17, 2013, 10:06:50 AM »
You might simply stick with BASIC  ;)