passing a string array from c++ into a masm function that will be called by c++

Started by bobl, April 27, 2013, 10:51:15 PM

bobl

Over the moon with being able to use masm in its own environment for my stuff...
I've been looking at making a start and researching the best way to process an array of strings passed by c++ to a masm function. It's nagging me that in the example masm function that I've used so far the arguments appear as DWORDS in masm but are floats in C++? I fear I'm missing some rules of thumb here.
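On the DWORD/float point, a minimal sketch, assuming 32-bit x86 with IEEE 754 floats: DWORD in a MASM prototype only states that the argument is 4 bytes wide, not how those bits are to be read. A C++ float is also 4 bytes, so it passes through a DWORD slot unchanged. The helper name `float_bits` is made up for illustration.

```cpp
#include <cstdint>
#include <cstring>

// A float must be exactly DWORD-sized for this to hold (true on x86).
static_assert(sizeof(float) == sizeof(std::uint32_t), "float is not 32 bits here");

// Hypothetical helper: returns the raw 32-bit pattern that MASM sees as a DWORD.
std::uint32_t float_bits(float f) {
    std::uint32_t bits;
    std::memcpy(&bits, &f, sizeof bits);  // copies the bytes, no numeric conversion
    return bits;
}
```

`float_bits(1.0f)` is 0x3F800000. On the MASM side the same 4 bytes only behave as a number once they are loaded as a REAL4 (for example with `fld`); as a DWORD they are just that bit pattern.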

Re the strings. At present they're in a TStringList...which is a class. I have no idea how they're represented and am only familiar with PB's dynamic strings and null terminated strings which I am quite happy to convert the TStringList to.

To my simple mind it seems a good idea to pass the string array as an address to a lump of memory structured as follows
number of strings      4 bytes
len of 1st string      4 bytes
1st string             n bytes
len of 2nd string      4 bytes
2nd string             n bytes
etc
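
A minimal sketch of building that block, assuming the tokens have already been copied out of the TStringList into standard C++ strings (standard C++11 is used here rather than VCL types). The names `put_u32` and `pack_tokens` are hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <string>
#include <vector>

// Append a 32-bit value in native byte order (little-endian on x86).
static void put_u32(std::vector<unsigned char>& out, std::uint32_t v) {
    unsigned char raw[4];
    std::memcpy(raw, &v, 4);
    out.insert(out.end(), raw, raw + 4);
}

// Hypothetical helper: writes the count, then (length, bytes) for each string.
std::vector<unsigned char> pack_tokens(const std::vector<std::string>& tokens) {
    std::vector<unsigned char> out;
    put_u32(out, static_cast<std::uint32_t>(tokens.size()));
    for (const std::string& t : tokens) {
        assert(t.size() <= 0xFFFFFFFFu);  // the length field is only 4 bytes
        put_u32(out, static_cast<std::uint32_t>(t.size()));
        out.insert(out.end(), t.begin(), t.end());  // no terminating zero stored
    }
    return out;
}
```

The address to hand to the masm function would be `block.data()` of the returned vector. With this layout, reaching the k-th string means stepping over every string before it, because the strings have different lengths.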

I am aware that the intel microprocessors have all sorts of built-in short cuts for doing stuff and don't want to miss a trick. Therefore...could some kind soul put me straight on the most straight-forward way to go about this.
BTW the strings are going to be language tokens...that I need to convert to some sort of routine addresses.
I don't know if this extra information has any bearing on the approach.

Any guidance much appreciated.




Gunther

Hi bobl,

passing an array to an assembly language procedure isn't so hard: pass a pointer to the first array element and the array length. That's all your function needs.
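
A minimal sketch of what "pointer to the first element plus length" looks like from the C++ side. The function name `count_nonempty` and its job are hypothetical; in a real project only the `extern "C"` declaration would stay in C++ and the body would be the assembly procedure. A C++ stand-in body is included so the sketch is complete on its own.

```cpp
#include <cassert>
#include <cstddef>

// Declaration as C++ would see the assembly routine (C linkage, no name mangling).
extern "C" std::size_t count_nonempty(const char* const* strings, std::size_t n);

// C++ stand-in for the assembly body: counts strings whose first byte is non-zero.
extern "C" std::size_t count_nonempty(const char* const* strings, std::size_t n) {
    assert(strings != nullptr || n == 0);
    std::size_t hits = 0;
    for (std::size_t i = 0; i < n; ++i) {
        if (strings[i][0] != '\0') ++hits;
    }
    return hits;
}
```

In a 32-bit build both arguments are 4 bytes wide, so the matching MASM prototype would declare two DWORD parameters.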

Quote from: bobl on April 27, 2013, 10:51:15 PM
Re the strings. At present they're in a TStringList...which is a class.

That's the real problem. Writing external assembly language functions that operate on class members isn't easy. I wouldn't do that. If I remember correctly, Agner Fog describes that complicated technique in one of his manuals. The other way is to let the compiler do the dirty work and use the inline assembler.

Gunther
You have to know the facts before you can distort them.

jj2007

Can you post your C++ code with a sample array? I'm also curious how to do that :biggrin:

Just for fun, I googled string array ansi C - strings seem to be hilariously clumsy in C... ::)

bobl

I've just got a richedit box that I type the tokens into...delimited by space.
I was splitting them and putting them into a listbox....
and will again but I've stripped it out for now to process them in turn with Outer() in class cForth in unit2.
In the richedit box you need to type a couple of words in to see it in action.
Hope that helps

here's unit1.cpp

//---------------------------------------------------------------------------

#include <vcl.h>
#pragma hdrstop

#include "Unit1.h"
#include "Unit2.h"


//---------------------------------------------------------------------------
#pragma package(smart_init)
#pragma resource "*.dfm"
TForm1 *Form1;
//---------------------------------------------------------------------------
__fastcall TForm1::TForm1(TComponent* Owner)
: TForm(Owner)
{
}
//---------------------------------------------------------------------------

#define CR1 13
#define CR2 10
#define SPC 32

//remember to delete this
//you have to use new to create it
//TStringList * tkns = new TStringList;

void __fastcall TForm1::Button1Click(TObject *Sender)
{
    ShowMessage ("hi");

    //split program into tokens and put in listbox
    AnsiString src = RichEdit1->Lines->Text;
    char * pB = src.c_str();
    AnsiString buf = "";
    TStringList * tkns = new TStringList;

    for ( int i = 0; i < src.Length(); i++ ) {
        switch(*pB){
            case CR1 :
                if (*(pB+1) == CR2){
                    tkns->Add(buf);
                    //ListBox1->Items->Add(buf);
                    buf = "";
                    pB++;
                    i++;    //keep i in step with pB when skipping the LF
                }
                break;
            case SPC :
                tkns->Add(buf);
                //ListBox1->Items->Add(buf);
                buf = "";
                break;
            default :
                buf = buf + *pB;
        }
        pB++;
    }
    //text that doesn't end in a space or CR/LF leaves the last token in buf
    if (buf.Length() > 0) {
        tkns->Add(buf);
    }
    //How to see an individual string
    //ShowMessage(  tkns->Strings[1]   );

    cForth f;
    f.Outer(tkns);

    delete tkns;

}
//---------------------------------------------------------------------------

void __fastcall TForm1::FormKeyPress(TObject *Sender, char &Key)
{
//only works if form's keypreview is set to true
//Key must be with capital K in c builder
switch (Key) {
case 27 :  //esc
Close();
//don't trap else can't type stuff in richtext
// default:
// ShowMessage("don't know what key you pressed in TForm1::FormKeyPress");
}
}
//---------------------------------------------------------------------------



which uses unit2.cpp

//---------------------------------------------------------------------------
#pragma hdrstop
#include "Unit2.h"

#define CR1 13
#define CR2 10
#define SPC 32

cForth::cForth(){
Init();
}

void cForth::Init(){
ShowMessage("doing Init");
/*
char * p = src.c_str(); //c_str() returns byte array addr
for ( int i = 0; i < src.Length(); i++ ) {
if ( (*p == CR1) && (*(p+1) == CR2) ){
ListBox1->Items->Add("<CR>detected");
}
else if (*p == SPC){ListBox1->Items->Add("<SPC>detected");}
else {ListBox1->Items->Add(*p);}
p++;
}
*/
}


//only passing one in at a time means step
void cForth::Outer(TStringList * tkns){
for (int i = 0; i <= tkns->Count-1; i++) {
ShowMessage(tkns->Strings[i]);
}
}



unit1.h

//---------------------------------------------------------------------------

#ifndef Unit1H
#define Unit1H
//---------------------------------------------------------------------------
#include <Classes.hpp>
#include <Controls.hpp>
#include <StdCtrls.hpp>
#include <Forms.hpp>
#include <ComCtrls.hpp>
//---------------------------------------------------------------------------
class TForm1 : public TForm
{
__published: // IDE-managed Components
TButton *Button1;
TRichEdit *RichEdit1;
TRichEdit *RichEdit2;
void __fastcall Button1Click(TObject *Sender);
void __fastcall FormKeyPress(TObject *Sender, char &Key);
private: // User declarations
public: // User declarations
__fastcall TForm1(TComponent* Owner);
};
//---------------------------------------------------------------------------
extern PACKAGE TForm1 *Form1;
//---------------------------------------------------------------------------
#endif



unit2.h

//---------------------------------------------------------------------------

#ifndef Unit2H
#define Unit2H
//---------------------------------------------------------------------------
#include "system.hpp"  //for AnsiString

#include "Classes.hpp" //for TStringList

#include "Unit1.h"
#include "Dialogs.hpp"

class cForth {
public:
    static int pc;
    static int sub_pc;
    cForth();
    ~cForth(){}
    void Init();
    void Outer(TStringList *);
};

//---------------------------------------------------------------------------
#endif



project1.cpp

//---------------------------------------------------------------------------

#include <vcl.h>
#pragma hdrstop
//---------------------------------------------------------------------------
USEFORM("Unit1.cpp", Form1);
//---------------------------------------------------------------------------
WINAPI WinMain(HINSTANCE, HINSTANCE, LPSTR, int)
{
try
{
Application->Initialize();
Application->CreateForm(__classid(TForm1), &Form1);
Application->Run();
}
catch (Exception &exception)
{
Application->ShowException(&exception);
}
catch (...)
{
try
{
throw Exception("");
}
catch (Exception &exception)
{
Application->ShowException(&exception);
}
}
return 0;
}
//--------------

hutch--

If you need to pass a string array, you do it best by passing an array of pointers. Make the first element of the array the string count, followed by a pointer to each string, up to the number of items in the array. With a bit of algo design you can make this a single memory allocation and it will look something like this.


count,pointer1,pointer2,pointer3, ---- item1,0,item2,0,item3,0 etc ....


You just pass the address of the array and at the receivers end you read the count and use that number of pointers to the array members.
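
A minimal sketch of that single-allocation layout, assuming the tokens are available as standard C++ strings. The names `slot_t`, `build_table` and `table_get` are hypothetical. Each slot is pointer-sized, which is 4 bytes (a DWORD) in a 32-bit build.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string>
#include <vector>

// One slot holds the count or one pointer: a DWORD in a 32-bit build.
typedef std::uintptr_t slot_t;

// Lays out [count][ptr1..ptrN][item1 0 item2 0 ...] in one block.
// The pointers refer to the block's own storage, so the block must not be
// copied or resized after it has been built.
std::vector<unsigned char> build_table(const std::vector<std::string>& tokens) {
    const std::size_t n = tokens.size();
    std::size_t total = (n + 1) * sizeof(slot_t);            // count + pointers
    for (std::size_t i = 0; i < n; ++i) total += tokens[i].size() + 1;  // text + 0

    std::vector<unsigned char> block(total);                 // single allocation
    unsigned char* base = block.data();
    unsigned char* text = base + (n + 1) * sizeof(slot_t);   // strings start here

    slot_t count = n;
    std::memcpy(base, &count, sizeof count);
    for (std::size_t i = 0; i < n; ++i) {
        slot_t p = reinterpret_cast<slot_t>(text);
        std::memcpy(base + (i + 1) * sizeof(slot_t), &p, sizeof p);
        std::memcpy(text, tokens[i].c_str(), tokens[i].size() + 1);
        text += tokens[i].size() + 1;
    }
    return block;
}

// What the receiving end does: skip the count slot, then follow pointer i.
const char* table_get(const unsigned char* base, std::size_t i) {
    slot_t p;
    std::memcpy(&p, base + (i + 1) * sizeof(slot_t), sizeof p);
    return reinterpret_cast<const char*>(p);
}
```

The address passed to the assembly routine would be `block.data()`; the receiver reads the count from the first slot and indexes the pointers directly, without scanning the string data.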

jj2007

Hutch's solution is fine, of course, but it requires some work on the C++ end. I wonder if C++ already maintains a memory block with these pointers...

Could you zip up that project folder and attach it here? Preferably with an assembler stub that is passed a pointer to string element 0?

bobl

Hutch
Thanks for your advice...I'll try that.
I've been reading around strlen type functions in case null-terminated strings is the way to go.
With that sort of array it sounds like it might be.
I'll come back when I've got something to show.

jj2007
Yes. No problem. Please bear with me.

Tedd

Keep the pointers together, so you're not jumping around in memory just to find what you want -- which would spoil the cache. Alignment is the other issue; not mixing data type sizes helps with this.
Also, unless you expect tokens to be particularly long, 1 byte should be fine for the length (up to 255 chars.)

I'd probably go with something like:

numStrings  DWORD
pString1    DWORD
pString2    DWORD
...
pStringN    DWORD
0           DWORD       ;null pointer - indicates the end of the list

lenString1  BYTE        ;length of string in characters (actual byte length is +1)
String1     CHAR[]      ;zero-terminated to be used with C
lenString2  BYTE
String2     CHAR[]
...
lenStringN  BYTE
StringN     CHAR[]
Potato2

bobl

Tedd
Thanks very much for your post

jj2007
by asm stub...do you mean incorporate the obj of this and supply the first TStringList substring?

.586p
.model flat

asmstub PROTO C Arg1:DWORD

.CODE

asmstub PROC C Arg1:DWORD

;what do you want in here?

asmstub ENDP
END

jj2007

Quote from: bobl on April 28, 2013, 12:18:49 AM
asmstub PROC C Arg1:DWORD

;what do you want in here?

asmstub ENDP

A pointer to the first string in the array. But the stub is not the problem, the C++ code will be. That's not my strong point - most of the time my blood pressure rises when VC++ throws (an array of) errors at me. Pelles C is a bit gentler but no ++ there, it's plain ANSI C...

bobl


Gunther

Jochen, Steve, bobl,

the problem isn't the assembly language code or how to pass the array from C++ to the procedure. The tricky part is that:

Quote
Classes are coded as structures in assembly and member functions are coded as functions that receive a pointer to the class/structure as a parameter. It is not possible to apply the extern "C" declaration to a member function in C++ because extern "C" refers to the calling conventions of the C language which doesn't have classes and member functions. The most logical solution is to use the mangled function name.

The quote comes from Agner Fog's Optimizing subroutines in assembly language, section 7.4, p. 48 and 49. Agner continues:

Quote
The mangled function name ... is compiler specific. Other compilers may have other name-mangling codes. Furthermore, other compilers may put 'this' on the stack rather than in a register.

Bobl, you should read that section. Agner has code samples and good explanations for what you would like to do. I've provided a link to Agner's manuals in my first post above.
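
An alternative to the mangled-name route in the quote, sketched under assumptions: keep the member function in C++ and let it forward plain data to a free function with C linkage, so the assembly side never deals with 'this' or a compiler-specific name. The class `Tokens` and the function `sum_lengths` are hypothetical, and a C++ stand-in body replaces the assembly procedure so the sketch is complete.

```cpp
#include <cassert>
#include <cstddef>

// Would be implemented in MASM; C linkage means no C++ name mangling.
extern "C" int sum_lengths(const std::size_t* lengths, std::size_t n);

// C++ stand-in body so the sketch links without an assembler.
extern "C" int sum_lengths(const std::size_t* lengths, std::size_t n) {
    assert(lengths != nullptr || n == 0);
    int total = 0;
    for (std::size_t i = 0; i < n; ++i) total += static_cast<int>(lengths[i]);
    return total;
}

class Tokens {
public:
    Tokens(const std::size_t* lengths, std::size_t n) : lengths_(lengths), n_(n) {}
    // The member function only unpacks the object and calls the C-linkage routine.
    int total() const { return sum_lengths(lengths_, n_); }
private:
    const std::size_t* lengths_;
    std::size_t n_;
};
```

The cost is one extra call per use; the benefit is that the assembly code does not depend on how a particular compiler mangles names or passes 'this'.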

Gunther
You have to know the facts before you can distort them.

bobl

Gunther
I just came back to say that for the last 30 minutes all my attempts to get the address of the first string in the list
have resulted in a value of 1, which looks more like an index than an address. Your comments look like an explanation.
I shall be investigating those.
Thank you very much for the links.

void cForth::Outer(TStringList * tkns){
    char * p = tkns->Strings[0].c_str();
    ShowMessage(p);                       //this is printing the string fine i.e. as if p is an address
    ShowMessage(IntToStr(p));       //this is giving me 1 which doesn't look like an address
}
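
A minimal sketch in standard C++ of getting at the numeric value of a char pointer. One possible reason for the 1, not verified against BCB's IntToStr overloads, is that the pointer was treated as a truth value rather than as a number; the last function shows that effect. All three helper names are hypothetical.

```cpp
#include <cstdint>
#include <cstdio>
#include <string>

// The numeric value of a pointer, i.e. the address itself.
std::uintptr_t address_of(const char* p) {
    return reinterpret_cast<std::uintptr_t>(p);
}

// The address formatted as text, e.g. for a message box or a log line.
std::string address_text(const char* p) {
    char buf[32];
    std::snprintf(buf, sizeof buf, "%p", static_cast<const void*>(p));
    return buf;
}

// A non-null pointer used as a truth value collapses to 1, whatever its address.
int as_truth_value(const char* p) {
    return p ? 1 : 0;
}
```

In BCB itself, an explicit cast along the lines of `IntToHex((int)p, 8)` should show the address in a 32-bit build; that line is an assumption and is not taken from the project above.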


jj2007
That aside, I don't think my asm stub was right, so I'd appreciate an example of what you were expecting for the asm stub and I'll try that.


bobl

My goodness...BCB projects are big compared to PB ones to the extent that it's well over 512Kb, as is,
and therefore unpostable.
I'm going to have to look into this because it is useful to be able to post such stuff.
In fact I think I'll retire for the day, regroup, and have a good run at this tomorrow.
See you then and thank you for helping me to get this far.

Edit:
I don't like finishing off on a low note and getting 1 is precisely that....
I've been playing in delphi 3 and get a ptr's value doing this...

procedure TForm1.Button1Click(Sender: TObject);
var
  myString  : string;
  ptrString : PString;
begin
  // Set up variable values
  myString := 'Hello there';
  ptrString := Addr(myString);
  ShowMessage(Format('Pointer          = %p', [Addr(myString)]));
  ShowMessage('myString : '+ptrString^);
end;

end.

I'll see whether this is transferable to BCB tomorrow.

qWord

you might try to adapt the following lines:
typedef struct {
const char* data;
size_t count;
} raw_str;
//MASM:
//  raw_str struct
//   data PCHAR ?
//   count DWORD ?
//  raw_str ends
//...

// try {...
if(tkns->Count) {

raw_str* praw = new raw_str[tkns->Count];
for(int i=0; i < tkns->Count; i++) {
praw[i].data = tkns->Strings[i].c_str();
praw[i].count = tkns->Strings[i].Length();
}

// call MASM function
// e.g.: Cpp:  extern "C" bool foo(raw_str* p,size_t n)
//       MASM: foo PROC/PROTO C p: ptr raw_str,n:DWORD
//foo(praw,tkns->Count)

delete [] praw;
} //...
MREAL macros - when you need floating point arithmetic while assembling!