Instruction Set Detection including AVX-512 sub-sets

Started by Gunther, December 26, 2017, 09:37:26 AM


aw27

The ASM code does not follow the 64-bit ABI rules and some of the procedures are running with the stack unaligned.
This is a simple comment, I know you can do much better otherwise would not be producing software to look for the Higs Bosom.  :(

johnsa

Quote from: hutch-- on December 27, 2017, 11:08:49 AM
I have this problem with the direction that the new Watcom forks are going in and that is trying to do too many things and trying to hold the hot little hand of the programmer instead of pointing them at the hard and complex stuff. At its most basic an assembler is a crude gadget to screw mnemonics together in the right order so it will run on a compatible processor and perform the task it was designed to do. I can see all sorts of good reasons to add familiar capacities for hacking through much of the high level code you need to produce in conjunction with normal mnemonic code, things like .IF, .SWITCH, procedure entry and exit, call automation (INVOKE and similar) and these can be appended on without compromising the crude basics of what an assembler is.

As soon as you go in the direction of complex error checking and the like you are starting to write a high level compiler and in fact this is how languages like early C started. "Just like assembler but easier". The more you hide from the programmer, the less useful the tool is and this would be very unfortunate as I know that a massive amount of work has been done to get these tools up and going. Complex capacity is something that should be done by programmers, not assembler designers in the assembler, if the basic accessories are done properly then a decent set of libraries adds the higher level capacities needed to make the tool address a much wider market. The assumption that the assembler's own internal high level code is better than what the assembler programmer writes is a dangerous one in that while it may be true in some instances, there are enough decent assembler programmers around who can make such assumptions false.

My comments break down to these,

1. Make sure the assembler IS an assembler, not a pseudo compiler.
2. Use modular design so that additional capacities are optional choices for the programmer.
3. Produce a decent PHUKING library or in fact many libraries, that is what made old C the major professional language for so many years.

I mostly agree. With the work I'm doing on UASM I try to break it up so that we cover the full spectrum:

1) For example, adding an OSX version with macho64 support just increases the availability and use of the tool.
2) Calling conventions for Borland register, Vectorcall and SystemV once again extend its usefulness.
3) Bug fixes, which can apply anywhere.
4) Enhancements (and this is where I guess most of the debate comes in). I try to ensure that no new feature compromises or breaks the existing functionality of the assembler,
    so some of these are implemented in the core of the assembler with this creed in mind, for example extending the way you can initialise floating point data or unions by specifying which sub-field to use:


UnionType union
_a fourfloats <?>
_b fourdwords <?>
ends

; traditionally this is all that was allowed: floating point initializers for floats, and only the first member of the union
mything1 UnionType < 1.0, 2.0, 3.0, 4.0 >
; but now we have:
mything2 UnionType < 1, 2, 3, 4> ; promote integers to float
mything3 UnionType._a < 1, 2, 3.0, 4.0 > ; specify the field directly and promote integers to floats
mything4 UnionType._b < 1, 2, 3, 4 > ; specify the field, normal dwords



So that is an example of where we try to extend the core assembler in a way that is backwards compatible.

For things that fall into the HLL category, the key is that they should never prevent the user from doing things their own way or at the lowest possible level, but provide a mechanism to simplify a lot of the
boilerplate code that even a veteran assembler programmer starts to find annoying. In this area I agree with Nidud's approach that borrowing familiar syntax from C is a good option, as long as we don't overdo it.

I'm also trying as far as possible to ensure that all of this HLL work either lives entirely in the built-in macro library, thus relying on the existing assembler core, or is implemented in combination with a light pre-pre-processor step. This way the HLL functionality is kept modular and independent from the assembler itself, and can be switched on/off or removed if required. The other great thing about stuff in the macro library is that you can replace any of the macros with your own at assemble-time, because macros can be redefined.

Anyway, that's just my thoughts on how we proceed. There have been a couple of things I've wanted to add but haven't for these reasons, as there is potential for either confusion or obfuscation of implementation. One of them is support for overloading procedures, which can work fine with HLL constructs but would be a total disaster with a direct low-level CALL, hence its exclusion from future plans.

We have to accept that as much as we love assembler, it needs to progress in SOME direction to attract more people, and even for those of us who don't need to be convinced about the joy of writing assembler it just needs to be made more amenable to working on large projects with less pain and fewer maintenance issues. This is what I have found in my experience at least. But I agree 100% that nothing new should ever be added that detracts from its purity or ability to operate at any required level of abstraction.

nidud

#17
deleted

nidud

#18
deleted

nidud

#19
deleted

Gunther

Hi nidud,

thank you for providing the new version. It works now without complaint.

Supported Features by Processor and Operating System
====================================================

Vendor String: GenuineIntel
Brand  String: Intel(R)Core(TM)i7-7820XCPU@3.60GHz

Instruction Sets
----------------

MMX  SSE  SSE2  SSE3  SSSE3  SSE4.1  SSE4.2  AVX  AVX2  AVX-512 F

It's safe to use the following AVX-512 extensions with this machine:
--------------------------------------------------------------------

AVX-512 DQ      : DWORD and QWORD instructions (conversion, transcendental support etc.)
AVX-512 CD      : Conflict Detection instructions offer additional vectorization of loops with possible
                  address conflicts.
AVX-512 BW      : Byte and Word support for 8- and 16-bit integers.
AVX-512 VL      : Vector Length instructions add vector length orthogonality, allowing most AVX-512 instructions
                  to also operate on XMM and YMM registers.

Any processor that implements any portion of the AVX-512 extensions MUST implement AVX-512 F. Some AVX-512 extensions
are currently only planned by Intel. Not every architecture has all the instruction sets built in:

Knights Landing provides      : CD, ER, and PF.
Skylake provides              : CD, BW, DQ, and VL.
For Cannon Lake are scheduled : CD, BW, DQ, VL, IFMA, and VBMI.
For Icelake are scheduled     : CD, BW, DQ, VL, IFMA, and VBMI.
For Knights Mill are scheduled: CD, ER, PF, 4FMAPS, 4VNNIW, and VPOPCNTDQ.

That's what Intel has released so far.
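
For readers who want to see where the subset list in the report comes from, here is a minimal C sketch (it is not the code from the attachment). It assumes the caller has already executed CPUID with EAX=7, ECX=0 and has read XCR0 via XGETBV; the bit positions are the ones documented in the Intel SDM, while the names avx512f_usable, avx512_list_subsets and the table layout are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Index into the register triple returned by CPUID leaf 7, sub-leaf 0. */
enum { REG_EBX, REG_ECX, REG_EDX };

struct avx512_subset {
    int         reg;   /* which output register holds the bit */
    unsigned    bit;   /* bit position inside that register   */
    const char *name;  /* short name as used in the report    */
};

/* Hypothetical table; bit positions follow the Intel SDM. */
static const struct avx512_subset AVX512_SUBSETS[] = {
    { REG_EBX, 17, "DQ"        }, { REG_EBX, 21, "IFMA"   },
    { REG_EBX, 26, "PF"        }, { REG_EBX, 27, "ER"     },
    { REG_EBX, 28, "CD"        }, { REG_EBX, 30, "BW"     },
    { REG_EBX, 31, "VL"        }, { REG_ECX,  1, "VBMI"   },
    { REG_ECX, 14, "VPOPCNTDQ" }, { REG_EDX,  2, "4VNNIW" },
    { REG_EDX,  3, "4FMAPS"    },
};

#define AVX512_F_BIT      16u    /* CPUID.(7,0):EBX bit 16 = AVX-512 F        */
#define XCR0_AVX512_STATE 0xE6u  /* XCR0 bits 1,2 (SSE, AVX) and 5,6,7 (AVX-512) */

/* 1 if the CPU reports AVX-512 F and the OS saves the full register state. */
int avx512f_usable(uint32_t ebx, uint64_t xcr0)
{
    return ((ebx >> AVX512_F_BIT) & 1u) &&
           (xcr0 & XCR0_AVX512_STATE) == XCR0_AVX512_STATE;
}

/* Stores the names of all reported subsets in out[] and returns how many. */
size_t avx512_list_subsets(const uint32_t regs[3], const char *out[], size_t max)
{
    size_t n = 0;
    assert(regs != NULL && out != NULL);
    for (size_t i = 0; i < sizeof AVX512_SUBSETS / sizeof AVX512_SUBSETS[0]; ++i) {
        const struct avx512_subset *s = &AVX512_SUBSETS[i];
        if (((regs[s->reg] >> s->bit) & 1u) && n < max)
            out[n++] = s->name;
    }
    return n;
}
```

With the F, DQ, CD, BW and VL bits set in EBX the list comes out as DQ, CD, BW, VL. The XCR0 test is the "safe to use" part: the CPUID bits alone do not say whether the operating system preserves the opmask and ZMM registers.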

Solid work.  :t

aw27,

Quote from: aw27 on December 28, 2017, 01:40:52 AM
The ASM code does not follow the 64-bit ABI rules and some of the procedures are running with the stack unaligned.
This is a simple comment, I know you can do much better otherwise would not be producing software to look for the Higs Bosom.  :(
I love that sort of comment. Allow me, however, the following minimum marginalia:

  • We do not write software to find the Higgs boson. We only help a bit with the reasonable archiving of the large amount of data. This is a very marginal task and would work without me. You shouldn't overvalue my role in any case.
  • My assembler procedures were directed by the compiler. The compiler takes care of the correct stack alignment for the function call, right?
  • The ASMC source is not mine.

Gunther
You have to know the facts before you can distort them.

aw27

Quote
My assembler procedures were directed by the compiler. The compiler takes care of the correct stack alignment for the function call, right?

Yes, for the function call but not for the external function itself.
This has to be done by the ASM routine itself, manually or by the assembler.
The way you declare the procedures will not help the assembler to do it for you. Do you remember that there is a USES clause?

Quote
The ASMC source is not mine
I am not talking about the ASMC part. Your ASM part could well crash the program if there were any API function call or instruction that requires alignment.
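
The alignment point can be checked with simple arithmetic. Under the Microsoft x64 convention the caller keeps RSP 16-byte aligned at the CALL, so the callee starts with RSP 8 bytes off, and every PUSH moves it by another 8. The C sketch below only models that bookkeeping; the function names are hypothetical and it is not taken from either program in this thread.

```c
#include <assert.h>

/* How far RSP is from 16-byte alignment after the prologue has run.
 * 8 bytes come from the return address pushed by CALL, 8 more from
 * every PUSH, and local_bytes from SUB RSP, local_bytes. */
unsigned rsp_misalignment(unsigned pushes, unsigned local_bytes)
{
    return (8u + 8u * pushes + local_bytes) % 16u;
}

/* Smallest SUB RSP amount >= wanted_locals that leaves RSP 16-byte
 * aligned. wanted_locals must already be a multiple of 8. */
unsigned aligned_frame_size(unsigned pushes, unsigned wanted_locals)
{
    unsigned size = wanted_locals;
    assert(wanted_locals % 8u == 0);
    if (rsp_misalignment(pushes, size) != 0)
        size += 8u;   /* one extra qword of padding fixes it */
    return size;
}
```

A procedure that pushes an even number of registers and reserves nothing is therefore still 8 bytes off. That is harmless as long as it makes no calls and uses no aligned stack accesses, which is the case the two posters are arguing about.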

Gunther

Quote from: aw27 on December 28, 2017, 06:12:06 AM
Yes, for the function call but not for the external function itself.
Fine, but my assembly language procedures are designed for a high-level language call.

Quote from: aw27 on December 28, 2017, 06:12:06 AM
Your ASM part could well crash the program if there were any API function call or instruction that requires alignment.
Right, but in the assembly language part, not a single API function is called. What the hell is going to crash that? Yes, if, and would and could, but that's hair splitting. All registers used by CPUID are saved in advance on the stack and restored at the end. More is really not required if the API is not used, right?

I already thought something about it. With very few changes (because of the different ABI), the procedure can also be used under SCO UNIX, BSD, Linux and OS X. That's why there is not a single API call in it. There really is not only Windows in this world. Sometimes you have to work in very heterogeneous computer clusters, no matter how they have historically evolved. Whether you like it or not doesn't matter.

Gunther
You have to know the facts before you can distort them.

aw27

Quote from: Gunther on December 28, 2017, 06:54:07 AM
Fine, but my assembly language procedures are designed for a high-level language call.
They are not, and you have no clear idea of what I am talking about; maybe we should be talking about chess today.

Quote
I already thought something about it. With very few changes (because of the different ABI), the procedure can also be used under SCO UNIX, BSD, Linux and OS X. That's why there is not a single API call in it. There really is not only Windows in this world. Sometimes you have to work in very heterogeneous computer clusters, no matter how they have historically evolved. Whether you like it or not doesn't matter.
From what I have seen in the last few days all the small programs you brought here have flaws. Instead of fixing them, you blame the tools and the people.

Gunther

aw27,

Quote from: aw27 on December 28, 2017, 07:15:55 AM
They are not, and you have no clear idea of what I am talking about;
Is it even worthwhile to go into detail here? First the C prototypes (lines 10 to 14 in the C source code).

extern long long int CpuVs(char *p);
extern long long int GetPbs(void);
extern void CopyPbs(char *p);
extern long long int Iset(void);
extern void GetAVX(long long int *p);

Then an example of the correct function call (line 111 in the C source).

featurenumber = Iset();

Those who can read clearly have the advantage.

Quote from: aw27 on December 28, 2017, 07:15:55 AM
maybe we should be talking about chess today.
You can do that in the Colosseum, with whomever you want, but please not in my thread.

Quote from: aw27 on December 28, 2017, 07:15:55 AM
From what I have seen in the last few days all the small programs you brought here have flaws.
Very nice, then do not use this code, no one is forcing you and certainly not me.

Quote from: aw27 on December 28, 2017, 07:15:55 AM
Instead of fixing them, you blame the tools and the people.

Is this what you mean by blaming the tools?
Quote from: Gunther on December 26, 2017, 09:37:26 AM
That's made with the latest release of UASM, an excellent tool that works without complaint, by the way.

Or is this perhaps what you mean by blaming the people?
Quote from: Gunther on December 27, 2017, 06:29:07 AM
Branislaw (aka Habran), Johnsa and you have done a great deal, hats off.

In that sense, I plead guilty, but who knows. Dixi et salvavi animam meam.

Gunther
You have to know the facts before you can distort them.

aw27

Quote from: Gunther on December 28, 2017, 09:51:26 AM
Quote from: aw27 on December 28, 2017, 07:15:55 AM
They are not, and you have no clear idea of what I am talking about;
Is it even worthwhile to go into detail here? First the C prototypes (lines 10 to 14 in the C source code).

extern long long int CpuVs(char *p);
extern long long int GetPbs(void);
extern void CopyPbs(char *p);
extern long long int Iset(void);
extern void GetAVX(long long int *p);

You don't understand the replies, or skip over them, let me repeat:
Yes, for the function call but not for the external function itself.
This has to be done by the ASM routine itself, manually or by the assembler.
The way you declare the procedures will not help the assembler to do it for you. Do you remember that there is a USES clause?

Quote from: Gunther on December 28, 2017, 09:51:26 AM
Quote from: aw27 on December 28, 2017, 07:15:55 AM
From what I have seen in the last few days all the small programs you brought here have flaws.
Very nice, then do not use this code, no one is forcing you and certainly not me.
Of course not. CPU extension detection code is everywhere, and I have long had software that reports up to AVX-512. Even so, it is strange you could not make your routines work properly.

Quote from: Gunther on December 26, 2017, 09:37:26 AM
That's made with the latest release of UASM, an excellent tool that works without complaint, by the way.
And you also said:
"
That's exactly Hutch's idea: bring the assembler back to the desktop. Especially young, inexperienced high-level language programmers will benefit from your work. The other side of the coin is: There are cases where these comfortable high-level language elements are even harmful.
"

You are looking at UASM as a tool for beginners, when it is not. Beginners learn rough MASM in school through a six month intensive course where they can't use HLL elements or parameters on the PROC line. They follow a book written by Kip Irvine.

You really live on another planet. The planet where companies hire developers and pay them 120000 euro per annum to develop PowerPoint add-ins without even checking their capabilities.

jj2007

Quote from: aw27 on December 28, 2017, 09:10:31 PM
You are looking at UASM as a tool for beginners, when it is not. Beginners learn rough MASM in school through a six month intensive course where they can't use HLL elements or parameters on the PROC line. They follow a book written by Kip Irvine.

No need for strong declarations here: UAsm is in practice 99% compatible with MASM, and 100% for the things that beginners use (there are only a few exotic cases in macros etc where UAsm differs from Masm). And while some beginners are apparently forced into that course, they are not a relevant target group IMHO. However, even these poor students can use UAsm as if it was Masm.

But one problem with the forks (UAsm + AsmC) is indeed that they have new fancy features that can cause trouble, either because the experts who are supposed to use them don't read the FM, or because the experts developing them got lost in too much detail ;)

aw27

Quote from: jj2007 on December 28, 2017, 09:23:59 PM
And while some beginners are apparently forced into that course, they are not a relevant target group IMHO. However, even these poor students can use UAsm as if it was Masm.
I am not criticizing the courses, they are good. Students can indeed learn a lot by coding this way:

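FastIntegerSQR PROC
   push ebp
   mov ebp, esp
   mov eax, [ebp+8]      ; first DWORD argument
   mov edx, [ebp+0ch]    ; second DWORD argument
   cmp eax, edx
   jbe @L1               ; unsigned compare: first <= second
   shr edx,1             ; halve the second argument
   jmp short @L2
@L1:
   shr eax, 1            ; halve the first argument
@L2:
   add eax, edx          ; result is returned in eax
   pop ebp
   ret 8                 ; callee removes both arguments (8 bytes)
FastIntegerSQR ENDP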
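
For comparison, a rough C rendering of what the routine above computes, assuming two unsigned 32-bit stack arguments and callee cleanup (which is what ret 8 suggests). The name fast_integer_sqr is only a transliteration of the label, and the sketch makes no claim about what the routine is meant to be used for.

```c
#include <stdint.h>

/* Halves the smaller of the two arguments (the first one on a tie)
 * and returns the sum, wrapping modulo 2^32 just like ADD does. */
uint32_t fast_integer_sqr(uint32_t a, uint32_t b)
{
    if (a <= b)      /* cmp eax, edx / jbe @L1 (unsigned) */
        a >>= 1;     /* @L1: shr eax, 1                   */
    else
        b >>= 1;     /* shr edx, 1                        */
    return a + b;    /* @L2: add eax, edx                 */
}
```

Six instructions of real work sit between the hand-written prologue and epilogue, which is the part a PROC line with parameters would otherwise generate.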

johnsa

Quote from: nidud on December 28, 2017, 04:04:48 AM
Quote from: johnsa on December 28, 2017, 02:32:33 AM
We have to accept that as much as we love assembler, it needs to progress in SOME direction to attract more people, and even for those of us who don't need to be convinced about the joy of writing assembler it just needs to be made more amenable to working on large projects with less pain and fewer maintenance issues. This is what I have found in my experience at least. But I agree 100% that nothing new should ever be added that detracts from its purity or ability to operate at any required level of abstraction.

I don't believe in compromise with regard to the extension of the HLL section: you either do it or you don't. This means that it has to be linguistically correct, rule-based, simple and logical. This is hard to achieve without using existing syntax. You may try doing it half way, making a few exceptions here and there. It will work but it won't stick in the long run.

As for taking advice from old farts hanging around this forum that may not be a good idea in this case. We are all typecast in our own habits of doing things and thus a bit anti progressive. So think about the children John: THE CHILDREN  :lol:

Agreed, it must be correct linguistically and logically, which is yet another good reason to borrow from C: it's familiar and we have a known set of lexing rules and expectations to follow.
I'm all for the children. I myself am still trying to convince mine to use my Amiga 1200 (which has Devpac 3.18 and all the h/w + rom kernel manuals) to do some funky copper and system take-over code ;) (note they're only 5 and 7, so it's hard to convince them to leave the xbox alone to code!)

hutch--

I don't see any problems with using at least some of the C format. It is as John said: don't overdo it or you will end up with the clutter and restrictions of C with its strong typing and other irritations. IF blocks are fine, SWITCH blocks work OK and give you clean code without loss, but exercise caution with pre-built loop code systems as they are internally obscure and are often hard to translate back to mnemonic code.

In 64 bit, having reliable pre-built stackframe procedure entry and exit is sensible, as the Microsoft ABI is complex to write manually and there is no real gain in doing so. It's easy enough to write the simple 4 args or less style of stackframe-free leaf procedures directly using registers, and if you want this automated, it works fine and with no loss simply as a macro. A word of wisdom here: instead of hard coding "invoke", implement the technique under another name and use a macro wrapper to get invoke, as this allows you to use the call automation code in a number of other ways, such as function form macros with return values.

In a crude pseudo macro,

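MyInvoke MACRO args:VARARG
    InternalCallAutomation args    ; the call automation under its own name does the work
    EXITM <rax>                    ; function form: the macro expands to the return register
ENDM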