Amazed or Mortified?

Started by TBRANSO1, January 24, 2019, 03:44:48 PM


AK_AK

...If I understand you [Hutch] correctly, you are describing why, with pure assembler, there is no need for TLS [TLS is Win32 programming].
Going through the literature, straddling MASM and Win32 programming is a way to turn something simple into something unnecessarily complicated.

What I found interesting is what seems to be a bunch of Win32 functions [such as TLS] that are stopgaps for very specific shortcomings of the Win32 language.

Pure MASM seems not to have this problem. The condition of mentally straddling both languages came about for me when I began writing MASM code while making calls to the Windows API functions from an assembler routine.

Perhaps this may explain why I'm so scatterbrained about it for the time being?

hutch--

It's more the case that the reference material is all over the place and caters for a variety of different languages, whereas assembler code is relatively free of many of the mechanisms of higher level languages. You lose the familiarity of the high level language to gain access to many other things. It is often the case that something that is useful in one language simply has no value in another.

In assembler you are free of most of these mechanisms but have to construct your own. It is more work but it gives you the control you need to do different things. Much high level code that you have to deal with to interact with the OS is like shoveling sewerage where the real deal is architectural freedom and algorithms written in pure mnemonics when you are chasing speed and/or power.

TBRANSO1

Quote from: hutch-- on January 27, 2019, 11:36:38 AM
There are a few tricks when creating multiple threads with the same data. With CreateThread() you pass a structure full of whatever you need the thread to have, the caller then runs a spinlock that waits for a reply from the newly created thread and when the reply is received, it then creates the next one. Without this the structure is overwritten by the following thread creation.

I will tackle this exercise next, sounds interesting.
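
For anyone following along, the handshake described in the quote can be sketched roughly like this. It is a portable C analogue and not Win32 code: pthread_create stands in for CreateThread(), a C11 atomic flag polled with sched_yield() stands in for the spinlock, and the names StartArgs, worker, run_workers and MAX_WORKERS are made up for the illustration.

```c
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>

#define MAX_WORKERS 16              /* hypothetical limit for this sketch */

/* The single structure that is reused for every thread creation. */
typedef struct {
    int        id;                  /* example payload for the new thread */
    int       *results;             /* where each worker writes its output */
    atomic_int copied;              /* the "reply": set once the worker has its own copy */
} StartArgs;

static void *worker(void *param)
{
    StartArgs *shared = param;

    /* Copy everything needed out of the shared structure first... */
    int  id      = shared->id;
    int *results = shared->results;

    /* ...then reply, so the caller is free to overwrite the structure. */
    atomic_store(&shared->copied, 1);

    results[id] = id * 10;          /* from here on, only private copies are used */
    return NULL;
}

/* Starts `count` workers from one reused StartArgs. Returns 0 on success, -1 otherwise. */
int run_workers(int count, int *results)
{
    pthread_t threads[MAX_WORKERS];
    StartArgs args;
    int started = 0;

    if (count < 0 || count > MAX_WORKERS)
        return -1;

    args.results = results;
    atomic_init(&args.copied, 0);

    for (int i = 0; i < count; i++) {
        args.id = i;
        atomic_store(&args.copied, 0);
        if (pthread_create(&threads[i], NULL, worker, &args) != 0)
            break;
        started++;

        /* Spin until the new thread replies; only then is it safe to
           change args.id for the next thread. */
        while (!atomic_load(&args.copied))
            sched_yield();
    }

    for (int i = 0; i < started; i++)
        pthread_join(threads[i], NULL);

    return started == count ? 0 : -1;
}
```

Calling run_workers(4, results) on an int results[4] should leave each element equal to its index times 10. If the wait loop is removed, a worker can read args.id after the caller has already changed it, which is the overwrite problem the quote warns about.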

Lots of gems in this thread by you guys.

Hutch,

As for your libraries, they are extensive.  Since I created the PowerShell script to search for functions, I have found mostly what I am looking for... the cool thing is that the returns from the script give me the library and the line numbers where the functions are mentioned.

I'm a noob to the Win32 API and Windows programming.  The naming conventions were a tough nut to crack at first, but I'm getting it down.  I've got a PDF copy of Petzold's famous book, so I need to go through that and learn more.

The confusing thing about all of the Microsoft DLLs is that there are so many DLLs, and each one from each generation seems to change the data types and functions a little bit.  And I realize that since the Windows OS has been a target of malware for the past 4 decades, MS has always had to create new functions to deal with security risks and the new era of multiprocessor, concurrent and parallel programming.  It's hard to distinguish what was written in the DOS era, Win32 era, NT era, and now the x64 UWP era. The Linux/Unix libraries are a little more codified and packaged tightly, so dealing with POSIX is a lot easier. But I am still learning architecturally where they are in the hierarchy, what they do, where they belong, etc... but now I've got a better understanding that your libraries are Kernel focused, which is cool... no need to carry extra baggage around, right?  :exclaim:

Quote
Perhaps this may explain why I'm so scatterbrained about it for the time being?

me, too

AK_AK

here is something i recently found at the end of some breadcrumbs.


https://en.wikibooks.org/wiki/Windows_Programming

and there is this fellow, who has some advanced topics that i would recommend for later:

https://www.agner.org/optimize/

Some of the links on that site might be dead, but the meat and bones are there, and it's a good piece, in my opinion, to include in your programming/development resources.

Thanks again Hutch, Tbrans et al.

TBRANSO1

Quote from: AK_AK on January 27, 2019, 01:07:59 PM
here is something i recently found at the end of some breadcrumbs.


https://en.wikibooks.org/wiki/Windows_Programming

and there is this fellow, who has some advanced topics that i would recommend for later:

https://www.agner.org/optimize/

Some of the links on that site might be dead, but the meat and bones are there, and it's a good piece, in my opinion, to include in your programming/development resources.

Thanks again Hutch, Tbrans et al.

Yes, I have those links.  The hours of the day are the limit.  There are a lot of resources devoted to MS code, just takes time and practice to digest it all.

I have seen the Agner site, too highfalutin for me right now.  I have just barely got to the intermediate level in the use of the six 32-bit registers, and am getting started on the 64-bit ones... trying to use the FPU, SSE, ST and XMM registers is next.  I'm not a math person, so I don't have the imagination or ability to think of code to use those fancy registers.  I'd just like to be able to compute a very long Fibonacci or Factorial series beyond the limits of 64 bits, like in Python or the Boost libraries, where you can get a number a page long.
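
As a rough idea of what that takes, here is a minimal sketch in C of Fibonacci numbers past 64 bits. It assumes nothing more than schoolbook addition with carry over an array of "limbs", each holding nine decimal digits so printing stays trivial. The names BigNum, big_set, big_add, big_to_string and fibonacci, and the limits LIMB_BASE and MAX_LIMBS, are invented for this example and are not how Python or Boost do it internally.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define LIMB_BASE 1000000000u   /* each limb holds 9 decimal digits */
#define MAX_LIMBS 128           /* room for 1152 decimal digits */

typedef struct {
    uint32_t limb[MAX_LIMBS];   /* least significant limb first */
    size_t   used;              /* number of limbs in use, at least 1 */
} BigNum;

void big_set(BigNum *n, uint32_t value)
{
    assert(value < LIMB_BASE);
    memset(n, 0, sizeof *n);    /* unused limbs must stay zero for big_add */
    n->limb[0] = value;
    n->used = 1;
}

/* dst += src. Returns 0 on success, -1 if the sum needs more than MAX_LIMBS. */
int big_add(BigNum *dst, const BigNum *src)
{
    size_t count = dst->used > src->used ? dst->used : src->used;
    uint32_t carry = 0;

    for (size_t i = 0; i < count; i++) {
        /* At most 2 * (LIMB_BASE - 1) + 1, which still fits in 32 bits. */
        uint32_t sum = dst->limb[i] + src->limb[i] + carry;
        carry = sum >= LIMB_BASE;
        dst->limb[i] = carry ? sum - LIMB_BASE : sum;
    }
    if (carry) {
        if (count == MAX_LIMBS)
            return -1;
        dst->limb[count++] = 1;
    }
    dst->used = count;
    return 0;
}

/* Writes n as decimal text. Returns out, or NULL if out_size is too small. */
char *big_to_string(const BigNum *n, char *out, size_t out_size)
{
    if (out_size < n->used * 9 + 1)
        return NULL;

    /* Top limb without padding, every lower limb padded to 9 digits. */
    int len = sprintf(out, "%u", (unsigned)n->limb[n->used - 1]);
    for (size_t i = n->used - 1; i-- > 0; )
        len += sprintf(out + len, "%09u", (unsigned)n->limb[i]);
    return out;
}

/* Computes F(index) with F(0) = 0 and F(1) = 1. Returns 0 on success, -1 on overflow. */
int fibonacci(unsigned index, BigNum *result)
{
    BigNum a, b;                /* a = F(i), b = F(i + 1) */
    big_set(&a, 0);
    big_set(&b, 1);

    for (unsigned i = 0; i < index; i++) {
        if (big_add(&a, &b) != 0)           /* a becomes F(i + 2) */
            return -1;
        BigNum t = a; a = b; b = t;         /* a = F(i + 1), b = F(i + 2) */
    }
    *result = a;
    return 0;
}
```

With this, fibonacci(100, &f) followed by big_to_string gives 354224848179261915075, which no longer fits in 64 bits. The carry loop in big_add is the part that maps most directly onto assembler, where a run of ADC instructions over binary limbs does the same job.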


jj2007

Quote from: TBRANSO1 on January 27, 2019, 12:41:51 PM
The confusing thing about all of the Microsoft DLLs is that there are so many DLLs, and each one from each generation seems to change the data types and functions a little bit.

You can do all your coding with a 20 year old Win32.hlp file. Try that with any other OS 8)

Quote
The TlsAlloc function allocates a thread local storage (TLS) index. Any thread of the process can subsequently use this index to store and retrieve values that are local to the thread.

DWORD TlsAlloc(VOID)


Parameters

This function has no parameters.

Return Values

If the function succeeds, the return value is a TLS index.
If the function fails, the return value is 0xFFFFFFFF. To get extended error information, call GetLastError.

Remarks

The threads of the process can use the TLS index in subsequent calls to the TlsFree, TlsSetValue, or TlsGetValue functions.
TLS indexes are typically allocated during process or dynamic-link library (DLL) initialization. Once allocated, each thread of the process can use a TLS index to access its own TLS storage slot. To store a value in its slot, a thread specifies the index in a call to TlsSetValue. The thread specifies the same index in a subsequent call to TlsGetValue, to retrieve the stored value.

The constant TLS_MINIMUM_AVAILABLE defines the minimum number of TLS indexes available in each process. This minimum is guaranteed to be at least 64 for all systems.
TLS indexes are not valid across process boundaries. A DLL cannot assume that an index assigned in one process is valid in another process.
A DLL might use TlsAlloc, TlsSetValue, TlsGetValue, and TlsFree as follows: ...

TBRANSO1

@Jochen,

Yeah, after looking into this... I got the drift of the underpinnings of the TLS.

The TLS is an array data structure with slots of a size to contain the global data of the programmer's choosing (or it can be done automagically).  Each thread then gets the data, plays with it in a local variable, then sets it back.  It's just getting access to the slot for that time; I'm sure behind the scenes the array data structure is a synchronized queue.
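
For what it's worth, the quoted help text can be read as the following toy model, written in C. It is a simplified sketch and not the Windows implementation: the index is handed out once per process, while every thread carries its own private array of slots, so in this model getting or setting a slot touches only the calling thread's array and needs no lock or queue. All names here (ModelProcess, ModelThread, the model_tls_* functions, SLOT_COUNT, NO_INDEX) are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SLOT_COUNT 64               /* mirrors the guaranteed minimum in the quoted text */
#define NO_INDEX   0xFFFFFFFFu      /* the failure value quoted for TlsAlloc */

/* Process-wide state: only records which indexes are taken. */
typedef struct {
    uint64_t used_bitmap;
} ModelProcess;

/* Per-thread state: every thread owns a private array of slots. */
typedef struct {
    void *slot[SLOT_COUNT];
} ModelThread;

/* Like TlsAlloc in spirit: reserve the lowest free index for the whole process. */
uint32_t model_tls_alloc(ModelProcess *p)
{
    for (uint32_t i = 0; i < SLOT_COUNT; i++) {
        if (!(p->used_bitmap & (UINT64_C(1) << i))) {
            p->used_bitmap |= UINT64_C(1) << i;
            return i;
        }
    }
    return NO_INDEX;
}

/* Like TlsFree in spirit: make the index available again. */
void model_tls_free(ModelProcess *p, uint32_t index)
{
    assert(index < SLOT_COUNT);
    p->used_bitmap &= ~(UINT64_C(1) << index);
}

/* Like TlsSetValue in spirit: write into the calling thread's own slot. */
void model_tls_set(ModelThread *t, uint32_t index, void *value)
{
    assert(index < SLOT_COUNT);
    t->slot[index] = value;
}

/* Like TlsGetValue in spirit: read back the calling thread's own slot. */
void *model_tls_get(const ModelThread *t, uint32_t index)
{
    assert(index < SLOT_COUNT);
    return t->slot[index];
}
```

Two ModelThread objects that store different pointers under the same index each read back their own pointer, which is the whole point: the index is shared, the storage is not.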

I think the structure is interesting, as it allows different functions anywhere to access the global indexed structure and pull out the data, work on it, and store it back. It seems like a monitor in HLL of sorts... the information is not hanging its nuts out in the breeze for everyone to kick them.  :icon_redface:  how rude of me, sorry... LOL



AK_AK

That sounds like you have an understanding to the extent required to be able to use this tool for a purpose.
Good stuff, TBRANS.

I'm wondering if you have encountered the reason DLLs exist in the first place? There is a bare minimum set you can get away with.
That is the place where the Win API lurks and sleeps until called on.
You can even roll your own DLLs, and this was an industry standard at the time, also a source of security compromise and a major frustration known as DLL Hell.  Basically, DLLs are a library of executable routines.

I have read that DLLs have to do with page limits in the CPU.
You want the main routine to be small enough to fit in the CPU cache, and call out to DLL segments that will fit in cache as well.
If you smashed it all together you don't fit in the CPU cache, and you have a massive performance penalty.

Sound good?

felipe

You will never have all that you want in the cache memory. DLLs are good for making the executable size smaller and for making changes in the routines without affecting the programs that use them... :idea:

hutch--

I think you need to know where DLLs came from. Long long ago Windows would run in 2 megabytes of ram and that meant extremely efficient code to do that. A DLL is a very good technique for reducing memory demands in that you can load, use then remove the DLL so that you don't have a pile of dead code in memory that is not being used most of the time. In the DOS days you could shell out to another program, get it to do something then shut it down but the level of inter-communication was very poor, generally done with temp disk files.

DLL hell comes from elsewhere. When Microsoft started to release different versions of a particular DLL, you could get bitten by not having the right version, but if you wrote a DLL for a specific task, it was not affected by Microsoft's stuffups. They are still a very efficient method of extending an application where you use the "Load on call" technique, perform the task required then release the DLL and all of its resources. It also allows you to share a capacity between apps so that all of them don't have to have a pile of dead code that is barely ever used.

AK_AK

..Yes, I remember the time when every software manufacturer and their mother would write a custom DLL for their applications, and version mismatch was a repeated issue [DLL HELL].  The use of a DLL was a way to avoid reinventing a wheel or a screwdriver or a can opener.  You just pack the things [routines] into a library and they are available for repetitive use by many applications.  DLLs had the luxury of being large, but each individual routine or resource was optimized in size to fit a particular memory space.

If the code calling the DLL routine was kept small enough to fit in the CPU cache, there was a speed boost, as the cache is fast compared to RAM.
If the DLL routine was also small enough to fit in the cache, or in a cache page, then the speed boost was preserved.
When a piece of code is too big to fit entirely in cache, time is consumed rewriting the CPU cache.
SO.. a cache sized piece of code that often calls on cache sized pieces of code is supposed to be faster than a piece of code that is bloated with identical routines peppered throughout the main routine.

I forget where this was; I think it was an optimization technique from optimization for IA32 programming. The overall goal was supposed to be keeping your code running in the CPU cache, and not spending time rewriting the cache.

TBRANSO1

I was on Bytepointer. I happened to be poking around, and found a document related to the PE and the Linker.  In the 500 or more pages, I was just zipping through it, since most of it was beyond my grasp at the moment, but I saw TLS and stopped.

So, this is the way TLS works if we're using it.  When it finds that the developer is calling upon the TLSIndex call and TlsAlloc, the linker creates a special section below the .data section called .tls. It is here that the special data structure offset is placed, along with the pointer to the heap where it is located.  The number of slots is determined by the developer or by default.  So, this is where the indices and data are placed for threads that are using this function.


TimoVJL

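#include <stdio.h>

__declspec(thread) int tls_i = 1;   /* one copy of tls_i per thread */

int __cdecl main(void)
{
    printf("tls_i: %d\n", tls_i);
    return 0;
}

_main:
00000000  55                       push ebp
00000001  8BEC                     mov ebp, esp
00000003  A100000000               mov eax, dword ptr [__tls_index]      ; TLS index of this module
00000008  648B0D00000000           mov ecx, dword ptr fs:[__tls_array]   ; this thread's array of TLS block pointers
0000000F  8B1481                   mov edx, dword ptr [ecx+eax*4]        ; this thread's TLS block for the module
00000012  8B8200000000             mov eax, dword ptr [_tls_i]           ; opcode 8B 82 is mov eax, [edx+disp32]: tls_i inside that block
00000018  50                       push eax                              ; second printf argument: the value of tls_i
00000019  6800000000               push $SG4295                          ; first printf argument: the format string
0000001E  E800000000               call _printf
00000023  83C408                   add esp, 8h                           ; cdecl: caller removes the two arguments
00000026  33C0                     xor eax, eax                          ; return 0
00000028  5D                       pop ebp
00000029  C3                       ret

May the source be with you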

TBRANSO1

Quote from: TimoVJL on January 30, 2019, 03:35:47 AM

00000008  648B0D00000000           mov ecx, dword ptr fs:[__tls_array]


Right there! Unraveled the mystery.

Further, I read that this convention is agreed upon by all compiler / assembler writers, and platform agnostic, so it works on every platform OS with their respective Kernel functions.


LordAdef

What a nice thread!!!
Very educational.
I intend to do some threading in the future, for my little game. But really... I am way below my limit in performance, doing GDI stuff instead of DirectX AND doing a sleep to keep my CPU from frying.
I am porting to DirectX (with the help of Marinus) and later see what I can do with some side threads. Just for the kick. Sometimes I wonder if the hassle is worth it.
Soon I will bother you guys :eusa_dance: