Speed question [RESOLVED]

Started by mikorians, October 12, 2012, 05:12:39 AM

mikorians

Erm. So now I've made my first program and ran a simple benchmark to compare it with Visual Basic 6 (my present native language).

On my P4 3.6 GHz, VB6 takes about 276 seconds to count to 4 billion ($FFFFFFFF).
It took assembly language about 3.5 seconds. I had half expected to be able to double loop it for that amount of time...
This is wonderful and I'm not complaining, but it seems like the 3D graphics card must be running a lot faster than that to produce a complex scene?
Have I discovered a dark secret of the modern world?
Computers aren't actually that fast after all?
(My intended applications were to be 3D.)

dedndave

i think i measured a simple loop to FFFFFFFFh at about 3 seconds on my machine (3 GHz P4)
so - it sounds right

keep in mind - 4 billion of something is a lot   :biggrin:
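
For scale, the two timings quoted above can be turned into clock cycles per loop iteration. This is only a sketch of the arithmetic in C, using the numbers from the posts (3.6 GHz, 2^32 iterations, 3.5 s and 276 s); the helper name cycles_per_iteration is made up for illustration.

```c
#include <assert.h>

/* Average clock cycles spent per loop iteration, given the clock rate,
   the measured wall time, and the number of iterations.
   cycles_per_iteration is a hypothetical helper name, not from the thread. */
double cycles_per_iteration(double clock_hz, double seconds, double iterations)
{
    assert(iterations > 0.0);
    return clock_hz * seconds / iterations;
}
```

Called with 3.6e9, 3.5 and 4294967296.0 this gives about 2.9 cycles per iteration for the assembly loop; with 276.0 seconds it gives about 231 for the VB6 loop. So the assembly loop is already close to what a simple counted loop can be expected to cost on that CPU.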

mikorians

I know how big 4 billion is... Heh... I'm not trying to be greedy here, but it is a mystery to this noob
as to why the graphics card seems several orders of magnitude faster. I am also aware that it must be a simpler RISC-type processor, but... why that much faster?
If so, why aren't all CPUs GFX CPUs??? I guess we wouldn't process any faster with RISC; that was a bust, I think...
Anybody know the reason, how it's done, or why we can't write DirectX ourselves... bla bla...?

dedndave

lol
i wasn't implying that you didn't know that   :P

the question you are asking has a pretty involved answer, really
it is a matter of electronics hardware capability

the CPU has to interface with many different pieces of hardware and address a large range
so - the bus is "split" up several times
it has to handle interrupts, wait states, task switches, not to mention all the limitations of the OS

the GPU, on the other hand, only has a limited amount of memory to address - and that's it
the bus between the GPU and the memory is a direct connection
the GPU hardware can be optimized for a much more limited situation

mikorians

My initial goal - why I'm doing all of this:
I need to make what they call a 'triangle pusher' that works with DirectX.
My current system in VB6 seems to have reached a ceiling limit of 140k triangles at 47 FPS with 100 materials/textures.
I have seen much more complex scenes used in games, with tons of monsters, huge maps, and high frame rate.
read here - Serious Sam - for example...  Half-life, Halo, System Shock2

Now these apps were probably written in C --- which I despise, by the way.

I was at this for 30 years in VB6 and now I'm stuck at this ceiling limit.
Is it my hardware, my technique, what have I been doing wrong all these years that I cannot get my frame rate up?
P4 3.6 GHz, NVidia FX5500, Win98. It plays Half-Life 1, SS2, Halo, and Serious Sam.   :(
Part of why I ask is that a friend of mine, Mikle (Russian guy), said his computer did like 1400 FPS and probably isn't much different from mine.
Can I assume it's his graphics card?

Tedd

The graphics card is quite specialized - it has a fairly fixed pipeline, you just set a few parameters and push data in one end, and out come pretty pixels at the other. The CPU, on the other hand, could do any number of things depending on what the current instruction is, and the instructions in memory can change. Not to mention that it generally has 20 things to do at once. So, on bare performance, the GPU is much better suited and therefore considerably quicker at bashing triangles. However, that is all it's good for, while the CPU is much more general purpose - hence the speed difference on that specific task.

From the other side, you're using VB, which is interpreted when run in the IDE or compiled to p-code, so you'll lose a certain amount of speed from that. In a lower-level language, you get less overhead and therefore more speed. Which is why serious 3D games tend not to be written in VB.

And once you've reached the low-level limits, then you resort to tricks in order to avoid drawing things that are currently not visible on screen, and drawing distant things at lower resolution. And that's how you can have huge worlds with little slow down (the whole world is never drawn.)
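
The "don't draw what isn't visible" and "draw distant things at lower resolution" tricks can be sketched roughly as below. This is a minimal illustration in C, not how any particular engine does it: it assumes each object has a bounding sphere, it only rejects objects that are fully behind the camera or fully beyond a far distance, and every name (Vec3, Object, maybe_visible, pick_lod) is hypothetical.

```c
#include <assert.h>

typedef struct { float x, y, z; } Vec3;

typedef struct {
    Vec3  center;   /* bounding-sphere center in world space */
    float radius;   /* bounding-sphere radius */
} Object;

/* Distance of the object's center along the view direction.
   cam_dir is assumed to be unit length. */
float view_depth(const Object *o, Vec3 cam_pos, Vec3 cam_dir)
{
    float dx = o->center.x - cam_pos.x;
    float dy = o->center.y - cam_pos.y;
    float dz = o->center.z - cam_pos.z;
    return dx * cam_dir.x + dy * cam_dir.y + dz * cam_dir.z;
}

/* Returns 1 if the object might be visible, 0 if it can be skipped. */
int maybe_visible(const Object *o, Vec3 cam_pos, Vec3 cam_dir, float far_dist)
{
    float depth = view_depth(o, cam_pos, cam_dir);
    if (depth + o->radius < 0.0f)     return 0;  /* entirely behind the camera */
    if (depth - o->radius > far_dist) return 0;  /* entirely beyond the far distance */
    return 1;
}

/* Pick a level of detail from the depth: 0 is the full mesh,
   larger numbers mean coarser meshes, capped at max_lod. */
int pick_lod(float depth, float lod_step, int max_lod)
{
    int lod;
    assert(lod_step > 0.0f);
    if (depth <= 0.0f) return 0;
    lod = (int)(depth / lod_step);
    return lod > max_lod ? max_lod : lod;
}
```

A render loop would call maybe_visible for each object and submit only the survivors, using pick_lod to choose which mesh to send. A real engine would also test against the side planes of the view frustum; that is left out here to keep the sketch short.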
Potato2

Ryan

As far as speed in VB, did you compile to native code with the optimize for fast code option?  The speed would be very different if you simply ran the code in the IDE.

mikorians

Well, I haven't observed much of any difference in my compiled VB application.
I have seen for a fact that, while an entire map is not rendered, it has many more polygons in it than my program seems capable of displaying.

Am I really going to get say 50x more polygons rendered per second in Assy than VB?

I guess what I'm asking is: is this trip going to be worth it? Because the games that I play on this computer simply put my own app to shame after 3 decades of work on it.
I don't learn new things very fast.

dedndave

typically, when you are trying to make that much of an improvement, it requires a change in
your overall approach to a problem, rather than isolating a few lines of code that are slow

for example, i have a buffer...
when i add a new item, i move everything in the buffer and insert the new item
that design is going to be slow
i change my approach and use a circular buffer
now, instead of moving everything in the buffer, i simply change some pointers
this type of design change can make for a vast improvement

it is probably something in your basic design approach
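
The circular buffer idea above can be sketched like this in C. It is only an illustration under a few assumptions that are not from the post: a fixed capacity of 8 ints, and a policy of overwriting the oldest item when the buffer is full. The names Ring, ring_init, ring_push and ring_get are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define RING_CAPACITY 8   /* arbitrary size, chosen only for illustration */

typedef struct {
    int    items[RING_CAPACITY];
    size_t head;    /* index of the oldest item */
    size_t count;   /* number of items currently stored */
} Ring;

void ring_init(Ring *r)
{
    r->head = 0;
    r->count = 0;
}

/* Add an item in constant time. Nothing is moved: only the indices change.
   When the buffer is full, the oldest item is overwritten. */
void ring_push(Ring *r, int value)
{
    size_t tail = (r->head + r->count) % RING_CAPACITY;
    r->items[tail] = value;
    if (r->count < RING_CAPACITY)
        r->count++;
    else
        r->head = (r->head + 1) % RING_CAPACITY;  /* oldest item was overwritten */
}

/* Read the i-th item, counting from the oldest (i = 0). */
int ring_get(const Ring *r, size_t i)
{
    assert(i < r->count);
    return r->items[(r->head + i) % RING_CAPACITY];
}
```

Compared with shifting every element on each insert, ring_push does the same amount of work no matter how many items are stored, which is the kind of design change being described.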

jj2007

Quote from: mikorians on October 14, 2012, 03:13:22 AM
Am I really going to get say 50x more polygons rendered per second in Assy than VB?

That sounds extreme. As Dave wrote, often there is a design problem.
Typically, we can speed up a C runtime algo by a factor of 2 with hand-made assembler code. But there are (rare) cases where you cannot improve an algo simply because the C compiler produces exactly the same sequence. And there are cases where the speed gain is ridiculously high, meaning the C coders have been thoroughly asleep :biggrin:

Re painting a polygon, that is a call to a Windows API; usually so slow that you cannot gain anything with fast assembler before and after.

Why don't you make up a simple sample app in VB, e.g. a map with a number of polygons that you consider slow? Writing the assembler equivalent shouldn't be difficult. We'll see.

mikorians

I appreciate the advice, but I am, believe it or not, learning disabled. This project has been a great way to occupy my time and to help keep me sane, but only because I grew up learning BASIC on the old Atari and Apple systems of yore.
I would just PAY someone to help me out. I can afford only $200 a month because of my medical expenses.
I believe that Microsoft has led me astray with DirectX8, which offered tempting meshloader functions, and hopping from API to API was just tragic for me.
I should probably have been developing my own loaders. Of late I have been trying to understand proper mesh optimization, and I've gotta say it:
I finished the engine I have been writing about 3 months ago and have just been in tears trying to get anybody to assist me in some way.
It's in VB6 and everything was fine until I tried to make my largest map, which I thought was reasonably sized, and got a frame rate of 7 frames per second.
I didn't feel I was asking too much of my engine, but now it seems like it's just a piece of junk.
:(

hutch--

mikorians,

Never fear, no one is after your money, and that certainly will not happen in here. You are buying into some very complex technology with a games engine; no disrespect to you, but it usually takes a team of professionals a reasonable amount of time to get one right and fast enough. High performance in gaming usually comes with high-end hardware, and if you are using older hardware, software written for the later stuff may not be fast enough on an older box. Vaguely I remember different people complaining about how much slower the later DirectX versions were compared with the earlier ones.

As hardware gets faster the tasks in gaming get more complex which takes a lot more processor and video card grunt to make it work. If you are going to use older hardware then you may need to use older DirectX versions if you can still get them and don't try and do the stuff that needs high end hardware to make it work OK. VB is not the best of programming environments to interface with assembler, mainly due to its restricted calling conventions and data types. Normally if you are using a recent C compiler you can target some parts of the PC side of the code in assembler if you can get a gain out of it but it is a big task.

jj2007

Quote from: hutch-- on October 15, 2012, 03:04:12 AM
VB is not the best of programming environments to interface with assembler, mainly due to its restricted calling conventions and data types.

It is tricky, and I could test it only on VB for MS Word, but it works fine. An example is in the \masm32\MasmBasic\mb2vb folder of the MasmBasic library.

mikorians, it would be nice to see the map (and the VB6 code) that takes so long to draw. Nobody here will ask to get paid for it :icon14:

mikorians

OK. Took a breather and did a little hardware research.   :eusa_naughty:

My card is the NV FX5500 and is rated for 67.5 MILLION vertexes per second and my model is nowhere near that complex.
So in VB there isn't just a little overhead, there's a lot.
I was thinking I should write a benchmark program, but they've already done it to death.
If that number is to be believed, then I should be getting a much faster frame rate.   :eusa_snooty:
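
As a rough check on those numbers, here is the arithmetic in C. It is a sketch under one loud assumption: that every triangle sends 3 separate vertices (no sharing through index buffers), which is the worst case. The 67.5 million figure is the rated peak quoted above, and the 140k triangles at 47 FPS come from the earlier post; the helper name vertices_per_second is made up.

```c
#include <assert.h>

/* Vertices submitted per second for a scene drawn at a given frame rate.
   verts_per_triangle = 3 assumes no vertex sharing (worst case). */
double vertices_per_second(double triangles, double fps, double verts_per_triangle)
{
    assert(triangles >= 0.0 && fps >= 0.0 && verts_per_triangle > 0.0);
    return triangles * fps * verts_per_triangle;
}
```

With 140000 triangles, 47 FPS and 3 vertices per triangle this comes to 19,740,000 vertices per second, a bit under 30% of the rated 67.5 million. A rated peak is usually a best-case figure, so this says nothing definite about where the time goes; it only shows the scene is within the same order of magnitude as the card's quoted limit under that worst-case assumption.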

The point NOW is:   :lol:
I don't think I'd chalk 140K polygons up to hardware with that vertex rate, since we've already covered the fact that there are very nice games for this old machine.
I'm not talking about interfacing with VB anymore, let's hypothetically forget about it.
Now, if I'm not going to go at the video card through DirectX anymore, how do you people do it, and do you have some examples?

My money is still on the table for some sort of working library I can feel my way through, if there's anything or anybody who can help.

My method has been to just try and display a static scene of some complexity to see how complex I am allowed to go before there is a frame rate degradation.
I don't wish to make my engine very complex by doing any scene divisions or BSP whatever code until I know what I'm up against.

Rats, the attachment limit is 512K and my x-file is 3MB.

japheth


IMO you should first make sure that there's no problem with your computer. This thread

http://masm32.com/board/index.php?topic=773.0

tells me that there's something wrong ( both Masm v6.x and the DOS version of jwasm are supposed to work in Win98 without problems ).