Author Topic: Getting html content of google search  (Read 23534 times)

dedndave

  • Member
  • *****
  • Posts: 8823
  • Still using Abacus 2.0
    • DednDave
Re: Getting html content of google search
« Reply #30 on: December 04, 2013, 02:45:39 PM »
 :redface:

JZ and JE are the same opcode

Siekmanski

  • Member
  • *****
  • Posts: 1927
Re: Getting html content of google search
« Reply #31 on: December 04, 2013, 03:01:08 PM »
Oooh yeahhh   ::)

I really need a break, it's 5 AM and I need to go to bed  :biggrin:
Creative coders use backward thinking techniques as a strategy.

dedndave

  • Member
  • *****
  • Posts: 8823
  • Still using Abacus 2.0
    • DednDave
Re: Getting html content of google search
« Reply #32 on: December 04, 2013, 03:13:53 PM »
me too - see you tomorrow, Marinus   :t

Siekmanski

  • Member
  • *****
  • Posts: 1927
Re: Getting html content of google search
« Reply #33 on: December 05, 2013, 11:55:19 AM »
Thanks guys for testing and helping out.  :t
I'm done now with the search routines.
Now I can use them to search for album covers in my program.

It's also nice that you can search for fixed sizes, in this case images of 512 by 512 pixels.

final version,

Magnum

  • Member
  • *****
  • Posts: 2304
Re: Getting html content of google search
« Reply #34 on: December 05, 2013, 12:10:45 PM »
Excellent job.  :t

What is the average download time for that file size?

Andy
Take care,
                   Andy

Ubuntu-mate-18.04-desktop-amd64

http://www.goodnewsnetwork.org

Siekmanski

  • Member
  • *****
  • Posts: 1927
Re: Getting html content of google search
« Reply #35 on: December 05, 2013, 12:40:03 PM »
Thanks  :biggrin:

I think it depends on the speed of your Internet connection.
But you can search for different image sizes.

Magnum

  • Member
  • *****
  • Posts: 2304
Re: Getting html content of google search
« Reply #36 on: December 05, 2013, 01:35:57 PM »
Does the code search for rammstein.jpg or just download that file if it finds it ?

Andy

Siekmanski

  • Member
  • *****
  • Posts: 1927
Re: Getting html content of google search
« Reply #37 on: December 05, 2013, 01:56:20 PM »
if you want to search for album cover art,

artist = Rammstein
album = Sehnsucht

the search phrase is then this one; it searches for album art images with an exact size (512 by 512):

/search?q=rammstein sehnsucht album cover&tbm=isch&tbs=isz:ex,iszw:512,iszh:512

But you can search for any image you want,

search for panda?

/search?q=panda&tbm=isch

In the source code the save name "rammstein.jpg" is fixed, and the program saves the first image it finds (for testing only).
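
For anyone reusing these search strings outside the attached source, here is a minimal sketch of how the path could be assembled. It is Python rather than MASM, and the function name build_image_search_path is made up for illustration. It assumes the spaces in the phrase should be URL-encoded before the request is sent, and it only uses the parameters quoted above (q, tbm=isch, and tbs=isz:ex,iszw:...,iszh:...).

```python
from urllib.parse import urlencode


def build_image_search_path(query, width=None, height=None):
    """Build a Google image-search path like the ones quoted in this post.

    Hypothetical helper: only the parameter names (q, tbm, tbs) come from
    the thread, everything else is an illustration.
    """
    params = {"q": query, "tbm": "isch"}  # tbm=isch selects image search
    if width is not None and height is not None:
        # isz:ex asks for an exact size, iszw / iszh give width and height
        params["tbs"] = f"isz:ex,iszw:{width},iszh:{height}"
    # Keep ':' and ',' readable; spaces in the phrase are encoded as '+'
    return "/search?" + urlencode(params, safe=":,")


if __name__ == "__main__":
    print(build_image_search_path("rammstein sehnsucht album cover", 512, 512))
    print(build_image_search_path("panda"))
```

The two calls at the bottom correspond to the album-cover search and the panda search above, with the spaces replaced by '+'.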

Antariy

  • Member
  • ****
  • Posts: 551
Re: Getting html content of google search
« Reply #38 on: December 05, 2013, 08:31:00 PM »
Hi Marinus :t Do you get the large HTTP answer from Google when you use the UserAgent string that the code uses currently?

Siekmanski

  • Member
  • *****
  • Posts: 1927
Re: Getting html content of google search
« Reply #39 on: December 05, 2013, 08:44:19 PM »
Yes. :biggrin:

But it's not mentioned anywhere by Microsoft or on MSDN. I was playing with it because "Mozilla/5.0" returned 70 KB instead of 36 KB.
So I searched the net for user-agent examples, as I had used one before in the Winsock example.
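
A rough way to repeat that comparison, sketched in Python rather than MASM. The helper names make_request and response_size are hypothetical, the host is assumed to be https://www.google.com, and the sizes you get back will depend on what Google serves at the time, so no numbers are promised here.

```python
import urllib.request

GOOGLE = "https://www.google.com"  # assumed host for the search path


def make_request(path, user_agent):
    """Prepare a GET request for `path` that sends the given User-Agent."""
    return urllib.request.Request(GOOGLE + path,
                                  headers={"User-Agent": user_agent})


def response_size(path, user_agent, timeout=10):
    """Fetch the page and return the body length in bytes (needs network)."""
    with urllib.request.urlopen(make_request(path, user_agent),
                                timeout=timeout) as resp:
        return len(resp.read())


if __name__ == "__main__":
    path = "/search?q=panda&tbm=isch"
    # Compare a bare agent name with a browser-like prefix
    for agent in ("ASM example", "Mozilla/5.0"):
        print(agent, response_size(path, agent))
```

Running it with your own browser's full user-agent string as a third entry would show whether that changes the answer size as well.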

traphunter

  • Guest
Re: Getting html content of google search
« Reply #40 on: December 05, 2013, 09:22:47 PM »
maybe your own real user-agent works: http://www.viewmyuseragent.com/

Antariy

  • Member
  • ****
  • Posts: 551
Re: Getting html content of google search
« Reply #41 on: December 05, 2013, 09:43:10 PM »
Quote from: Siekmanski on December 05, 2013, 08:44:19 PM
Yes. :biggrin:

But it's not mentioned anywhere by Microsoft or on MSDN. I was playing with it because "Mozilla/5.0" returned 70 KB instead of 36 KB.
So I searched the net for user-agent examples, as I had used one before in the Winsock example.

I think Google's servers try to filter automated requests by checking the UserAgent, and if it doesn't look much like a real browser's string, they return a reduced answer. I also noticed that if there are too many or too frequent requests from one IP, Google blocks the request and serves a captcha to verify that it was made by a person. So the program should not send searches too frequently, since that does not look like a person searching. But you may use proxies as well; specify an external proxy here:

invoke InternetOpen,CTXT("ASM example"),INTERNET_OPEN_TYPE_PROXY,CTXT("proxyaddress:proxyport"),CTXT("<local>"),0

so your request will be routed through the external proxy, and Google will "see" the proxy's address :t If one IP address gets blocked after frequent searches, you can change the proxy and continue searching :biggrin:
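
The same two ideas (space the searches out, and optionally route them through a proxy) in a small Python sketch rather than MASM. The names Throttle and opener_via_proxy are hypothetical, the proxy address is a placeholder like "proxyaddress:proxyport" above, and the 5-second interval is only an assumed value, not a limit Google documents.

```python
import time
import urllib.request


def opener_via_proxy(proxy):
    """Return an opener that sends HTTP and HTTPS requests through `proxy`."""
    handler = urllib.request.ProxyHandler({"http": proxy, "https": proxy})
    return urllib.request.build_opener(handler)


class Throttle:
    """Enforce a minimum delay between consecutive calls to wait()."""

    def __init__(self, min_interval, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        self.clock = clock    # injectable so the class can be tested
        self.sleep = sleep
        self.last = None      # time of the previous request, if any

    def wait(self):
        now = self.clock()
        if self.last is not None:
            remaining = self.min_interval - (now - self.last)
            if remaining > 0:
                self.sleep(remaining)  # pause until the interval has passed
        self.last = self.clock()


if __name__ == "__main__":
    throttle = Throttle(5.0)  # assumed spacing between searches
    opener = opener_via_proxy("http://proxyaddress:8080")  # placeholder proxy
    for query in ("panda", "rammstein"):
        throttle.wait()
        print("would now request /search?q=" + query + "&tbm=isch")
```

Calling throttle.wait() before every search keeps the request rate down whether or not a proxy is used.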

Siekmanski

  • Member
  • *****
  • Posts: 1927
Re: Getting html content of google search
« Reply #42 on: December 05, 2013, 09:58:12 PM »
Thanks Antariy, I'll keep this "proxy address" trick in mind.  :t

Evan_

  • Guest
Re: Getting html content of google search
« Reply #43 on: December 19, 2013, 03:32:45 PM »
Weird stuff. Make a fake browser or something. Fork another; idk.
Run it so you don't even have to look at the web pages anymore.

Or just script it with your creepy requests.

dedndave

  • Member
  • *****
  • Posts: 8823
  • Still using Abacus 2.0
    • DednDave
Re: Getting html content of google search
« Reply #44 on: December 19, 2013, 06:14:32 PM »
yes.....
Marinus is very "creepy" - lol
(those creepy Nederlanders)
that's just how he rolls   8)

we've all learned something from Marinus, though   :t