Scrape web contents faster

2006 Oct 1

When scraping websites, I usually use the file_get_contents function. However, there are times when we need only a specific portion of a page, for instance the title or the description.

Instead of file_get_contents, we can use the built-in fopen and fgets functions: open the URL as a stream, read only the first part of the page into $buffer, and print it.
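The original snippet did not survive on this page, so here is a minimal sketch of the fopen and fgets approach, not the exact code from the post. The helper names read_until_title, extract_title and fetch_title are my own, the 8192-byte cap is an arbitrary assumption, and opening a URL with fopen assumes allow_url_fopen is enabled.

```php
<?php
// Hypothetical helper: read line by line from an open stream and stop as soon
// as the closing </title> tag has been seen or $maxBytes have been read.
function read_until_title($handle, int $maxBytes = 8192): string
{
    $buffer = '';
    while (!feof($handle) && strlen($buffer) < $maxBytes) {
        $line = fgets($handle, 1024);
        if ($line === false) {
            break; // read error or end of stream
        }
        $buffer .= $line;
        if (stripos($buffer, '</title>') !== false) {
            break; // we have everything we need, stop downloading
        }
    }
    return $buffer;
}

// Hypothetical helper: pull the <title> text out of a partial HTML document.
function extract_title(string $html): ?string
{
    if (preg_match('/<title[^>]*>(.*?)<\/title>/is', $html, $m)) {
        return trim(html_entity_decode($m[1]));
    }
    return null;
}

// Hypothetical wrapper: open the URL, read just the head, return the title.
function fetch_title(string $url): ?string
{
    $handle = @fopen($url, 'r');
    if ($handle === false) {
        return null; // could not open the URL
    }
    $head = read_until_title($handle);
    fclose($handle);
    return extract_title($head);
}
```

Calling fetch_title with a URL such as http://www.example.com/ (a placeholder) should return the page title without reading the rest of the body, because the stream is closed as soon as the title has been seen.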

However, using the cURL functions will usually be a lot faster, as long as the server honors range requests. We will use CURLOPT_RANGE to retrieve only a specific amount of data from a given URL. CURLOPT_RANGE defines the range(s) of data to retrieve in the format "X-Y", where X or Y is optional. HTTP transfers also support several intervals, separated by commas, in the format "X-Y,N-M". The retrieved data is stored in $content and printed.
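The cURL snippet was also lost from this page, so the following is only a sketch of how CURLOPT_RANGE can be used, not the post's original code. The helper names byte_range and fetch_range, the 1024-byte default and the 10-second timeout are assumptions of mine. Servers are free to ignore the Range header and send the whole page, so the sketch also truncates the result locally.

```php
<?php
// Hypothetical helper: build an "X-Y" range string for $length bytes starting
// at offset $start. Both ends of an HTTP byte range are inclusive.
function byte_range(int $start, int $length): string
{
    return $start . '-' . ($start + $length - 1);
}

// Hypothetical helper: ask the server for only the first $bytes bytes of $url.
// Requires the PHP cURL extension.
function fetch_range(string $url, int $bytes = 1024): ?string
{
    $ch = curl_init($url);
    if ($ch === false) {
        return null;
    }
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);            // return the body instead of printing it
    curl_setopt($ch, CURLOPT_RANGE, byte_range(0, $bytes));    // e.g. "0-1023"
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);            // follow redirects
    curl_setopt($ch, CURLOPT_TIMEOUT, 10);                     // assumed timeout, in seconds

    $content = curl_exec($ch);
    if ($content === false) {
        return null; // transfer failed
    }

    // A 206 status means the range was honored; a 200 means the server sent
    // the full body anyway, so cut it down to the requested size.
    return substr($content, 0, $bytes);
}
```

Calling fetch_range with a placeholder URL such as http://www.example.com/ and a size of 512 would return at most the first 512 bytes of the response, which is normally enough to cover the title and meta description of a page.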

1 Comment

This range thing doesn't work when we are using POST :s

It downloads the whole page.

bilal ghouri | March 1, 2010 12:50 AM
