Website Crawler (How to Copy any Website)


#1

Please tell me: is there any software or script that can download a website's complete folder with its original files?

For example, this is a site address:

www.mysite.com/script/

Now I want to get all the files in the script folder. Please tell me how to do this.

I think it's possible because I have seen many clones of famous scripts like YouTube. How do web hackers crawl all the files along with the database?

To be clear, I don't want the database or any other info of the site. Just tell me how to get all the files in any folder of a site, no matter whether they are .html or .php files.


#2

http://www.httrack.com/

[quote=", post:, topic:"]
HTTrack is a free (GPL, libre/free software) and easy-to-use offline browser utility.

It allows you to download a World Wide Web site from the Internet to a local directory, building recursively all directories, getting HTML, images, and other files from the server to your computer. HTTrack arranges the original site's relative link-structure. Simply open a page of the "mirrored" website in your browser, and you can browse the site from link to link, as if you were viewing it online. HTTrack can also update an existing mirrored site, and resume interrupted downloads. HTTrack is fully configurable, and has an integrated help system.

WinHTTrack is the Windows 9x/NT/2000/XP release of HTTrack, and WebHTTrack the Linux/Unix/BSD release. See the download page.

[/quote]
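
The recursive download described in that quote comes down to: fetch a page, pull out its links, keep the ones inside the target folder, and repeat. Here is a minimal Python sketch of the link-filtering step, using only the standard library. It is not how HTTrack itself is implemented; `LinkCollector`, `links_under` and the sample HTML are made up for illustration.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urldefrag


class LinkCollector(HTMLParser):
    """Collects the raw href/src values found in a page."""

    def __init__(self):
        super().__init__()
        self.found = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name in ("href", "src") and value:
                self.found.append(value)


def links_under(page_url, html, folder_url):
    """Return the absolute links in `html` that live under `folder_url`."""
    parser = LinkCollector()
    parser.feed(html)
    kept = set()
    for raw in parser.found:
        # Resolve relative links against the page and drop any #fragment.
        absolute, _fragment = urldefrag(urljoin(page_url, raw))
        if absolute.startswith(folder_url):
            kept.add(absolute)
    return sorted(kept)


# Hypothetical page content, only for demonstration.
sample = """
<a href="page2.html">next</a>
<img src="img/logo.png">
<a href="/other/about.html">about</a>
<a href="http://example.org/">elsewhere</a>
<a href="page2.html#top">again</a>
"""
result = links_under("http://www.mysite.com/script/index.html", sample,
                     "http://www.mysite.com/script/")
print(result)
```

A real crawler would download each returned URL, run the same step on every HTML page it gets back, and keep a set of already-visited URLs so it does not loop.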

#3

The last time I downloaded a site using HTTrack was in 1998, I think. It worked brilliantly, but if your connection dropped you had to start from scratch again. That was 1998, though; now it's 2008, so there must have been some improvements.


#4

Offline Explorer Enterprise or Pro.


#5

Thanks, everyone, especially admin for recommending cool software.

Asad, I know Offline Explorer and many other programs like it, but I want software that takes the original files so I can use them on another site. For example, if I like the theme of a WordPress-based site and want to use it for my blog, I would copy the whole theme and use it. It's that simple.

I hope you understand my need now.

KO (admin), could you please tell me which HTTrack version I have to download? There are many different versions for different needs, and I don't know which one to download.

Could you please give a short description of each version?


#6

[quote=", post:, topic:"]

Thanks, everyone, especially admin for recommending cool software.

Asad, I know Offline Explorer and many other programs like it, but I want software that takes the original files so I can use them on another site. For example, if I like the theme of a WordPress-based site and want to use it for my blog, I would copy the whole theme and use it. It's that simple.

I hope you understand my need now.

[/quote]

You mean you want to rip someone's WordPress theme :D Well, it is easy if the guy hasn't protected his wp-content directory, but if he has, it's not possible.

There are many free themes available on the net; use them.


#7

SaadIbrahim, what do you mean by a protected theme?

I protect my folders by just putting a simple .html file with a redirect in them. I know there is a way to protect folders through cPanel, but I don't know how. If you can tell me how to do it, I will be very thankful.

KO (admin), could you please tell me which HTTrack version I have to download? There are many different versions for different needs, and I don't know which one to download.

Could you please give a short description of each version?

Also, if a folder is protected, can we crawl it with HTTrack?


#8

A website crawler can only access publicly viewable folders.

If you're trying to 'copy' a theme, it won't be able to do that either, as PHP files (in the case of WordPress) are executed on the server, and only the resulting output is sent through.

With PHP, Perl and other server-side scripts so common on websites, it's no longer possible to fully 'copy' a website. All you'll get is the generated HTML, the CSS and other static files, which is a small part of a modern website.

As for which version to download, I can't explain it any more simply than what is already up on the HTTrack site. Just ask someone what version of Windows you are running and that is all the info you need.
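
To illustrate the point about PHP being executed on the server, here is a toy Python sketch. It is only an analogy: real WordPress themes are PHP, and `THEME_SOURCE` and `serve_page` are hypothetical names standing in for a theme file and the server's rendering step.

```python
from string import Template

# Stand-in for a theme file that lives on the server (hypothetical).
# A visitor or crawler never gets to read this source.
THEME_SOURCE = Template("<h1>$site_title</h1><p>$post_count posts</p>")


def serve_page(site_title, post_count):
    """What the server does: run the template, send back only the result."""
    return THEME_SOURCE.substitute(site_title=site_title, post_count=post_count)


# What a crawler receives: finished HTML with no template logic left in it.
response = serve_page("My Blog", 3)
print(response)
```

The response contains the filled-in values but none of the placeholders, which is why mirroring a site gives you its pages and not the theme files that produced them.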


#9

[quote=", post:, topic:"]

SaadIbrahim, what do you mean by a protected theme?

I protect my folders by just putting a simple .html file with a redirect in them. I know there is a way to protect folders through cPanel, but I don't know how. If you can tell me how to do it, I will be very thankful.

KO (admin), could you please tell me which HTTrack version I have to download? There are many different versions for different needs, and I don't know which one to download.

Could you please give a short description of each version?

Also, if a folder is protected, can we crawl it with HTTrack?

[/quote]

I said a protected wp-content folder, which can be done by putting an HTML file in the folder, creating an .htaccess file, or using this plugin for WordPress (I am using this method).
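
Why the HTML-file trick works: when a folder has an index file, a request for the folder returns that file instead of an automatic file listing, so a crawler has no list of files to follow. Below is a rough Python sketch of how you could tell the two cases apart. It assumes Apache-style listings, whose page title starts with "Index of /"; other servers may format theirs differently, and `looks_like_open_listing` and both sample pages are made up for illustration.

```python
import re


def looks_like_open_listing(html):
    """Heuristic: Apache-style auto-generated listings are titled 'Index of /...'."""
    match = re.search(r"<title>(.*?)</title>", html, re.IGNORECASE | re.DOTALL)
    return bool(match) and match.group(1).strip().startswith("Index of /")


# Hypothetical responses for a request to a folder URL.
open_folder = "<html><head><title>Index of /wp-content/themes</title></head></html>"
placeholder = "<html><head><title></title></head><body></body></html>"

print(looks_like_open_listing(open_folder))   # unprotected folder
print(looks_like_open_listing(placeholder))   # folder with a blank index file
```

Note that hiding the listing only stops a crawler from discovering file names; any file whose exact URL is already known or linked from a page can still be requested.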


#10

So it means I can't get a WP theme. That's OK.

Thanks, guys, for the help and suggestions.

Can anyone tell me which directives to use in .htaccess to protect a directory? It's not related to the main topic, but I hope you will help...


#11

This is the best application I've used so far (for Windows): http://download.httrack.com/cserv.php3?File=httrack.exe

It gets you everything.


#12

Yeah, HTTrack is the best. I have been using it since my UET days.


#13

Thanks, guys. Let me give HTTrack a try; I hope it works, as most of you recommended it.