82

I am trying to download a full website directory using CURL. The following command does not work:

curl -LO http://example.com/

It returns an error: curl: Remote file name has no length!.

But when I do this: curl -LO http://example.com/someFile.type it works. Any idea how to download all files in the specified directory? Thanks.

Foo

8 Answers

105

This always works for me; include --no-parent and -r (recursive) to get only the desired directory.

 wget --no-parent -r http://WEBSITE.com/DIRECTORY
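
If you also want to avoid the extra directory levels wget creates locally, something like the following should work; -nH drops the hostname directory and --cut-dirs=1 drops the leading DIRECTORY component, so the files land in the current directory (adjust the number to your path depth):

wget --no-parent -r -nH --cut-dirs=1 http://WEBSITE.com/DIRECTORY/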
StanleyZheng
36

HTTP doesn't really have a notion of directories. The slashes other than the first three (http://example.com/) do not have any special meaning except with respect to .. in relative URLs. So unless the server follows a particular format, there's no way to “download all files in the specified directory”.

If you want to download the whole site, your best bet is to traverse all the links in the main page recursively. Curl can't do it, but wget can. This will work if the website is not too dynamic (in particular, wget won't see links that are constructed by JavaScript code). Start with wget -r http://example.com/, and look under “Recursive Retrieval Options” and “Recursive Accept/Reject Options” in the wget manual for more relevant options (recursion depth, exclusion lists, etc.).

If the website tries to block automated downloads, you may need to change the user agent string (-U Mozilla), and to ignore robots.txt (create an empty file example.com/robots.txt and use the -nc option so that wget doesn't try to download it from the server).
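
Putting those options together, a sketch of the kind of command described above might look like this (example.com, the depth limit, and the user agent string are placeholders to adjust):

mkdir -p example.com && touch example.com/robots.txt   # empty local robots.txt, so -nc stops wget from fetching the real one
wget -r -l 3 -U Mozilla -nc http://example.com/        # -l 3 limits the recursion depth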

25

In this case, curl is NOT the best tool. You can use wget with the -r argument, like this:

wget -r http://example.com/ 

This is the most basic form, and you can use additional arguments as well. For more information, see the manpage (man wget).
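
For example, a few commonly combined options, all documented in that manpage, are shown below (example.com is a placeholder):

wget -r -np -k -p http://example.com/
# -np (--no-parent)       stay below the starting directory
# -k  (--convert-links)   rewrite links so the copy browses locally
# -p  (--page-requisites) also fetch the images/CSS needed to render pages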

moroccan
8

This isn't possible. There is no standard, generally implemented, way for a web server to return the contents of a directory to you. Most servers do generate an HTML index of a directory, if configured to do so, but this output isn't standard, nor guaranteed by any means. You could parse this HTML, but keep in mind that the format will change from server to server, and won't always be enabled.
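
If you do decide to parse such an index, a rough sketch along the following lines can work against a simple Apache-style listing; the URL is a placeholder, and the grep/sed patterns will almost certainly need adjusting for other servers:

# extract hrefs from the index page, skip sort links and subdirectories, then fetch each file
curl -s http://example.com/dir/ \
  | grep -o 'href="[^"]*"' \
  | sed 's/^href="//;s/"$//' \
  | grep -v -e '^?' -e '/$' \
  | while read -r f; do curl -LO "http://example.com/dir/$f"; done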

Brad
5

lftp -c mirror <url>

Obviously, you need to install lftp first.
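
For instance, assuming a hypothetical http://example.com/DIRECTORY/ listing, something like this should mirror it into a local DIRECTORY folder (--parallel is optional and just fetches several files at once):

lftp -c "open http://example.com/; mirror --parallel=4 DIRECTORY ./DIRECTORY"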

HappyFace
4

When you're downloading from a directory listing, add one more argument to wget, --reject, so that the auto-generated index pages are skipped.

wget --no-parent -r --reject "index.html*" "http://url"
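
Conversely, if you only want particular file types, wget's -A/--accept option takes a comma-separated list of patterns, for example (the URL and patterns are placeholders):

wget --no-parent -r --reject "index.html*" -A "*.pdf,*.zip" "http://url"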
LAamanni
3

You can use the Firefox extension DownThemAll! It will let you download all the files in a directory in one click. It is also customizable and you can specify what file types to download. This is the easiest way I have found.

Asdf
1

You might find a website ripper useful here; it will download everything and modify the contents/internal links for local use. A good one can be found here: http://www.httrack.com
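
HTTrack also has a command-line interface; a minimal invocation looks roughly like this (the URL and output directory are placeholders):

httrack "http://example.com/DIRECTORY/" -O ./mirror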