How to Use the IA Command Line Tool

May 17, 2026

The ia command line tool is a Python-based utility built specifically for interacting with archive.org. It can download items, search the catalog, read metadata, upload files, and manage collections — all from your terminal. For anyone doing more than occasional one-off downloads, it is the most efficient way to work with the Internet Archive.

Installation

The recommended method is pip: pip install internetarchive. Run this inside a virtual environment if you want to keep your system Python clean (python -m venv venv && source venv/bin/activate first). You can also use pipx install internetarchive for an isolated install. A standalone binary is available at archive.org/download/ia-pex/ia if you prefer not to install Python packages at all: download it, make it executable with chmod +x ia, and run it as ./ia (or move it to a directory on your PATH so that plain ia works).

Important: Do not install the ia tool via Homebrew, MacPorts, apt, yum, or other system package managers. These versions are often severely outdated, incompatible, or broken. Always install from pip or the official binary.

Configuration

Run ia configure to set up your account credentials. It prompts for your archive.org email and password, then saves the resulting access keys to a local configuration file so you do not have to log in again. Authentication is required for uploading, modifying metadata, and accessing some restricted content. For read-only operations like searching and downloading public items, configuration is optional but still recommended.

Downloading items

The core download command is: ia download [identifier] — this downloads every file from an item into a local folder. Replace [identifier] with the item's identifier (the last segment of its archive.org URL). To download only specific formats: ia download [identifier] --format='512Kb MPEG4'. To match filenames with patterns: ia download [identifier] --glob='*.mp4'. To exclude certain files: ia download [identifier] --glob='*.mp4' --exclude='*512kb*'. Multiple patterns can be separated with pipes: --glob='*.mp4|*.xml'. Note that --format cannot be combined with --glob in the same command.
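As a rough sketch, the filtering options above can be wrapped in a small shell function. The name download_matching and the identifier in the usage comment are placeholders invented for illustration and are not part of the ia tool; only the ia download flags come from the text above.

```shell
# download_matching is a hypothetical helper, not part of the ia tool.
# Usage: download_matching IDENTIFIER 'GLOB' ['EXCLUDE_GLOB']
download_matching() {
  local identifier="$1"
  local include="$2"
  local exclude="${3:-}"

  if [ -n "$exclude" ]; then
    # Keep files matching the glob, minus anything matching the exclude pattern.
    ia download "$identifier" --glob="$include" --exclude="$exclude"
  else
    ia download "$identifier" --glob="$include"
  fi
}

# Example with a placeholder identifier. The single quotes stop the shell
# from expanding the asterisks before ia sees them.
# download_matching some-item-identifier '*.mp4|*.xml' '*512kb*'
```

Quoting the variables inside the function matters as much as quoting the patterns at the call site: an unquoted $include would be expanded against files in the current directory.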

To download an entire collection: ia download --search 'collection:prelinger'. To download from a list of identifiers: ia download --itemlist itemlist.txt. Add --checksum to enable verification and resumability: if a download fails partway through, re-running with --checksum skips files you already have. To record what was downloaded, turn on logging with the global --log option, which goes before the subcommand: ia --log download [identifier]. For on-the-fly generated formats (like MOBI or EPUB for books): ia download [identifier] --on-the-fly.
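Because re-running with --checksum skips files that are already complete, a flaky bulk download can simply be retried a few times. The sketch below assumes that ia exits with a non-zero status when a download fails. The function name download_list_with_retry and the default of three attempts are choices made for this example, not anything the ia tool defines.

```shell
# download_list_with_retry is a hypothetical helper, not part of the ia tool.
# Assumption: ia returns a non-zero exit status when any download fails.
# Usage: download_list_with_retry ITEMLIST_FILE [MAX_ATTEMPTS]
download_list_with_retry() {
  local itemlist="$1"
  local max_attempts="${2:-3}"
  local attempt=1

  while [ "$attempt" -le "$max_attempts" ]; do
    # --checksum skips files that already exist locally with a matching checksum,
    # so each retry only fetches what is still missing.
    if ia download --itemlist "$itemlist" --checksum; then
      return 0
    fi
    echo "attempt $attempt of $max_attempts failed" >&2
    attempt=$((attempt + 1))
  done

  return 1
}

# Example: download_list_with_retry itemlist.txt 5
```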

Searching the catalog

To search: ia search 'subject:"market street" collection:prelinger'. Results return as JSON, one item per line. To get just the identifiers (one per line): ia search 'collection:prelinger' --itemlist. This is especially useful for piping into downloads: ia search 'collection:prelinger' --itemlist | parallel 'ia download {}'. To limit results: ia search --parameters='page=1&rows=20' 'dogs'.
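Instead of piping search results straight into a download, it can be handy to save the identifiers to a file first, so you can review or trim the list before committing to a large download. A minimal sketch follows; save_itemlist is a made-up name for this example, and the output filename is whatever you choose.

```shell
# save_itemlist is a hypothetical helper, not part of the ia tool.
# Writes the identifiers matching a query to a file, one per line,
# then prints how many were found.
# Usage: save_itemlist 'QUERY' OUTPUT_FILE
save_itemlist() {
  local query="$1"
  local outfile="$2"

  ia search "$query" --itemlist > "$outfile" && wc -l < "$outfile"
}

# Example:
# save_itemlist 'collection:prelinger' itemlist.txt
# ia download --itemlist itemlist.txt --checksum
```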

Reading metadata

To see all metadata for an item: ia metadata [identifier] — this returns JSON. To see available formats: ia metadata --formats [identifier]. To list all files: ia list [identifier], or ia list -l [identifier] for full URLs. You can pipe metadata into jq for structured queries: ia metadata [identifier] | jq '.metadata.date'.
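The file listing is plain text, one filename per line, so ordinary tools like grep work on it. As an illustration, the sketch below filters the output of ia list by file extension before you decide what to download. The function name list_files_by_ext is invented for this example.

```shell
# list_files_by_ext is a hypothetical helper, not part of the ia tool.
# Prints only the files in an item whose names end in the given extension
# (case-insensitive).
# Usage: list_files_by_ext IDENTIFIER EXTENSION
list_files_by_ext() {
  local identifier="$1"
  local ext="$2"

  # The pattern anchors the extension to the end of the line, e.g. \.mp4$
  ia list "$identifier" | grep -i "\.${ext}\$"
}

# Example with a placeholder identifier:
# list_files_by_ext some-item-identifier mp4
```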

Bulk operations

For downloading large collections or many items, combine ia search with GNU Parallel: ia search 'collection:glasgowschoolofart' --itemlist | parallel -j4 'ia download {}'. The -j4 flag runs four downloads concurrently. Adjust the parallelism based on your connection and how polite you want to be to the Archive's servers — four to eight simultaneous downloads is a reasonable range. For a guide specifically about downloading whole collections, see How to Download a Whole Collection from Internet Archive.
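If GNU Parallel is not installed, or one download at a time is polite enough for your purposes, a plain shell loop is a workable fallback. The sketch below is sequential rather than parallel, and it records any identifier whose download fails so you can retry just those. It assumes ia exits non-zero on failure; download_each and both filenames are placeholders for this example.

```shell
# download_each is a hypothetical helper, not part of the ia tool.
# Assumption: ia returns a non-zero exit status when a download fails.
# Usage: download_each ITEMLIST_FILE FAILED_FILE
download_each() {
  local itemlist="$1"
  local failed="$2"
  local identifier

  : > "$failed"   # start with an empty failure list

  # The "|| [ -n ... ]" part handles a last line with no trailing newline.
  while IFS= read -r identifier || [ -n "$identifier" ]; do
    [ -n "$identifier" ] || continue   # skip blank lines
    # Redirect stdin so the download cannot consume the rest of the item list.
    ia download "$identifier" --checksum < /dev/null ||
      echo "$identifier" >> "$failed"
  done < "$itemlist"

  # Succeed only if nothing was written to the failure list.
  [ ! -s "$failed" ]
}

# Example:
# download_each itemlist.txt failed.txt || download_each failed.txt still-failed.txt
```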

Limitations to know about

The ia tool does not have built-in resume support for partially downloaded files. The --checksum flag helps by skipping completed files, but if a single large file fails mid-download, you will need to delete and retry it. There is no native parallel download for files within a single item — you can parallelize across items but not within one. For very large single files that tend to fail, a browser with auto-resume (like Chrome) may actually be more reliable, if less elegant. Shell-dependent glob escaping can also trip you up — if your patterns are not matching, try quoting them differently or escaping asterisks.

Arkibber complements the ia tool by providing a visual discovery layer. Use Arkibber to search, filter, and identify the items or collections you want, then switch to ia for the actual bulk downloading. The combination of visual triage and command-line power covers the full workflow efficiently.

For item-level downloads using the browser instead, see How to Download All Files from an Internet Archive Item.
