The Entire Text of Wikipedia Is Only About 22 Gigabytes

Discovery

Edited by Alex Surfaced·Technology·2 min read

The complete text of the English Wikipedia occupies approximately 22 gigabytes (GB) when compressed. That is small enough to fit on a standard USB flash drive or a modern smartphone, yet it covers more than 6.8 million articles and billions of words. The full database, including revision histories, images, and metadata, is significantly larger (over 200 GB for the English version), but the core text is surprisingly compact, which shows how efficiently plain text can be stored and compressed.
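To put those figures in perspective, here is a rough back-of-the-envelope sketch in Python. It uses only the rounded numbers quoted above. The 64 GB drive size and the decimal definition of a gigabyte are assumptions made for illustration, not figures from the source.

```python
# Back-of-the-envelope arithmetic using the article's rounded figures.
COMPRESSED_TEXT_GB = 22      # compressed text size quoted in the article
ARTICLE_COUNT = 6_800_000    # "over 6.8 million articles"
FLASH_DRIVE_GB = 64          # hypothetical drive size, not from the article

BYTES_PER_GB = 10**9         # assumes decimal gigabytes


def average_bytes_per_article(total_gb: float, articles: int) -> float:
    """Return the mean compressed size of one article in bytes."""
    return total_gb * BYTES_PER_GB / articles


def share_of_drive(total_gb: float, drive_gb: float) -> float:
    """Return the fraction of a drive the compressed text would occupy."""
    return total_gb / drive_gb


avg = average_bytes_per_article(COMPRESSED_TEXT_GB, ARTICLE_COUNT)
share = share_of_drive(COMPRESSED_TEXT_GB, FLASH_DRIVE_GB)

print(f"Average compressed size per article: about {avg / 1000:.1f} kB")
print(f"Share of a {FLASH_DRIVE_GB} GB drive: {share:.0%}")
```

Under these assumptions, the average article compresses to only a few kilobytes, and the whole collection fills roughly a third of the hypothetical drive.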


Source: dumps.wikimedia.org


Surfaced take

Why It’s Fascinating

The figure surprises many people, given how vast humanity's largest collaborative knowledge repository seems. It overturns the common perception that 'all human knowledge' must live in massive, inaccessible data centers, and shows that the text itself is highly portable. Over the next 5 to 10 years, this compactness could matter for delivering knowledge to regions with limited internet access through offline caches, or for integrating large text corpora into compact AI models that run locally. For a non-expert, it is like realizing that an entire library of books, stripped down to just the words, could fit in your pocket while your holiday photo album takes up far more space. Educators, researchers, developers of offline learning tools, and anyone interested in data efficiency stand to benefit most. It also raises a thought-provoking question: if the world's collective textual knowledge is so compact, what does that imply about the true 'size' of information compared with the ever-growing volume of multimedia data we generate daily?
