S3 Storage: What It Is and Why It Works

◷ 11 min read 6/6/2026

If you’re developing an app, setting up backups, or working with media files, sooner or later you’ll come across S3. This article explains what it is, how it works inside, how it differs from a regular disk, and when it is worth using.


What is S3 in simple words

S3 (Simple Storage Service) is cloud object storage. AWS launched Amazon S3 in 2006, and it became so popular that its API turned into a de facto industry standard. Today S3 means not only the Amazon service but also a whole class of compatible solutions from dozens of providers.

Simple analogy. Imagine an endless warehouse with a unique number written on each box. You put a box (file), get a number (key), and then you can get it from anywhere in the world. No folders, no hierarchy, just boxes and numbers.

This is how object storage works:

  • every file is an object
  • the object is stored in a bucket, a named container
  • each object has a key - a unique name inside the bucket
  • access via HTTP API

How S3 works inside

Object

An object in S3 is not just a file. It consists of three parts:

  1. Data - the file itself; Amazon S3 has long allowed up to 5 terabytes per object, and exact limits vary by provider
  2. Key - the unique name of the object inside the bucket. A string like uploads/2026/06/photo.jpg is just a text key, not a real folder
  3. Metadata - additional information: content type, creation date, tags, and arbitrary fields set by the developer

Bucket

A bucket is a container for objects. What's important to know:

  • a bucket has a globally unique name (among all users of the provider)
  • access rights, policies, versioning, and logging are configured at the bucket level
  • a bucket is tied to a region (a geographic data center location)
  • one bucket can hold an unlimited number of objects

Key and pseudofolders

There are no folders inside S3, only a flat keyspace. By convention, developers use the / character in keys, and most tools display it as folders. The string team/photos/avatar.png is simply the name of an object that looks like a file system path.
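To make the pseudofolder idea concrete, here is a minimal sketch of how a tool could derive a folder view from a flat list of keys. The function name list_level and the sample keys are illustrative assumptions. S3 listing APIs offer a similar prefix and delimiter mechanism, but this is not their implementation.

```python
def list_level(keys, prefix="", delimiter="/"):
    """Split flat keys under `prefix` into direct objects and pseudofolders."""
    objects, folders = [], set()
    for key in keys:
        if not key.startswith(prefix):
            continue
        rest = key[len(prefix):]
        if delimiter in rest:
            # Everything up to the first delimiter is shown as a subfolder.
            folders.add(prefix + rest.split(delimiter, 1)[0] + delimiter)
        else:
            objects.append(key)
    return objects, sorted(folders)


# Hypothetical keys stored in one bucket.
keys = [
    "readme.txt",
    "team/photos/avatar.png",
    "team/photos/banner.png",
    "team/report.pdf",
]

root_objects, root_folders = list_level(keys)
team_objects, team_folders = list_level(keys, prefix="team/")
```

At the root level this yields one object and one "folder" (team/). Nothing called team/ is actually stored; the folder exists only because some keys share that prefix.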

Access via API

The main way to work with S3 is the REST HTTP API. Operations are simple:

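| Method | Action |
|--------|--------|
| PUT | Upload an object |
| GET | Download an object |
| DELETE | Delete an object |
| HEAD | Get an object's metadata |
| LIST | List the objects in a bucket (technically a GET request on the bucket itself) |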

In addition to the direct API, there are ready-made tools: AWS CLI, rclone, SDK for Python (boto3), Go, Java, JavaScript and other languages.
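As a rough illustration of how these operations map onto HTTP, the sketch below builds the method and path for each one. It assumes path-style addressing (the bucket name in the URL path) and leaves out request signing and headers, which a real request needs. The names METHODS and request_line are hypothetical.

```python
from urllib.parse import quote

# Operation name -> HTTP method. LIST is not an HTTP verb of its own:
# it is a GET on the bucket rather than on an object.
METHODS = {"put": "PUT", "get": "GET", "delete": "DELETE", "head": "HEAD", "list": "GET"}


def request_line(operation, bucket, key=None):
    """Return (method, path) for a path-style request; signing is omitted."""
    method = METHODS[operation]
    if operation == "list":
        # list-type=2 selects the current version of the listing API.
        return method, f"/{bucket}?list-type=2"
    if key is None:
        raise ValueError("object operations need a key")
    # Slashes stay as-is: they are ordinary characters of the key.
    return method, f"/{bucket}/{quote(key, safe='/')}"


upload = request_line("put", "my-bucket", "uploads/photo.jpg")
listing = request_line("list", "my-bucket")
```

In practice the SDKs and CLI tools build and sign these requests for you, so you rarely write them by hand.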


S3 vs file storage vs block storage

This is a key question that often confuses beginners. There are three types of storage, and each is suitable for its own tasks.

File Storage (NAS, NFS, SMB)

A conventional file system: folders, files, permissions. It works like a network drive. Good for: document collaboration, corporate file servers.

Limitations: scales poorly horizontally; performance drops when the number of files gets large.

Block storage (SSD, HDD, SAN)

A disk attached to a server. The operating system sees it as a physical disk. Good for: databases, virtual machines, anything that requires low latency and frequent small reads and writes.

Limitations: tied to one server, difficult to scale, not intended for public access over the Internet.

S3 object storage

Flat structure, HTTP access, horizontal scaling. Good for: static files, media, backups, logs, content distribution.

Limitations: not suitable for frequent small changes to a single file. An object is either created whole or replaced whole; partial editing is not supported.
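The cost of whole-object replacement is easiest to see with a toy model. The sketch below uses an in-memory class, FakeBucket (a hypothetical stand-in, not a real client), that supports only whole-object get and put, and shows what "appending" a line to a log then involves.

```python
class FakeBucket:
    """In-memory stand-in for a bucket: whole-object get and put only."""

    def __init__(self):
        self._objects = {}
        self.bytes_uploaded = 0

    def put(self, key, data: bytes):
        self._objects[key] = data          # replaces the object entirely
        self.bytes_uploaded += len(data)

    def get(self, key) -> bytes:
        return self._objects[key]


def append_line(bucket, key, line: bytes):
    """'Append' by downloading, modifying locally and re-uploading everything."""
    body = bucket.get(key)
    bucket.put(key, body + line)


bucket = FakeBucket()
bucket.put("logs/app.log", b"start\n")         # initial 6-byte object
append_line(bucket, "logs/app.log", b"ok\n")   # 3 new bytes, 9 bytes re-uploaded
```

With a multi-gigabyte object the same three-byte change would re-upload the whole file, which is why frequently changing data belongs on block or file storage.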

Comparative table

| Criterion | Block | File | S3 (object) |
|-----------|-------|------|-------------|
| Scaling | Manual | Limited | Automatic, practically unlimited |
| Access | Local to the server | Network (NFS/SMB) | HTTP from anywhere |
| Frequent changes to a file | Excellent | Good | Poor |
| Large static files | Adequate | Adequate | Excellent |
| Cost | High | Medium | Low |
| Metadata | Basic | Basic | Extended, custom |
| Public access | Hard to set up | Hard | Easy |

Key characteristics of S3 storage

Durability: what "11 nines" means

Amazon S3 is designed for 99.999999999% durability (11 nines). On average, that corresponds to losing one object per year out of 100 billion stored, or, in Amazon's own example, one object every 10,000 years if you store 10 million. This is achieved through:

  • Replication – each object is stored in multiple copies (usually 3+) on different physical machines and data centers
  • Erasure coding - distributed storage algorithm that allows you to recover data even if you lose several nodes
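The arithmetic behind the nines can be checked in a few lines. Eleven nines of durability correspond to an annual loss probability of 10^-11 per object; the sketch below turns that into an expected number of lost objects. Decimal is used only to avoid floating-point noise, and the function name expected_losses is illustrative.

```python
from decimal import Decimal

durability = Decimal("0.99999999999")   # eleven nines, per object per year
loss_probability = 1 - durability       # 1E-11


def expected_losses(object_count, years=1):
    """Average number of objects lost over the given period."""
    return Decimal(object_count) * loss_probability * years


per_year_100_billion = expected_losses(100_000_000_000)
per_10k_years_10_million = expected_losses(10_000_000, years=10_000)
```

Both values come out to exactly one object. These are two ways of stating the same design target, and they are statistical averages rather than a guarantee for any particular object.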

Availability vs durability

It is important not to confuse these two parameters:

  • Durability: the probability that data will not be lost
  • Availability: the probability that the data can be read right now

Amazon S3 Standard is designed for 99.99% availability, which is about 52 minutes of downtime per year. Cheaper storage classes have lower availability, while durability remains high.
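The downtime figure follows directly from the percentage. Here is a small sketch of the conversion; the function name is illustrative, and a 365-day year is assumed.

```python
MINUTES_PER_YEAR = 365 * 24 * 60   # 525,600 minutes in a non-leap year


def downtime_minutes_per_year(availability_percent):
    """Maximum yearly downtime implied by an availability percentage."""
    return MINUTES_PER_YEAR * (100 - availability_percent) / 100


four_nines = downtime_minutes_per_year(99.99)
three_nines = downtime_minutes_per_year(99.9)
```

Rounded, 99.99% allows about 52.6 minutes a year, while 99.9% allows roughly ten times more, close to nine hours. That is the practical difference between storage classes with different availability targets.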

Versioning

S3 supports object versioning. When it is enabled, every overwrite keeps the previous version. This protects against accidental deletion or overwriting: you can roll back to any version.
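The behaviour can be pictured with a toy in-memory model. VersionedBucket below is a hypothetical class, not how any provider implements versioning: real version IDs are opaque strings, and deletions are handled with special markers that this sketch omits.

```python
from collections import defaultdict
from itertools import count


class VersionedBucket:
    """Toy model: every put appends a new version instead of overwriting."""

    def __init__(self):
        self._versions = defaultdict(list)   # key -> [(version_id, data), ...]
        self._ids = count(1)

    def put(self, key, data):
        version_id = f"v{next(self._ids)}"
        self._versions[key].append((version_id, data))
        return version_id

    def get(self, key, version_id=None):
        versions = self._versions[key]
        if version_id is None:
            return versions[-1][1]           # latest version by default
        return dict(versions)[version_id]


bucket = VersionedBucket()
first = bucket.put("config.json", b'{"debug": false}')
bucket.put("config.json", b'{"debug": true}')   # accidental overwrite

latest = bucket.get("config.json")
restored = bucket.get("config.json", version_id=first)
```

A plain read returns the newest data, but the original content is still reachable through its version ID, which is what makes a rollback possible.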

Lifecycle policy

You can configure objects to be moved between storage classes or deleted automatically:

  • after 30 days, move to cold storage (cheaper)
  • after 180 days, move to archive
  • after a year, delete
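A schedule like this could be expressed as a lifecycle configuration, sketched here in the shape that boto3's put_bucket_lifecycle_configuration accepts. Several things are assumptions: the one-year step is taken to mean deletion, the backups/ prefix and the rule ID are made up, and the storage class names STANDARD_IA and GLACIER are the Amazon ones, so other providers may use different names.

```python
def build_lifecycle_rules(prefix="backups/"):
    """Lifecycle configuration: tier down at 30 and 180 days, delete at 365."""
    return {
        "Rules": [
            {
                "ID": "tier-then-expire",            # hypothetical rule name
                "Status": "Enabled",
                "Filter": {"Prefix": prefix},        # applies only to these keys
                "Transitions": [
                    {"Days": 30, "StorageClass": "STANDARD_IA"},
                    {"Days": 180, "StorageClass": "GLACIER"},
                ],
                "Expiration": {"Days": 365},
            }
        ]
    }


def apply_lifecycle(s3_client, bucket_name, prefix="backups/"):
    # s3_client is assumed to be a boto3 S3 client created elsewhere.
    s3_client.put_bucket_lifecycle_configuration(
        Bucket=bucket_name,
        LifecycleConfiguration=build_lifecycle_rules(prefix),
    )


rules = build_lifecycle_rules()
```

Keeping the rule builder separate from the API call makes the configuration easy to inspect or review before it is applied to a real bucket.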

Storage classes

Most S3 providers offer several storage classes with different balances of price, latency, and availability:

| Class | Use case | Access latency | Cost |
|-------|----------|----------------|------|
| Standard | Active data, frequent access | Milliseconds | High |
| Infrequent Access | Data that is rarely read | Milliseconds | Medium |
| Glacier / Archive | Long-term archives, backups | Minutes to hours | Low |

Access management

Public and private access

By default, all objects are private. Access to an object can be opened up in several ways:

  • ACL (Access Control List): rights set at the object or bucket level
  • bucket policy: JSON rules that define who can do what
  • presigned URL: a temporary signed link valid for a limited time (e.g. 1 hour)
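As an illustration of the bucket policy option, the sketch below generates a policy that lets anyone download objects under one prefix. It follows the Amazon policy format; the bucket name, the public/ prefix, and the function name are assumptions, and other providers may support only a subset of this syntax.

```python
import json


def public_read_policy(bucket_name, prefix="public/"):
    """Bucket policy that lets anyone download objects under one prefix."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "PublicReadForPrefix",
                "Effect": "Allow",
                "Principal": "*",                   # anyone, no credentials
                "Action": "s3:GetObject",           # download only
                "Resource": f"arn:aws:s3:::{bucket_name}/{prefix}*",
            }
        ],
    }


policy_json = json.dumps(public_read_policy("my-bucket"), indent=2)
```

Limiting the resource to a prefix keeps the rest of the bucket private, which matches the idea of making public only what needs to be public.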

IAM and roles

In enterprise scenarios, access is managed through IAM (Identity and Access Management): users and services are assigned roles with fine-grained permissions, for example "read-only access to a particular bucket".
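A read-only permission set for one bucket might look like the following sketch, written in the Amazon IAM policy format. The function name is illustrative, and the exact set of actions a given service needs is an assumption to adjust for your case.

```python
def read_only_policy(bucket_name):
    """Identity policy: list one bucket and download its objects, nothing else."""
    bucket_arn = f"arn:aws:s3:::{bucket_name}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {   # Listing is a bucket-level action, so it targets the bucket ARN.
                "Effect": "Allow",
                "Action": ["s3:ListBucket"],
                "Resource": [bucket_arn],
            },
            {   # Downloading is an object-level action, so it targets the keys.
                "Effect": "Allow",
                "Action": ["s3:GetObject"],
                "Resource": [f"{bucket_arn}/*"],
            },
        ],
    }


policy = read_only_policy("my-bucket")
```

The split into two statements reflects a common stumbling block: bucket-level and object-level actions apply to different resource ARNs.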


Typical use cases

1. Static files and media for websites and applications

Images, video, audio, PDFs: all of this is stored in S3 and delivered through a CDN. The user gets the file quickly, and the application server carries no extra load.

2. Database and server backups

S3 is a standard target for automatic backups. You set up a schedule, and old backups are automatically moved to an archive storage class and then deleted by policy.

3. Storage of logs and analytical data

Huge volumes of logs are cheap to store in S3. Systems like Amazon Athena or ClickHouse can read data directly from S3 without importing it first.

4. File distribution

Application distributions, updates, ISO images. S3 is great for large files that are downloaded by thousands of users at the same time.

5. Data lake for ML/analytics

Raw data for model training, datasets, experimental artifacts – all this is convenient to store in S3: cheap, scalable, accessible from any tool.

6. Static website hosting

A simple HTML/CSS/JS site can be hosted directly in an S3 bucket, which serves the files over HTTP. No server or separate hosting is needed.

7. Storage of data between microservices

In microservices architectures, S3 is often used as an intermediate storage: one service puts a file, the other takes it, without a direct connection between the two.


Major S3 providers

Amazon S3

The original and the reference point for everyone else. Deep integration with the AWS ecosystem. The widest range of features, but also the most complex pricing: you pay separately for storage, requests, and traffic.

Cloudflare R2

Launched as a direct competitor to Amazon S3. The main difference is no fees for outgoing traffic (free egress). For sites with large audiences, this matters a lot. S3-compatible API, simple pricing.

Backblaze B2

One of the cheapest options. Partial S3 compatibility. Popular for backups and archives.

MinIO

Not a cloud service but an open source solution for self-hosted deployment. Compatible with the S3 API. You can run it on your own server and work with it just as you would with Amazon S3.

Yandex Object Storage, VK Cloud, Selectel

Russian S3-compatible providers. Data storage in Russia, which is important for compliance with 152-FZ. Integrated with other Russian cloud services.


How to Get Started with S3: A Practical Minimum

Step 1. Choose a provider and create a bucket

Register with any S3 provider. Create a bucket: specify its name and region.

Step 2. Get access keys

S3 uses a pair of keys: Access Key ID and Secret Access Key. This is not a login / password, but special credentials for the API.

Step 3. Upload a file via AWS CLI

bash
# Install AWS CLI
pip install awscli

# Configure credentials
aws configure
# Enter: Access Key ID, Secret Access Key, region, output format

# Upload the file
aws s3 cp ./myfile.jpg s3://my-bucket-name/uploads/myfile.jpg

# Download the file
aws s3 cp s3://my-bucket-name/uploads/myfile.jpg ./myfile.jpg

# View the contents of the bucket
aws s3 ls s3://my-bucket-name/

For third-party providers, add the --endpoint-url option:

bash
aws s3 cp ./myfile.jpg s3://my-bucket/ --endpoint-url https://s3.provider.ru

Step 4. Use in code (Python, Boto3)

python
import boto3

s3 = boto3.client(
    's3',
    endpoint_url='https://s3.provider.ru',  # for third-party providers
    aws_access_key_id='YOUR_KEY',           # placeholder, do not hardcode real keys
    aws_secret_access_key='YOUR_SECRET'     # placeholder, do not hardcode real keys
)

# Upload a file
s3.upload_file('local_file.jpg', 'my-bucket', 'uploads/photo.jpg')

# Download a file
s3.download_file('my-bucket', 'uploads/photo.jpg', 'local_copy.jpg')

# Get a temporary link (presigned URL, 1 hour)
url = s3.generate_presigned_url(
    'get_object',
    Params={'Bucket': 'my-bucket', 'Key': 'uploads/photo.jpg'},
    ExpiresIn=3600
)

Step 5. Set up rclone for synchronization

rclone is a universal tool for working with any S3-compatible storage:

bash
# Set up a remote
rclone config

# Synchronize a local folder with a bucket
rclone sync ./local-folder remote:my-bucket/backups/

# Copy a folder
rclone copy ./photos remote:my-bucket/photos/

Tips and common mistakes

Do not store private keys in code. Use environment variables, .env files, or secret managers.
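One way to follow this advice is to read the credentials from environment variables at startup. The sketch below assumes the standard AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY names, which boto3 also picks up on its own; S3_ENDPOINT_URL is a hypothetical variable name for third-party providers, and s3_client_kwargs is an illustrative helper.

```python
import os


def s3_client_kwargs(env=None):
    """Collect boto3 client settings from environment variables."""
    env = os.environ if env is None else env
    kwargs = {
        # A KeyError here means the variable is missing, which is better
        # than silently falling back to a key baked into the code.
        "aws_access_key_id": env["AWS_ACCESS_KEY_ID"],
        "aws_secret_access_key": env["AWS_SECRET_ACCESS_KEY"],
    }
    # S3_ENDPOINT_URL is a hypothetical name, used only for non-Amazon providers.
    endpoint = env.get("S3_ENDPOINT_URL")
    if endpoint:
        kwargs["endpoint_url"] = endpoint
    return kwargs


# Usage, assuming boto3 is installed and the variables are set:
#   import boto3
#   s3 = boto3.client("s3", **s3_client_kwargs())
```

The same code then works unchanged on a laptop, in CI, and in production, with only the environment differing.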

Check bucket access rights. Everything is private by default. Make public only what really needs to be public.

Name objects meaningfully. A key like uploads/users/123/avatar.jpg is better than abc123xyz.jpg: it is easier to organize and debug.
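A small helper can keep key names consistent across a codebase. This is only a sketch: the uploads/users/ layout comes from the example above, and the function name user_upload_key is hypothetical.

```python
from pathlib import PurePosixPath


def user_upload_key(user_id, filename):
    """Build a key such as uploads/users/123/avatar.jpg."""
    # Keep only the final path component so a client-supplied name
    # cannot add extra levels to the key.
    safe_name = PurePosixPath(filename).name
    if safe_name in ("", ".."):
        raise ValueError("filename has no usable name component")
    return f"uploads/users/{int(user_id)}/{safe_name}"


avatar_key = user_upload_key(123, "avatar.jpg")
```

Routing every upload through one function also makes it easy to change the layout later, or to list everything for a user by prefix.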

Enable versioning for critical data. It is insurance against accidental deletion.

Configure lifecycle policies from the start. Without them, the bucket will grow endlessly and you will overpay for storage.

Use a CDN in front of S3 to distribute public files – this will reduce latency for users and the cost of outbound traffic.


Summary

S3 is not just a cloud drive. It is the industry standard for storing unstructured data: scalable, accessible via HTTP from anywhere in the world, with flexible access management and a rich ecosystem of tools.

Whether you’re developing a web application, configuring backups, or building a data pipeline, S3 will usually solve the problem more reliably and more cheaply than a disk on a regular server.

You can start for free: most providers offer a free tier for the first gigabytes. Choose a provider, create a bucket, try uploading your first file, and you’ll understand why S3 has become the industry standard.
