S3 Storage: What It Is and Why It Works
If you develop applications, set up backups, or work with media files, sooner or later you will come across S3. This article explains what it is, how it works on the inside, how it differs from a regular disk, and when it is worth using.
What is S3 in simple words
S3 (Simple Storage Service) is a cloud-based object storage service. AWS launched Amazon S3 in 2006, and it became so popular that its API turned into a de facto industry standard. Today, S3 refers not only to the Amazon service, but also to a whole class of compatible solutions from dozens of providers.
Simple analogy. Imagine an endless warehouse with a unique number written on each box. You put a box (file), get a number (key), and then you can get it from anywhere in the world. No folders, no hierarchy, just boxes and numbers.
This is how object storage works:
- every file is an object
- the object is stored in a bucket, a named container
- each object has a key - a unique name inside the bucket
- access via HTTP API
How S3 works on the inside
Object
The object in S3 is not just a file. It consists of three parts:
- Data - the file itself, from 1 byte to 5 terabytes
- Key - the unique name of the object inside the bucket. A string like uploads/2026/06/photo.jpg is just a text key, not a real folder path
- Metadata - additional information: content type, creation date, tags, and arbitrary fields that the developer specifies
Bucket
A bucket is a container for objects. What's important to know:
- a bucket has a globally unique name (among all users of the provider)
- access rights, policies, versioning, and logging are configured at the bucket level
- a bucket is tied to a region (a geographic data center)
- a single bucket can hold an unlimited number of objects
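The object and bucket concepts above can be modelled in a few lines. This is only a toy in-memory sketch to make the vocabulary concrete, not how any provider implements storage; the class names, the region string, and the metadata field are illustrative assumptions.

```python
from dataclasses import dataclass, field


@dataclass
class StoredObject:
    """One object: the payload plus descriptive metadata."""
    data: bytes
    metadata: dict[str, str] = field(default_factory=dict)


@dataclass
class Bucket:
    """A named container holding a flat mapping of key -> object."""
    name: str
    region: str
    objects: dict[str, StoredObject] = field(default_factory=dict)

    def put(self, key: str, data: bytes, **metadata: str) -> None:
        # Writing to an existing key replaces the whole object.
        self.objects[key] = StoredObject(data, dict(metadata))

    def get(self, key: str) -> StoredObject:
        return self.objects[key]


# Hypothetical bucket name and region, for illustration only.
bucket = Bucket(name="my-bucket", region="eu-central-1")
bucket.put("uploads/2026/06/photo.jpg", b"\xff\xd8", content_type="image/jpeg")
print(sorted(bucket.objects))
```

Note that the key is an ordinary dictionary key here: the slashes carry no structural meaning.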
Key and pseudofolders
There are no folders inside S3, only a flat keyspace. By convention, developers use the / slash in keys, and most tools display the resulting prefixes as folders. The string team/photos/avatar.png is simply the name of an object that looks like a file system path.
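To see how tools turn a flat keyspace into something that looks like folders, here is a minimal sketch of prefix-and-delimiter grouping, the same idea the S3 list operation exposes through its prefix and delimiter parameters. The function name and the sample keys are made up for illustration.

```python
def list_level(keys, prefix="", delimiter="/"):
    """Return (objects, pseudo_folders) found directly under `prefix`."""
    objects, folders = [], set()
    for key in sorted(keys):
        if not key.startswith(prefix):
            continue
        rest = key[len(prefix):]
        if delimiter in rest:
            # Everything up to the first delimiter is shown as a "folder".
            folders.add(prefix + rest.split(delimiter, 1)[0] + delimiter)
        else:
            objects.append(key)
    return objects, sorted(folders)


keys = [
    "team/photos/avatar.png",
    "team/photos/banner.png",
    "team/readme.txt",
    "logo.svg",
]
print(list_level(keys))                  # (['logo.svg'], ['team/'])
print(list_level(keys, prefix="team/"))  # (['team/readme.txt'], ['team/photos/'])
```

Nothing is created or moved when a "folder" appears or disappears; it exists only as long as some key starts with that prefix.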
Access via API
The main way to work with S3 is the REST HTTP API. Operations are simple:
| Method | Action |
|---|---|
| PUT | Upload an object |
| GET | Download an object |
| DELETE | Delete an object |
| HEAD | Get metadata |
| LIST | List the objects in a bucket |
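As a rough illustration of how these operations look from code, the sketch below runs all five against a single key. It assumes `client` is a boto3 S3 client (or anything with the same method names); the function name, bucket, and key are hypothetical.

```python
def demo_object_lifecycle(client, bucket: str, key: str, data: bytes) -> dict:
    """Run PUT, HEAD, GET, LIST and DELETE against one key and report what came back."""
    # PUT: upload the object
    client.put_object(Bucket=bucket, Key=key, Body=data)
    # HEAD: fetch metadata only, without the body
    size = client.head_object(Bucket=bucket, Key=key)["ContentLength"]
    # GET: download the object body
    body = client.get_object(Bucket=bucket, Key=key)["Body"].read()
    # LIST: enumerate keys that start with the given prefix
    listing = client.list_objects_v2(Bucket=bucket, Prefix=key)
    keys = [item["Key"] for item in listing.get("Contents", [])]
    # DELETE: remove the object
    client.delete_object(Bucket=bucket, Key=key)
    return {"size": size, "body": body, "keys": keys}
```

With real storage it could be called as `demo_object_lifecycle(boto3.client("s3"), "my-bucket", "uploads/hello.txt", b"hello")`, provided credentials are configured and the bucket exists.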
In addition to the direct API, there are ready-made tools: the AWS CLI, rclone, and SDKs for Python (boto3), Go, Java, JavaScript and other languages.
S3 vs file storage vs block storage
This is a key question that often confuses beginners. There are three types of storage, and each is suitable for its own tasks.
File Storage (NAS, NFS, SMB)
The usual file system: folders, files, permissions. It works like a network drive. Good for: document collaboration, corporate file servers.
Constraints: scales poorly horizontally, and performance drops with a large number of files.
Block storage (SSD, HDD, SAN)
A disk that connects to the server. The operating system sees it as a physical disk. Good for: databases, virtual machines, anything that requires low latency and frequent small write/read operations.
Constraints: tied to a single server, difficult to scale, not intended for public access over the Internet.
S3 object storage
Flat structure, HTTP access, horizontal scaling. Good for: static files, media, backups, logs, content distribution.
Constraints: not suitable for frequent small changes to a single file. An object is either created in full or replaced in full; partial editing is not supported.
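To make the "replaced in full" constraint concrete, here is a minimal sketch of what appending one line to a text object involves. It assumes `client` is a boto3 S3 client and that the object already exists; the function name is hypothetical.

```python
def append_line(client, bucket: str, key: str, line: str) -> int:
    """Append a line to a text object by rewriting the whole object.

    Returns the new object size in bytes.
    """
    # Step 1: download the entire current content.
    current = client.get_object(Bucket=bucket, Key=key)["Body"].read()
    # Step 2: modify it locally.
    updated = current + line.encode("utf-8") + b"\n"
    # Step 3: upload the entire object again under the same key.
    client.put_object(Bucket=bucket, Key=key, Body=updated)
    return len(updated)
```

Every append transfers the whole object twice, and two writers doing this at the same time can overwrite each other's changes. That is why frequently changing data usually belongs on block or file storage, or is written to S3 as many separate objects.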
Comparative table
| Criterion | Block | File | S3 (object) |
|---|---|---|---|
| Scaling | Manual | Limited | Automatic, practically unlimited |
| Access | Locally, from the attached server | Network (NFS/SMB) | HTTP from anywhere |
| Frequent changes to a file | Excellent | Good | Poor |
| Large static files | Fine | Fine | Excellent |
| Cost | High | Medium | Low |
| Metadata | Basic | Basic | Extended, custom |
| Public access | Hard to set up | Hard | Easy |
Key characteristics of S3 storage
Durability: what "11 nines" means
Amazon S3 is designed for 99.999999999% durability of stored data (11 nines). In practice this means that if you store 10 million objects, you can expect to lose one of them on average once every 10,000 years. This is achieved through:
- Replication – each object is stored in multiple copies (usually 3+) on different physical machines and data centers
- Erasure coding - distributed storage algorithm that allows you to recover data even if you lose several nodes
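The arithmetic behind "11 nines" is easy to check. The sketch below treats the figure as an annual loss probability of 10^-11 per object and assumes objects are lost independently, which is a simplification; the function names are illustrative.

```python
from fractions import Fraction

# 11 nines of durability = 1 - 0.99999999999 annual loss probability per object
ANNUAL_LOSS_RATE = Fraction(1, 10**11)


def expected_losses_per_year(object_count: int) -> Fraction:
    """Expected number of objects lost in one year."""
    return object_count * ANNUAL_LOSS_RATE


def years_per_lost_object(object_count: int) -> Fraction:
    """Average number of years between single-object losses."""
    return 1 / expected_losses_per_year(object_count)


print(float(expected_losses_per_year(10_000_000)))  # 0.0001
print(years_per_lost_object(10_000_000))            # 10000
```

Exact fractions are used instead of floats so that a quantity as small as 10^-11 is not distorted by rounding.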
Availability vs durability
It is important not to confuse these two parameters:
- Durability: the probability that data will not be lost
- Availability: the probability that the data can be read right now
Standard Amazon S3 is designed for 99.99% availability, which is about 52 minutes of downtime per year. For cheaper storage classes, availability is lower, while durability remains high.
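The downtime figure follows directly from the percentage. Here is a small sketch of the conversion; the 99.9% and 99.0% values are included only for comparison and are not quoted from any provider's terms.

```python
def downtime_minutes_per_year(availability_percent: float, days: int = 365) -> float:
    """Convert an availability percentage into allowed downtime per year, in minutes."""
    minutes_in_year = days * 24 * 60
    return minutes_in_year * (1 - availability_percent / 100)


for pct in (99.99, 99.9, 99.0):
    print(f"{pct}% -> {downtime_minutes_per_year(pct):.1f} minutes per year")
```

Each nine dropped from the availability figure multiplies the allowed downtime by ten, which is why the difference between storage classes matters for data that users read directly.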
Versioning
S3 supports object versioning. When it is enabled, every overwrite keeps the previous version. This protects against accidental deletion or overwriting: you can roll back to any earlier version.
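A minimal sketch of working with versions from Python follows. It assumes `client` is a boto3 S3 client and that the provider supports versioning; the helper names are hypothetical, and pagination of long version lists is left out for brevity.

```python
def enable_versioning(client, bucket: str) -> None:
    """Turn versioning on for a bucket."""
    client.put_bucket_versioning(
        Bucket=bucket,
        VersioningConfiguration={"Status": "Enabled"},
    )


def version_ids(client, bucket: str, key: str) -> list[str]:
    """Return the version IDs stored for one key, in the order the service returns them."""
    response = client.list_object_versions(Bucket=bucket, Prefix=key)
    # Prefix matching can also return longer keys, so filter for an exact match.
    return [v["VersionId"] for v in response.get("Versions", []) if v["Key"] == key]


def read_version(client, bucket: str, key: str, version_id: str) -> bytes:
    """Download one specific version of an object."""
    return client.get_object(Bucket=bucket, Key=key, VersionId=version_id)["Body"].read()
```

Rolling back then amounts to reading the desired version and uploading it again as the newest one.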
Lifecycle policies
You can configure objects to be moved between storage classes or deleted automatically:
- after 30 days, move to cold storage (cheaper)
- after 180 days, move to the archive
- after a year, delete
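A schedule like the one above could be written as a lifecycle configuration along these lines. This is a sketch under several assumptions: the last step is taken to mean deletion after one year, the storage class names are Amazon's (other providers may use different ones), and the rule ID and the backups/ prefix are made up.

```python
import json

lifecycle_configuration = {
    "Rules": [
        {
            "ID": "age-out-backups",           # hypothetical rule name
            "Status": "Enabled",
            "Filter": {"Prefix": "backups/"},  # apply only to keys under this prefix
            "Transitions": [
                {"Days": 30, "StorageClass": "STANDARD_IA"},  # cold storage
                {"Days": 180, "StorageClass": "GLACIER"},     # archive
            ],
            "Expiration": {"Days": 365},       # delete after a year
        }
    ]
}

print(json.dumps(lifecycle_configuration, indent=2))
```

With boto3, the dictionary would be applied with `client.put_bucket_lifecycle_configuration(Bucket="my-bucket", LifecycleConfiguration=lifecycle_configuration)`.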
Storage classes
Most S3 providers offer multiple classes of storage with different balances of price, latency and availability:
| Class | Use case | Access latency | Cost |
|---|---|---|---|
| Standard | Active data, frequent access | Milliseconds | High |
| Infrequent Access | Data that is rarely read | Milliseconds | Medium |
| Glacier / Archive | Long-term archives, backups | Minutes to hours | Low |
Access management
Public and private access
By default, all objects are private. Access can be opened up in several ways:
- ACL (Access Control List) - rights set at the object or bucket level
- bucket policy - JSON rules that define who can do what
- presigned URL - a temporary signed link that is valid for a limited time (e.g. 1 hour)
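For a sense of what a bucket policy looks like, here is a sketch that builds one allowing anonymous reads under a single prefix. The policy grammar is Amazon's; compatible providers generally accept the same shape, but that is an assumption worth checking. The function name, bucket, and prefix are hypothetical.

```python
import json


def public_read_policy(bucket: str, prefix: str = "") -> str:
    """Build a JSON bucket policy that lets anyone download objects under `prefix`."""
    policy = {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "PublicRead",
                "Effect": "Allow",
                "Principal": "*",          # anyone, including anonymous users
                "Action": "s3:GetObject",  # download only; no listing or writing
                "Resource": f"arn:aws:s3:::{bucket}/{prefix}*",
            }
        ],
    }
    return json.dumps(policy, indent=2)


print(public_read_policy("my-bucket", "public/"))
```

With boto3 the result would be applied via `client.put_bucket_policy(Bucket="my-bucket", Policy=public_read_policy("my-bucket", "public/"))`. Limiting the resource to one prefix keeps the rest of the bucket private.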
IAM and roles
In enterprise scenarios, access is managed through IAM (Identity and Access Management): users and services are given roles with fine-grained permissions, for example, “read-only access to a particular bucket”.
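The read-only example could look roughly like this as an identity policy attached to a user or role. The sketch uses Amazon's policy format; the function name and bucket are hypothetical, and other providers may model roles differently.

```python
def read_only_policy(bucket: str) -> dict:
    """Build an identity policy granting read-only access to one bucket."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Listing is granted on the bucket itself.
                "Effect": "Allow",
                "Action": "s3:ListBucket",
                "Resource": f"arn:aws:s3:::{bucket}",
            },
            {
                # Downloading is granted on the objects inside it.
                "Effect": "Allow",
                "Action": "s3:GetObject",
                "Resource": f"arn:aws:s3:::{bucket}/*",
            },
        ],
    }


print(read_only_policy("my-bucket"))
```

Unlike a bucket policy, there is no principal here: the policy applies to whichever user or role it is attached to.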
Typical use cases
1. Static files and media for websites and applications
Images, video, audio, and PDFs are stored in S3 and distributed through a CDN. Users get files quickly, and the application server carries no extra load.
2. Database and server backups
S3 is the standard choice for automatic backups. Backups are uploaded on a schedule, and old ones are automatically moved to an archive storage class and then deleted by a lifecycle policy.
3. Storage of logs and analytical data
Huge amounts of logs are cheap to store in S3. Systems like Amazon Athena or ClickHouse can read data directly from S3 without importing it first.
4. File distribution
Application distributions, updates, ISO images. S3 is great for large files that are downloaded by thousands of users at the same time.
5. Data lake for ML/analytics
Raw data for model training, datasets, experimental artifacts – all this is convenient to store in S3: cheap, scalable, accessible from any tool.
6. Static website hosting
A simple HTML/CSS/JS site can be placed directly in an S3 bucket, which will serve the files over HTTP. No server or separate hosting is needed.
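Two details usually need attention when hosting a site this way: telling the bucket which documents to serve, and uploading each file with the right content type so browsers render it instead of downloading it. The sketch below shows both; the document names and the helper are assumptions, and website hosting support varies between providers.

```python
import mimetypes

# Which object to serve for "/" and which one for errors (hypothetical file names).
website_configuration = {
    "IndexDocument": {"Suffix": "index.html"},
    "ErrorDocument": {"Key": "404.html"},
}


def content_type_for(path: str) -> str:
    """Guess the Content-Type header for a static file, with a safe fallback."""
    guessed, _encoding = mimetypes.guess_type(path)
    return guessed or "application/octet-stream"


print(content_type_for("index.html"))
print(content_type_for("assets/style.css"))
```

With boto3, the configuration would be applied with `client.put_bucket_website(Bucket="my-bucket", WebsiteConfiguration=website_configuration)`, and each file uploaded with `client.upload_file(path, "my-bucket", key, ExtraArgs={"ContentType": content_type_for(path)})`.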
7. Storage of data between microservices
In microservices architectures, S3 is often used as an intermediate storage: one service puts a file, the other takes it, without a direct connection between the two.
Major S3 providers
Amazon S3
The original and the reference implementation. Deep integration with the AWS ecosystem. The widest range of features, but also the most complex pricing: you pay separately for storage, requests, and traffic.
Cloudflare R2
It appeared as a direct competitor to Amazon S3. The main difference is no fees for outgoing traffic (free egress). For sites with large audiences, this matters a lot. S3-compatible API, simple pricing.
Backblaze B2
One of the cheapest options. Partial S3 compatibility. Popular for backups and archives.
MinIO
Not a cloud service, but an open source solution for self-hosted deployment. Compatible with the S3 API. You can run it on your own server and work with it just as you would with Amazon S3.
Yandex Object Storage, VK Cloud, Selectel
Russian S3-compatible providers. Data storage in Russia, which is important for compliance with 152-FZ. Integrated with other Russian cloud services.
How to Get Started with S3: A Practical Minimum
Step 1. Choose a provider and create a bucket
Register with any S3 provider. Create a bucket: specify its name and region.
Step 2. Get access keys
S3 uses a pair of keys: Access Key ID and Secret Access Key. This is not a login / password, but special credentials for the API.
Step 3. Upload a file via the AWS CLI
# Install the AWS CLI
pip install awscli
# Configure it
aws configure
# Enter: Access Key ID, Secret Access Key, region, output format
# Upload the file
aws s3 cp ./myfile.jpg s3://my-bucket-name/uploads/myfile.jpg
# Download the file
aws s3 cp s3://my-bucket-name/uploads/myfile.jpg ./myfile.jpg
# View the contents of the bucket
aws s3 ls s3://my-bucket-name/
For third-party providers, add the --endpoint-url setting:
aws s3 cp ./myfile.jpg s3://my-bucket/ --endpoint-url https://s3.provider.ru
Step 4. Use in code (Python, Boto3)
import boto3

# Placeholder credentials for illustration; in real code load them from the environment
s3 = boto3.client(
    's3',
    endpoint_url='https://s3.provider.ru',  # for third-party providers
    aws_access_key_id='YOUR_KEY',
    aws_secret_access_key='YOUR_SECRET',
)

# Upload a file
s3.upload_file('local_file.jpg', 'my-bucket', 'uploads/photo.jpg')

# Download a file
s3.download_file('my-bucket', 'uploads/photo.jpg', 'local_copy.jpg')

# Get a temporary link (presigned URL, valid for 1 hour)
url = s3.generate_presigned_url(
    'get_object',
    Params={'Bucket': 'my-bucket', 'Key': 'uploads/photo.jpg'},
    ExpiresIn=3600,
)
Step 5. Set up rclone for synchronization
rclone is a universal tool for working with any S3-compatible storage:
# Set up a remote
rclone config
# Synchronize a local folder with a bucket
rclone sync ./local-folder remote:my-bucket/backups/
# Copy a folder
rclone copy ./photos remote:my-bucket/photos/
Tips and common mistakes
Do not store secret keys in code. Use environment variables, .env files, or secret managers.
Check bucket access rights. Everything is private by default. Make public only what really needs to be public.
Name objects meaningfully. uploads/users/123/avatar.jpg is better than abc123xyz.jpg: it makes organizing and debugging easier.
Enable versioning for critical data. It is insurance against accidental deletion.
Configure lifecycle policies from the start. Without them, the bucket will grow endlessly and you will overpay for storage.
Use a CDN in front of S3 to distribute public files – this will reduce latency for users and the cost of outbound traffic.
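The first tip, keeping keys out of code, can be sketched as follows. The environment variable names are made up for this example (boto3 also picks up AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY on its own when no keys are passed), and the helper name is hypothetical.

```python
import os


def s3_client_kwargs(env=os.environ) -> dict:
    """Build keyword arguments for boto3.client('s3', ...) from environment variables.

    Raises KeyError if a required variable is missing, so a misconfigured
    deployment fails at startup instead of at the first upload.
    """
    kwargs = {
        "aws_access_key_id": env["S3_ACCESS_KEY_ID"],
        "aws_secret_access_key": env["S3_SECRET_ACCESS_KEY"],
    }
    # The endpoint is only needed for third-party providers.
    endpoint = env.get("S3_ENDPOINT_URL")
    if endpoint:
        kwargs["endpoint_url"] = endpoint
    return kwargs
```

The client is then created with `boto3.client("s3", **s3_client_kwargs())`, and the secrets live only in the deployment environment or a secret manager.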
Summary
S3 is not just a cloud drive. It is the industry standard for storing unstructured data: scalable, accessible via HTTP from anywhere in the world, with flexible access management and a rich ecosystem of tools.
Whether you are developing a web application, configuring backups, or building a data pipeline, S3 will usually solve the problem more reliably and more cheaply than a disk attached to a regular server.
You can start for free: most providers have a free tier for the first gigabytes. Choose a provider, create a bucket, try uploading your first file, and you will see why S3 has become the industry standard.