Azure Blob Storage Overview
You will need to be familiar with Azure Blob Storage if you are taking the AZ-204 Azure Developer Associate certification exam.
These are some of the exam preparation notes I have taken for Azure Blob Storage.
You can check out my exam preparation notes for other Azure services covered on the AZ-204 exam here.
Identify the different types of storage accounts and the resource hierarchy for blob storage
Types of Storage Accounts
- Standard general purpose v2
- Premium block blobs
  - Improved performance with solid-state drives (SSD)
- Premium page blobs
  - Improved performance with solid-state drives (SSD)
- Premium file shares
  - Improved performance with solid-state drives (SSD)
Blob Storage Resource Hierarchy
- Storage Account
  - Provides an endpoint for accessing your data; the endpoint is built from the name of your storage account
  - Like https://mystorageaccount.blob.core.windows.net
- Container
  - Sits inside a storage account and holds blobs, similar to a directory
  - A storage account can include an unlimited number of containers
  - A container can store an unlimited number of blobs
  - A container name must be a valid DNS name, as it forms part of the unique URI (Uniform Resource Identifier) used to address the container or its blobs
  - Like https://myaccount.blob.core.windows.net/mycontainer
- Blob
  - Three types of blobs supported
    - Block blob
      - stores text and binary data
      - made up of blocks that can be managed individually
      - optimized for uploading large amounts of data efficiently
      - can store up to about 190.7 TiB
    - Page blob
      - a collection of 512-byte pages
      - optimized for random read and write operations
      - stores files up to 8 TiB in size
      - stores virtual hard drive (VHD) files and serves as disks for Azure virtual machines
    - Append blob
      - a collection of blocks, like a block blob
      - optimized for append operations
      - ideal for scenarios such as logging data from virtual machines
  - The URI for a blob:
    - like https://myaccount.blob.core.windows.net/mycontainer/myblob
    - or like https://myaccount.blob.core.windows.net/mycontainer/myvirtualdirectory/myblob
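As a rough illustration of the hierarchy above, the sketch below assembles a blob URI from its three levels (account, container, blob) and checks the container name against the documented naming rules: 3 to 63 characters, lowercase letters, digits, and hyphens, with no leading, trailing, or consecutive hyphens. The helper names `is_valid_container_name` and `blob_uri` are made up for this example and are not part of any Azure SDK. Special containers such as `$root` or `$web` are deliberately not handled.

```python
import re
from urllib.parse import quote

# Lowercase letters and digits, with single hyphens allowed only between them.
_CONTAINER_NAME = re.compile(r"[a-z0-9]+(?:-[a-z0-9]+)*")


def is_valid_container_name(name: str) -> bool:
    """Check a regular container name against the Blob Storage naming rules."""
    return 3 <= len(name) <= 63 and _CONTAINER_NAME.fullmatch(name) is not None


def blob_uri(account: str, container: str, blob_name: str) -> str:
    """Build https://<account>.blob.core.windows.net/<container>/<blob>."""
    if not is_valid_container_name(container):
        raise ValueError(f"invalid container name: {container!r}")
    # quote() leaves "/" alone, so virtual directories stay visible in the path.
    return f"https://{account}.blob.core.windows.net/{container}/{quote(blob_name)}"


print(blob_uri("myaccount", "mycontainer", "myvirtualdirectory/myblob"))
```

The virtual directory is only a prefix in the blob name, which is why the function treats `myvirtualdirectory/myblob` as a single blob name.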
Explain how data is securely stored
- Server-side encryption
  - Offers data encryption at rest
  - Applied automatically; it is enabled for all storage accounts and can't be disabled
  - Uses 256-bit Advanced Encryption Standard (AES) encryption
  - Federal Information Processing Standards (FIPS) 140-2 compliant
  - Encryption keys
    - Data is encrypted with Microsoft-managed keys by default
    - You can manage encryption with your own keys if you want
      - Customer-managed key option
        - For encrypting and decrypting data in Blob Storage and in Azure Files
        - Keys must be stored in Azure Key Vault or Azure Key Vault Managed Hardware Security Module (HSM)
      - Customer-provided key option
        - For Blob Storage operations
        - A client can include an encryption key on a read/write request for granular control over how blob data is encrypted and decrypted
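To make the customer-provided key option a little more concrete: on the REST API the key travels with each request in three headers, `x-ms-encryption-key`, `x-ms-encryption-key-sha256`, and `x-ms-encryption-algorithm`. The sketch below only builds those headers from a 256-bit key. The function name `customer_provided_key_headers` is hypothetical, and authentication, versioning, and the HTTPS call itself are left out.

```python
import base64
import hashlib
import os
from typing import Dict


def customer_provided_key_headers(key: bytes) -> Dict[str, str]:
    """Return the request headers that carry a customer-provided AES-256 key."""
    if len(key) != 32:
        raise ValueError("a customer-provided key must be a 256-bit (32-byte) AES key")
    return {
        # The key itself, Base64-encoded.
        "x-ms-encryption-key": base64.b64encode(key).decode("ascii"),
        # Base64-encoded SHA-256 hash of the key, used by the service to identify it.
        "x-ms-encryption-key-sha256": base64.b64encode(hashlib.sha256(key).digest()).decode("ascii"),
        "x-ms-encryption-algorithm": "AES256",
    }


headers = customer_provided_key_headers(os.urandom(32))
print(sorted(headers))
```

Because the key is sent on every request, these requests should only ever go over HTTPS, and the same key has to be supplied again to read the blob back.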
- Azure Storage client libraries for Blob Storage and Queue Storage also provide client-side encryption
  - The Blob Storage and Queue Storage client libraries use AES to encrypt user data
  - There are two versions of client-side encryption
    - Version 2 uses Galois/Counter Mode (GCM) with AES. The Blob Storage and Queue Storage SDKs support client-side encryption with v2.
    - Version 1 uses Cipher Block Chaining (CBC) mode with AES. The Blob Storage, Queue Storage, and Table Storage SDKs support client-side encryption with v1.
- Redundancy
  - Azure provides different options for protecting your data from disk, server rack, and data center failures
    - Locally Redundant Storage (LRS)
      - Cheapest redundancy option
      - Stores 3 copies of your data within a single data center
    - Geo-Redundant Storage (GRS)
      - More expensive than LRS
      - Stores 6 copies of your data
        - 3 in the primary region, within a single data center
        - 3 in a secondary region
    - Zone-Redundant Storage (ZRS)
      - More expensive than LRS, but typically cheaper than GRS
      - Stores 3 copies of your data across separate data centers (availability zones) within the same region
    - Geo-Zone-Redundant Storage (GZRS)
      - More expensive than ZRS
      - Stores 6 copies of your data
        - 3 in the primary region, across separate data centers (availability zones)
        - 3 in a secondary region
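One way to keep the four redundancy options straight is to treat them as two independent yes/no choices: are the copies in the primary region spread across data centers, and is there a second set of copies in another region? The sketch below encodes that as a small lookup. The `Redundancy` class and `pick_redundancy` helper are made up for these notes; the abbreviations stand for locally (LRS), geo (GRS), zone (ZRS), and geo-zone (GZRS) redundant storage.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Redundancy:
    name: str
    copies: int
    zone_redundant: bool  # primary-region copies spread across data centers
    geo_redundant: bool   # a second set of copies kept in a different region


OPTIONS = [
    Redundancy("LRS", 3, zone_redundant=False, geo_redundant=False),
    Redundancy("GRS", 6, zone_redundant=False, geo_redundant=True),
    Redundancy("ZRS", 3, zone_redundant=True, geo_redundant=False),
    Redundancy("GZRS", 6, zone_redundant=True, geo_redundant=True),
]


def pick_redundancy(zone_redundant: bool, geo_redundant: bool) -> Redundancy:
    """Return the option that matches both yes/no choices."""
    for option in OPTIONS:
        if (option.zone_redundant, option.geo_redundant) == (zone_redundant, geo_redundant):
            return option
    raise LookupError("no matching redundancy option")


print(pick_redundancy(zone_redundant=True, geo_redundant=False).name)
```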
Describe how each of the access tiers is optimized
- Hot
  - Optimized for: frequent access of objects
  - Storage costs: highest storage costs
  - Access costs: lowest access costs
  - Default access tier
- Cool
  - Optimized for: storing large amounts of data that is infrequently accessed
  - Minimum storage time: 30 days
  - Storage costs: lower storage costs than Hot tier
  - Access costs: higher access costs than Hot tier
- Cold
  - Optimized for: storing data that is infrequently accessed
  - Minimum storage time: 90 days
  - Storage costs: lower storage costs than Cool tier
  - Access costs: higher access costs than Cool tier
- Archive
  - Optimized for: storing data that can tolerate several hours of retrieval latency (up to 15 hours)
  - Minimum storage time: 180 days
  - Storage costs: lowest storage costs
  - Access costs: highest access costs
  - Available only for individual block blobs
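The minimum storage times matter because a blob that is deleted or moved out of a tier early is still billed for the remaining days. A minimal sketch of that arithmetic follows, assuming the early deletion charge is prorated by days as the Azure pricing notes describe. The function name `early_deletion_days` is made up for this example.

```python
# Minimum storage time per tier, in days (Hot has none).
MINIMUM_STORAGE_DAYS = {"hot": 0, "cool": 30, "cold": 90, "archive": 180}


def early_deletion_days(tier: str, days_stored: int) -> int:
    """Days still billed if a blob leaves `tier` after `days_stored` days."""
    return max(0, MINIMUM_STORAGE_DAYS[tier.lower()] - days_stored)


# A blob archived for 45 days and then deleted is billed for the remaining days.
print(early_deletion_days("archive", 45))
```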
Create and implement a lifecycle policy
- Lifecycle policies let you write rules that automatically transition blobs to different access tiers and expire data
- A policy can have up to 100 rules
- Example rule
  { "name": "rule1", "enabled": true, "type": "Lifecycle", "definition": {...} }
- Rule definition
  - Filter set: e.g.
    - blobTypes: array of enums
    - prefixMatch: array of string prefixes
    - blobIndexMatch: array of key-value pairs of conditions
  - Action set
    - Rule actions: if you define more than one action on the same blob, lifecycle management applies the least expensive action to the blob
      - tierToCool
      - tierToCold
      - enableAutoTierToHotFromCool
      - tierToArchive
      - delete
    - Run conditions: based on age. Base blobs use the last modified time to track age, and blob snapshots use the snapshot creation time.
      - daysAfterModificationGreaterThan
      - daysAfterCreationGreaterThan
      - daysAfterLastAccessTimeGreaterThan
      - daysAfterLastTierChangeGreaterThan
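The "least expensive action" rule is easier to remember with a small model. The sketch below picks which action would apply to a blob of a given age when several actions are eligible. The lifecycle documentation states that delete is cheaper than tierToArchive, which is cheaper than tierToCool; where tierToCold falls in that order is my assumption. `action_to_apply` is a hypothetical helper, not how the service is implemented.

```python
from typing import Dict, Optional

# Cheapest first. delete < tierToArchive < tierToCool follows the lifecycle
# documentation; the position of tierToCold is an assumption.
ACTION_COST_ORDER = ["delete", "tierToArchive", "tierToCold", "tierToCool"]


def action_to_apply(actions: Dict[str, int], blob_age_days: int) -> Optional[str]:
    """Pick the cheapest eligible action.

    `actions` maps an action name to its daysAfterModificationGreaterThan value.
    """
    eligible = [name for name, days in actions.items() if blob_age_days > days]
    if not eligible:
        return None
    return min(eligible, key=ACTION_COST_ORDER.index)


rule_actions = {"tierToCool": 30, "tierToArchive": 90, "delete": 365}
print(action_to_apply(rule_actions, 100))
```

With these thresholds, a blob last modified 100 days ago qualifies for both tierToCool and tierToArchive, and the cheaper tierToArchive is the one chosen.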
Example:
{
  "rules": [
    {
      "enabled": true,
      "name": "move-to-cool",
      "type": "Lifecycle",
      "definition": {
        "actions": {
          "baseBlob": {
            "tierToCool": {
              "daysAfterModificationGreaterThan": 30
            }
          }
        },
        "filters": {
          "blobTypes": ["blockBlob"],
          "prefixMatch": ["sample-container/log"]
        }
      }
    }
  ]
}
Apply policy using Azure CLI
az storage account management-policy create --account-name <storage-account> --policy @policy.json --resource-group <resource-group>
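If the policy is generated rather than written by hand, something like the following could produce the same JSON as the example before it is saved as policy.json for the CLI command. This is only a sketch: `lifecycle_rule` and `build_policy` are hypothetical helpers, and the only validation is the 100-rule limit mentioned earlier.

```python
import json
from typing import Dict, List


def lifecycle_rule(name: str, prefix: str, tier_to_cool_after_days: int) -> Dict:
    """Build one rule that moves matching block blobs to the Cool tier."""
    return {
        "enabled": True,
        "name": name,
        "type": "Lifecycle",
        "definition": {
            "actions": {
                "baseBlob": {
                    "tierToCool": {"daysAfterModificationGreaterThan": tier_to_cool_after_days}
                }
            },
            "filters": {"blobTypes": ["blockBlob"], "prefixMatch": [prefix]},
        },
    }


def build_policy(rules: List[Dict]) -> Dict:
    """Wrap rules in a policy document, enforcing the 100-rule limit."""
    if len(rules) > 100:
        raise ValueError("a lifecycle policy can have at most 100 rules")
    return {"rules": rules}


policy = build_policy([lifecycle_rule("move-to-cool", "sample-container/log", 30)])
policy_json = json.dumps(policy, indent=2)
print(policy_json)
```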
Rehydrate blob data stored in an archive tier
- Rehydrating a blob means moving it out of the archive tier to an online tier such as hot or cool
- Two options
  - Copy an archived blob to an online tier
    - Creates a new blob
    - Recommended approach
    - Use the Copy Blob or Copy Blob from URL operation
  - Change a blob’s access tier to an online tier
    - Use the SetBlobTier operation
    - Changing a blob’s tier doesn’t affect its last modified time, so beware that lifecycle policies might move the blob straight back into Archive
- Rehydration priority
  - Set using the x-ms-rehydrate-priority header on a Copy Blob, Copy Blob from URL, or SetBlobTier transaction
  - Standard priority can take up to 15 hours for rehydration
  - High priority can take less than 1 hour for objects under 10 GB in size
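For the second option, the REST request is small enough to sketch. The Set Blob Tier operation is a PUT against the blob URL with `?comp=tier`, the target tier in `x-ms-access-tier`, and the priority in `x-ms-rehydrate-priority`. The helper below only assembles the request description. The name `set_blob_tier_request` is made up, the tier list is limited to the two tiers named in these notes, and authentication and the `x-ms-version` header are omitted.

```python
from typing import Dict

# Online tiers named in these notes; the service may accept others.
ONLINE_TIERS = ("Hot", "Cool")
PRIORITIES = ("Standard", "High")


def set_blob_tier_request(blob_url: str, tier: str = "Hot", priority: str = "Standard") -> Dict:
    """Describe a Set Blob Tier request that rehydrates an archived blob."""
    if tier not in ONLINE_TIERS:
        raise ValueError(f"tier must be one of {ONLINE_TIERS}")
    if priority not in PRIORITIES:
        raise ValueError(f"priority must be one of {PRIORITIES}")
    return {
        "method": "PUT",
        "url": f"{blob_url}?comp=tier",
        "headers": {
            "x-ms-access-tier": tier,
            "x-ms-rehydrate-priority": priority,
        },
    }


request = set_blob_tier_request(
    "https://myaccount.blob.core.windows.net/mycontainer/myblob", tier="Cool", priority="High"
)
print(request["method"], request["url"])
```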
Create an application to create and manipulate data by using the Azure Storage client library for Blob storage
Manage container properties and metadata by using Node.js and REST
- getProperties(ContainerGetPropertiesOptions): Node.js client library, REST API
- setMetadata(Metadata, ContainerSetMetadataOptions): Node.js client library, REST API
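On the REST side, the two client library calls correspond to the Get Container Properties and Set Container Metadata operations. The sketch below only describes the shape of those requests: a GET on the container URL with `restype=container`, and a PUT with `restype=container&comp=metadata` where each metadata pair becomes an `x-ms-meta-<name>` header. It is written in Python to stay independent of any SDK, the function names are hypothetical, and authentication and version headers are left out.

```python
from typing import Dict


def get_container_properties_request(container_url: str) -> Dict:
    """Describe a Get Container Properties request."""
    return {"method": "GET", "url": f"{container_url}?restype=container", "headers": {}}


def set_container_metadata_request(container_url: str, metadata: Dict[str, str]) -> Dict:
    """Describe a Set Container Metadata request.

    Each metadata pair is sent as an x-ms-meta-<name> header.
    """
    headers = {f"x-ms-meta-{name}": value for name, value in metadata.items()}
    return {
        "method": "PUT",
        "url": f"{container_url}?restype=container&comp=metadata",
        "headers": headers,
    }


request = set_container_metadata_request(
    "https://myaccount.blob.core.windows.net/mycontainer", {"category": "logs"}
)
print(request["url"], request["headers"])
```

Set Container Metadata replaces whatever metadata the container already has, so existing pairs need to be included in the call if they should survive.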
This post is licensed under CC BY 4.0 by the author.