
Azure Blob Storage Overview

You will need to be familiar with Azure Blob Storage if you are taking the AZ-204 Azure Developer Associate certification exam.

These are some of the exam preparation notes I have taken for Azure Blob Storage.

You can check out my exam preparation notes for other Azure services covered on the AZ-204 exam here.

Identify the different types of storage accounts and the resource hierarchy for blob storage

Types of Storage Accounts

  • Standard general purpose v2
  • Premium Block blobs
    • Improved performance with Solid State Drives (SSD)
  • Premium Page blobs
    • Improved performance with Solid State Drives (SSD)
  • Premium File shares
    • Improved performance with Solid State Drives (SSD)

Blob Storage Resource Hierarchy

  • Storage Account
    • Provides an endpoint for accessing your data; the endpoint includes the name of your storage account
    • Like https://mystorageaccount.blob.core.windows.net
  • Container
    • Sits inside a storage account and works like a directory; blobs are placed within a container
    • A storage account can include an unlimited number of containers
    • A container can store an unlimited number of blobs
    • A container name must be a valid DNS name, as it forms part of the unique URI (Uniform Resource Identifier) used to address the container or its blobs
    • Like https://myaccount.blob.core.windows.net/mycontainer
  • Blob File
    • Three types of blobs supported
      • Block blob
        • store text and binary data
        • managed individually
        • optimized for uploading large amounts of data efficiently
        • can store up to about 190.7 TiB
      • Page blob
        • a collection of 512 byte pages
        • optimized for random read and write operations
        • store files up to 8 TiB in size
        • store virtual hard drive (VHD) files and serve as disks for Azure virtual machines
      • Append blob
        • a collection of blocks like block blobs
        • optimized for append operations
        • ideal for scenarios such as logging data from virtual machines
    • The URI for a blob:
      • like https://myaccount.blob.core.windows.net/mycontainer/myblob
      • or like https://myaccount.blob.core.windows.net/mycontainer/myvirtualdirectory/myblob
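As a rough illustration of the hierarchy above, here is a minimal Python sketch that checks a container name against the usual naming rules (3 to 63 characters; lowercase letters, digits and hyphens; starts and ends with a letter or digit; no consecutive hyphens) and then assembles a blob URI. The function names are hypothetical helpers, not part of any Azure SDK, and special containers such as `$root` are deliberately ignored.

```python
import re
from urllib.parse import quote

# First character is a letter or digit, followed by 2 to 62 more characters.
# A hyphen is only accepted when a letter or digit comes right after it,
# which rules out consecutive hyphens and a trailing hyphen.
_CONTAINER_NAME = re.compile(r"[a-z0-9](?:[a-z0-9]|-(?=[a-z0-9])){2,62}")


def is_valid_container_name(name: str) -> bool:
    """Return True if the name follows the container naming rules above."""
    return _CONTAINER_NAME.fullmatch(name) is not None


def blob_uri(account: str, container: str, blob_path: str) -> str:
    """Build the URI for a blob; blob_path may contain virtual directories."""
    if not is_valid_container_name(container):
        raise ValueError(f"invalid container name: {container!r}")
    # Percent-encode the blob path but keep "/" as the virtual directory separator.
    return f"https://{account}.blob.core.windows.net/{container}/{quote(blob_path, safe='/')}"


print(blob_uri("myaccount", "mycontainer", "myvirtualdirectory/myblob"))
```

The "virtual directory" is simply part of the blob name, which is why the sketch treats the whole path as a single string.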

Explain how data is securely stored

  • Server side encryption

    • Offers data encryption at rest
    • This is automated, enabled for all storage accounts and can’t be disabled
    • 256-bit Advanced Encryption Standard (AES) encryption
    • Federal Information Processing Standards (FIPS) 140-2 compliant
    • Encryption keys
      • Encrypted with Microsoft managed keys by default
      • You can manage encryption with your own keys if you want
        • Customer Managed key option
          • For encrypting and decrypting data in Blob Storage and in Azure Files
          • Keys must be stored in Azure Key Vault or Azure Key Vault Managed Hardware Security Module (HSM)
        • Customer Provided Key option
          • For Blob Storage operations
          • A client can include an encryption key on a read/write request for granular control over how blob data is encrypted and decrypted
  • Azure Storage client libraries for Blob Storage and Queue Storage also provide client side encryption

    • Blob Storage and Queue Storage client libraries use AES to encrypt user data
    • Two versions of client-side encryption
      • Version 2 uses Galois/Counter Mode (GCM) with AES. The Blob Storage and Queue Storage SDKs support client-side encryption with v2.
      • Version 1 uses Cipher Block Chaining (CBC) mode with AES. The Blob Storage, Queue Storage, and Table Storage SDKs support client-side encryption with v1.
  • Redundancy

    • Azure provides different options for protecting your data from disk, server rack and data center failures
      • Locally Redundant Storage (LRS)
        • Cheapest option for redundancy
        • Stores 3 copies of your data within a single data center
      • Geo Redundant Storage (GRS)
        • More expensive than LRS
        • Stores 6 copies of your data
          • 3 in a single data center in the primary region
          • 3 in a different region
      • Zone Redundant Storage (ZRS)
        • More expensive than LRS
        • Stores 3 copies of your data across separate availability zones within the same region
      • Geo Zone Redundant Storage (GZRS)
        • More expensive than ZRS
        • Stores 6 copies of your data
          • 3 across separate availability zones in the primary region
          • 3 in a different region
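To make the customer-provided key option from this section a bit more concrete: on the REST API, the key travels with each request in three headers (`x-ms-encryption-key`, `x-ms-encryption-key-sha256` and `x-ms-encryption-algorithm`). The Python sketch below only builds those headers; the helper name is hypothetical, and authentication, the request itself and key storage are all left out.

```python
import base64
import hashlib
import os


def customer_provided_key_headers(key: bytes) -> dict[str, str]:
    """Build the request headers that carry a customer-provided AES-256 key."""
    if len(key) != 32:
        raise ValueError("an AES-256 key must be exactly 32 bytes")
    return {
        # The key itself, Base64-encoded.
        "x-ms-encryption-key": base64.b64encode(key).decode("ascii"),
        # Base64-encoded SHA-256 hash of the raw key bytes.
        "x-ms-encryption-key-sha256": base64.b64encode(hashlib.sha256(key).digest()).decode("ascii"),
        "x-ms-encryption-algorithm": "AES256",
    }


# Illustration only: a throwaway random key. A real key must be kept safe,
# because the same key is needed again to read the blob.
headers = customer_provided_key_headers(os.urandom(32))
print(sorted(headers))
```

Because the service does not keep the key, any later read of the blob has to send the same headers again.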

Describe how each of the access tiers is optimized

  • Hot
    • Optimized for: frequent access of objects
    • Storage costs: highest storage costs
    • Access costs: lowest access costs
    • Default access tier
  • Cool
    • Optimized for: storing large amounts of data that is infrequently accessed
    • Minimum storage time: 30 days
    • Storage costs: lower storage costs than Hot tier
    • Access costs: higher access costs than Hot tier
  • Cold
    • Optimized for: storing data that is infrequently accessed
    • Minimum storage time: 90 days
    • Storage costs: lower storage costs than Cool tier
    • Access costs: higher access costs than Cool tier
  • Archive
    • Optimized for: storing data that can tolerate several hours of retrieval latency (up to 15 hours)
    • Minimum storage time: 180 days
    • Storage costs: lowest storage costs
    • Access costs: higher access costs than the Hot or Cool tiers
    • Available only for individual block blobs
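The minimum storage times matter because moving or deleting a blob before the period is up generally incurs an early deletion charge, prorated for the remaining days. A small Python sketch of that arithmetic, using the day counts listed above (the function name is hypothetical, and the Hot tier is entered as 0 because it has no minimum):

```python
# Minimum storage time per tier, in days, as listed in the notes above.
MINIMUM_STORAGE_DAYS = {"Hot": 0, "Cool": 30, "Cold": 90, "Archive": 180}


def early_deletion_days(tier: str, days_in_tier: int) -> int:
    """Days of the minimum storage period still unserved when a blob leaves the tier."""
    return max(MINIMUM_STORAGE_DAYS[tier] - days_in_tier, 0)


# A blob deleted after 21 days in Cool still has 9 days of the minimum left.
print(early_deletion_days("Cool", 21))
```

This only counts days; the actual charge depends on the pricing for the region and tier, which is not modeled here.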

Create and implement a lifecycle policy

  • Lifecycle policies let you define rules that automatically transition blobs to different access tiers and expire (delete) data
  • A policy can have up to 100 rules
  • Example rule { "name": "rule1", "enabled": true, "type": "Lifecycle", "definition": {...} }
  • Rule definition
    • Filter set: e.g. blobTypes array of enums, prefixMatch array of string prefixes, blobIndexMatch array of key value pairs of conditions
    • Action set
      • Rule actions: if you define more than one action on the same blob, lifecycle management applies the least expensive action to the blob
        • tierToCool
        • tierToCold
        • enableAutoTierToHotFromCool
        • tierToArchive
        • delete
      • Run conditions: based on age. Base blobs use the last modified time to track age, and blob snapshots use the snapshot creation time
        • daysAfterModificationGreaterThan
        • daysAfterCreationGreaterThan
        • daysAfterLastAccessTimeGreaterThan
        • daysAfterLastTierChangeGreaterThan
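The "least expensive action" rule above can be sketched as a simple priority lookup. The Azure documentation gives delete as cheaper than tierToArchive, and tierToArchive as cheaper than tierToCool; where tierToCold sits in that order is my assumption, and the function name is hypothetical.

```python
# Cheapest first. delete < tierToArchive < tierToCool is the documented example;
# placing tierToCold between Archive and Cool is an assumption.
ACTIONS_CHEAPEST_FIRST = ["delete", "tierToArchive", "tierToCold", "tierToCool"]


def action_to_apply(applicable_actions: list[str]) -> str:
    """Pick the action lifecycle management would apply when several match a blob."""
    for action in ACTIONS_CHEAPEST_FIRST:
        if action in applicable_actions:
            return action
    raise ValueError("no known tiering or delete action in the list")


# Both actions match the same blob, so the cheaper one (delete) wins.
print(action_to_apply(["tierToCool", "delete"]))
```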

Example:

{
  "rules": [
    {
      "enabled": true,
      "name": "move-to-cool",
      "type": "Lifecycle",
      "definition": {
        "actions": {
          "baseBlob": {
            "tierToCool": {
              "daysAfterModificationGreaterThan": 30
            }
          }
        },
        "filters": {
          "blobTypes": ["blockBlob"],
          "prefixMatch": ["sample-container/log"]
        }
      }
    }
  ]
}
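To get a feel for which blobs the policy above would touch, here is a simplified Python sketch that mimics the rule's filters and run condition locally. It is only an approximation for study purposes: the real evaluation happens inside the storage service on its own schedule, and the function and constant names are hypothetical.

```python
from datetime import datetime, timedelta, timezone

PREFIXES = ["sample-container/log"]  # "prefixMatch" from the policy
BLOB_TYPES = ["blockBlob"]           # "blobTypes" from the policy
DAYS_AFTER_MODIFICATION = 30         # "daysAfterModificationGreaterThan"


def would_tier_to_cool(path: str, blob_type: str, last_modified: datetime, now: datetime) -> bool:
    """Return True if the blob matches the filters and is old enough for tierToCool."""
    age_days = (now - last_modified).days
    return (
        blob_type in BLOB_TYPES
        # prefixMatch starts with the container name, then the blob name prefix.
        and any(path.startswith(prefix) for prefix in PREFIXES)
        and age_days > DAYS_AFTER_MODIFICATION
    )


now = datetime(2024, 3, 1, tzinfo=timezone.utc)
old_log = now - timedelta(days=60)
print(would_tier_to_cool("sample-container/logs/app.txt", "blockBlob", old_log, now))
```

Note that the prefix `sample-container/log` matches both a blob named `log.txt` and anything under a `logs/` virtual directory in that container.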

Apply the policy using the Azure CLI

az storage account management-policy create --account-name <storage-account> --policy @policy.json --resource-group <resource-group>
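If you would rather generate the policy file from a script, a minimal Python sketch could write `policy.json` and assemble the same CLI arguments shown above. Both helper names are hypothetical, and the sketch only builds the command; running it (for example with `subprocess.run`) assumes the Azure CLI is installed and logged in.

```python
import json
from pathlib import Path


def write_policy(policy: dict, path: Path) -> Path:
    """Serialize the lifecycle policy to a JSON file and return its path."""
    path.write_text(json.dumps(policy, indent=2), encoding="utf-8")
    return path


def management_policy_command(account: str, resource_group: str, policy_path: str) -> list[str]:
    """Build the argument list for the az command shown above."""
    return [
        "az", "storage", "account", "management-policy", "create",
        "--account-name", account,
        "--policy", f"@{policy_path}",  # "@" tells the CLI to read the value from a file
        "--resource-group", resource_group,
    ]


# Placeholders stay as placeholders; substitute your own names before running.
print(" ".join(management_policy_command("<storage-account>", "<resource-group>", "policy.json")))
```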

Rehydrate blob data stored in an archive tier

  • Rehydrating a blob means moving it out of the Archive tier to an online tier (Hot or Cool)
  • two options
    • Copy an archived blob to an online tier
      • Creates a new blob
      • Recommended approach
      • Use Copy Blob or Copy Blob from URL
    • Change a blob’s access tier to an online tier
      • SetBlobTier operation
      • Changing a blob’s tier doesn’t affect its last modified time, so beware that a lifecycle policy based on modification age might move the blob straight back into Archive
  • rehydration priority
    • done using x-ms-rehydrate-priority header on Copy Blob, Copy Blob from URL, or SetBlobTier transaction
    • Standard priority can take up to 15 hours for rehydration
    • High priority can take less than 1 hour for objects under 10 GB in size
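As a sketch of the second option, a Set Blob Tier call is a `PUT` on the blob URI with `comp=tier`, plus the `x-ms-access-tier` and `x-ms-rehydrate-priority` headers. The Python below only assembles the request pieces; the function name is hypothetical, and the authorization and `x-ms-version` headers a real call needs are left out.

```python
def set_blob_tier_request(
    account: str,
    container: str,
    blob: str,
    tier: str = "Hot",
    priority: str = "Standard",
) -> tuple[str, str, dict[str, str]]:
    """Return (method, url, headers) for rehydrating an archived blob in place."""
    if tier not in {"Hot", "Cool"}:
        raise ValueError("rehydration targets an online tier such as Hot or Cool")
    if priority not in {"Standard", "High"}:
        raise ValueError("priority must be Standard or High")
    url = f"https://{account}.blob.core.windows.net/{container}/{blob}?comp=tier"
    headers = {
        "x-ms-access-tier": tier,
        "x-ms-rehydrate-priority": priority,
    }
    return "PUT", url, headers


method, url, headers = set_blob_tier_request("myaccount", "mycontainer", "myblob", priority="High")
print(method, url, headers)
```

High priority costs more than Standard, so it is usually kept for cases where the data is needed urgently.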

Create an application to create and manipulate data by using the Azure Storage client library for Blob storage

Manage container properties and metadata by using Node.js and REST

This post is licensed under CC BY 4.0 by the author.