Azure Cosmos DB Overview

You will need to be familiar with Azure Cosmos DB if you are taking the AZ-204 Azure Developer Associate certification exam.

These are some of the exam preparation notes I have taken for Azure Cosmos DB.

You can check out my exam preparation notes for the other Azure services covered on the AZ-204 exam here.

Identify the key benefits provided by Azure Cosmos DB

  • Fully managed NoSQL database
  • Low latency
  • Scalable throughput
  • High availability
  • Supports multiple regions and replication of data between them
    • Unlimited elastic write and read scalability
    • 99.999% read and write availability for multi-region databases
    • Guaranteed reads and writes served in less than 10 milliseconds at the 99th percentile

Describe the elements in an Azure Cosmos DB account and how they are organized

  • Database account
    • Holds databases
    • Provides a unique DNS name used to access the account
  • Database
    • Holds containers
  • Container: a collection, table, or graph
    • Contains items
    • Stored procedures
    • User defined functions
    • Triggers
    • Conflicts
    • Merge procedures
    • Partitions:
      • Azure Cosmos DB databases scale out (rather than up)
      • Data is stored in partitions
      • Containers are partitioned using a partition key
      • These partitions might be running on different servers
      • The number of partitions increases when you provision more throughput, or automatically as storage grows
      • Physical partitions:
        • underlying storage mechanism for data in Azure Cosmos DB
        • max throughput amount of 10,000 Request Units per second
        • max storage of 50 GB of data
      • Logical partitions:
        • max storage of 20 GB of data
    • Two throughput options:
      • Dedicated throughput: set at the container level. Throughput is reserved exclusively for that container. Has standard and autoscale options.
      • Shared throughput: set at the database level. Throughput is shared among up to 25 containers, excluding any containers configured with their own dedicated throughput.
  • Item: a document, row, node, or edge within a container
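
The partition limits above lend themselves to a quick back-of-the-envelope check. The sketch below is only a rough lower-bound estimate, not a description of how the service actually allocates partitions. The helper names are hypothetical, and the limits are the ones quoted in these notes.

```javascript
// Limits quoted in the notes above.
const MAX_RU_PER_PHYSICAL_PARTITION = 10000; // Request Units per second
const MAX_GB_PER_PHYSICAL_PARTITION = 50;
const MAX_GB_PER_LOGICAL_PARTITION = 20;

// Hypothetical helper: the fewest physical partitions that could hold the
// given throughput and storage. The real service may allocate more.
function estimateMinPhysicalPartitions(provisionedRuPerSec, storageGb) {
  const byThroughput = Math.ceil(provisionedRuPerSec / MAX_RU_PER_PHYSICAL_PARTITION);
  const byStorage = Math.ceil(storageGb / MAX_GB_PER_PHYSICAL_PARTITION);
  return Math.max(1, byThroughput, byStorage);
}

// Hypothetical helper: true if all data for one partition key value still
// fits within a single logical partition.
function fitsInLogicalPartition(dataForKeyGb) {
  return dataForKeyGb <= MAX_GB_PER_LOGICAL_PARTITION;
}

console.log(estimateMinPhysicalPartitions(30000, 400)); // 8 (storage-bound)
console.log(estimateMinPhysicalPartitions(30000, 100)); // 3 (throughput-bound)
console.log(fitsInLogicalPartition(25)); // false
```

A partition key with few distinct values concentrates data in a handful of logical partitions, which is how the 20 GB limit tends to be reached in practice.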

Explain the different consistency levels and choose the correct one for your project

  • Database consistency is about how data is synchronized across regions in a multi-region database
  • 5 options (from strongest to weakest levels of consistency)
    1. Strong: users are guaranteed to read the latest committed write
    2. Bounded staleness: guarantees that the lag of data between any two regions is always less than a specified amount, expressed as either:
      • The number of versions (K) of the item
      • The time interval (T) reads might lag behind the writes
    3. Session: within a single client session, reads are guaranteed to honor the read-your-writes and write-follows-reads guarantees
    4. Consistent prefix: no ordering guarantee for reads, but multiple writes within the same transaction are guaranteed to be read together if they have been synced
    5. Eventual: no ordering guarantee for reads
  • Availability and performance tradeoffs for each option
    • The weaker the consistency:
      • The higher the availability
      • The lower the latency
      • The higher the throughput
    • The stronger the consistency:
      • The lower the availability
      • The higher the latency
      • The lower the throughput
  • Read consistency applies to a single read operation scoped within a partition-key range or a logical partition
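
The ordering and trade-offs above can be captured in a few lines. This is a minimal sketch with hypothetical helper names. It only encodes the bullets in these notes and does not call the Cosmos DB SDK. The level names are written in the style the SDK uses, which is an assumption on my part.

```javascript
// Consistency levels ordered from strongest to weakest, as listed above.
const CONSISTENCY_LEVELS = [
  "Strong",
  "BoundedStaleness",
  "Session",
  "ConsistentPrefix",
  "Eventual",
];

// Hypothetical helper: negative if `a` is stronger than `b`,
// positive if `a` is weaker, 0 if they are the same level.
function compareConsistency(a, b) {
  const ia = CONSISTENCY_LEVELS.indexOf(a);
  const ib = CONSISTENCY_LEVELS.indexOf(b);
  if (ia === -1 || ib === -1) {
    throw new Error(`Unknown consistency level: ${ia === -1 ? a : b}`);
  }
  return ia - ib;
}

// Hypothetical helper: summarizes what changes when moving from one level
// to another, using the trade-off bullets above.
function describeTradeoff(from, to) {
  const diff = compareConsistency(to, from);
  if (diff === 0) return "same consistency, same trade-offs";
  return diff > 0
    ? "weaker consistency: higher availability, lower latency, higher throughput"
    : "stronger consistency: lower availability, higher latency, lower throughput";
}

console.log(describeTradeoff("Strong", "Session"));
// weaker consistency: higher availability, lower latency, higher throughput
```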

Explore the APIs supported in Azure Cosmos DB and choose the appropriate API for your solution

  • Database APIs supported:
    • NoSQL (native to Azure Cosmos DB)
    • MongoDB (implements the wire protocol of the open-source database engine)
    • PostgreSQL (implements the wire protocol of the open-source database engine)
    • Cassandra (implements the wire protocol of the open-source database engine)
    • Gremlin (implements the wire protocol of the open-source database engine)
    • Table (implements the wire protocol of the open-source database engine)

Describe how request units impact costs

  • You pay for throughput provisioned and storage consumed on an hourly basis
  • Cost of all database operations is normalized in Azure Cosmos DB and expressed by request units
  • A request unit represents the system resources, such as CPU, IOPS, and memory, that are required to perform a database operation; the charge is calculated for each operation
  • The cost of fetching a single 1-KB item by its ID and partition key value is 1 RU
  • RUs are charged differently based on your database account mode
    • Provisioned throughput mode:
      • You provision the number of RUs for your application on a per-second basis, in increments of 100 RUs per second
      • To scale, you can increase or decrease the number of RUs at any time in increments or decrements of 100 RUs
      • Throughput can be provisioned at either container or database granularity
    • Serverless mode:
      • No throughput provisioned upfront
      • At the end of your billing period, you get billed for the number of request units that have been consumed by your database operations
    • Autoscale mode:
      • Automatically and instantly scales the throughput (RU/s) of your database or container based on its usage
      • Suited for mission-critical workloads that have variable or unpredictable traffic patterns, and require SLAs on high performance and scale
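
To make the RU arithmetic above concrete, here is a small sizing sketch. The function names and the workload figures are hypothetical. The only RU cost taken from these notes is the 1 RU point read; the write cost is a placeholder you would need to measure for your own items. Service minimums and prices are deliberately not modeled.

```javascript
const RU_INCREMENT = 100; // provisioned throughput changes in steps of 100 RU/s
const RU_PER_1KB_POINT_READ = 1; // from the notes above

// Round a required RU/s figure up to the next multiple of 100.
function roundUpToProvisionable(ruPerSec) {
  return Math.ceil(ruPerSec / RU_INCREMENT) * RU_INCREMENT;
}

// Hypothetical workload estimate. `ruPerWrite` is an assumption supplied by
// the caller; the notes only give the cost of a 1-KB point read.
function estimateRequiredRuPerSec({ pointReadsPerSec, writesPerSec, ruPerWrite }) {
  return pointReadsPerSec * RU_PER_1KB_POINT_READ + writesPerSec * ruPerWrite;
}

// Serverless mode: you are billed for the RUs actually consumed, so the
// figure of interest is the total over the billing period.
function totalConsumedRu(operations) {
  return operations.reduce((sum, op) => sum + op.count * op.ruEach, 0);
}

const required = estimateRequiredRuPerSec({
  pointReadsPerSec: 350,
  writesPerSec: 40,
  ruPerWrite: 6, // placeholder value, not an official figure
});
console.log(required); // 590
console.log(roundUpToProvisionable(required)); // 600
console.log(
  totalConsumedRu([
    { count: 1000000, ruEach: 1 },
    { count: 50000, ruEach: 6 },
  ])
); // 1300000
```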

Create Azure Cosmos DB resources by using the Azure portal

Identify classes and methods used to create resources

Initialise a CosmosClient

const { CosmosClient } = require("@azure/cosmos");

const endpoint = "https://your-account.documents.azure.com";
const key = "<database account masterkey>";
const client = new CosmosClient({ endpoint, key });

Create a Database

const { database } = await client.databases.createIfNotExists({
  id: "Test Database",
});

Create a Container

const { container } = await database.containers.createIfNotExists({
  id: "Test Container",
});

Read an Item using a Partition Key

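// Arguments are the item id followed by its partition key value.
// read() returns a response object; the stored document is in `resource`.
const { resource: item } = await container.item("id", "1").read();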

Create an Item

const city = { id: "1", name: "Olympia", state: "WA", isCapitol: true };
// create() returns a response object; the created document is in `resource`.
const { resource: item } = await container.items.create(city);

Upsert an Item (Update or Create if it doesn’t exist)

const task = { id: "1", name: "Learn Cosmos DB", isComplete: true };
// upsert() returns a response object; the stored document is in `resource`.
const { resource: item } = await container.items.upsert(task);

Query a Container

const { resources } = await container.items
  .query({
    query: "SELECT * from c WHERE c.isCapitol = @isCapitol",
    parameters: [{ name: "@isCapitol", value: true }],
  })
  .fetchAll();
for (const city of resources) {
  console.log(`${city.name}, ${city.state} is a capitol `);
}

Delete an Item

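// For items stored with a partition key value, pass it as the second
// argument: container.item(id, partitionKeyValue).
await container.item("1").delete();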

Change Feed

  • https://learn.microsoft.com/en-us/azure/azure-functions/functions-add-output-binding-cosmos-db-vs-code?pivots=programming-language-javascript
  • https://learn.microsoft.com/en-us/azure/cosmos-db/change-feed
  • https://learn.microsoft.com/en-us/azure/cosmos-db/nosql/read-change-feed

  • Push model: use an Azure Functions trigger (preferred) or the change feed processor (available only for .NET and Java)

  • Pull model using client library
    • https://learn.microsoft.com/en-us/javascript/api/overview/azure/cosmos-readme?view=azure-node-latest#change-feed-pull-model
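
The idea behind the pull model is that the client asks for changes when it is ready and keeps track of where it left off. The sketch below is a toy in-memory model of that idea, not the Cosmos DB SDK: `FakeChangeFeed`, `pullAll`, and the numeric continuation are all hypothetical stand-ins. The real client library API is described at the link above.

```javascript
// In-memory stand-in for a container's change feed (NOT the Cosmos DB SDK).
class FakeChangeFeed {
  constructor() {
    this.changes = []; // ordered log of changed items
  }

  write(item) {
    this.changes.push(item);
  }

  // Pull model: the caller asks for changes after `continuation` and gets
  // back a new continuation to use on the next call.
  read(continuation = 0, maxItems = 10) {
    const items = this.changes.slice(continuation, continuation + maxItems);
    return { items, continuation: continuation + items.length };
  }
}

// Drain everything currently available, then report where we stopped.
function pullAll(feed, continuation = 0, pageSize = 2) {
  const seen = [];
  while (true) {
    const page = feed.read(continuation, pageSize);
    if (page.items.length === 0) break; // nothing new; a real consumer would wait and retry
    seen.push(...page.items);
    continuation = page.continuation;
  }
  return { seen, continuation };
}

const feed = new FakeChangeFeed();
feed.write({ id: "1", name: "Olympia" });
feed.write({ id: "2", name: "Salem" });
feed.write({ id: "3", name: "Boise" });

const first = pullAll(feed);
console.log(first.seen.length, first.continuation); // 3 3

// A later update shows up only in the next pull, starting from the saved continuation.
feed.write({ id: "1", name: "Olympia", isCapitol: true });
const second = pullAll(feed, first.continuation);
console.log(second.seen.map((c) => c.id)); // [ '1' ]
```

In the push model, by contrast, the platform invokes your code when changes arrive, so you do not manage the continuation yourself.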

Write stored procedures, triggers, and user-defined functions by using JavaScript

This post is licensed under CC BY 4.0 by the author.