Swift Code Chronicles

Amazon DynamoDB: A Comprehensive Guide to Core Concepts

Published on January 17, 2025
Updated on January 30, 2025
66 min read
AWS

Amazon DynamoDB is a serverless NoSQL database that delivers high performance and scalability for applications of all sizes. This article explains the core concepts of DynamoDB in detail, including primary key design, data types, table definitions, CRUD operations, transactions, Global Secondary Indexes (GSI), and best practices for table design.


1. Primary Key

Structure of Primary Keys

In DynamoDB, every table requires a primary key. There are two types of primary keys:

  1. Partition Key (PK): A single attribute key that determines which partition the data is stored in.
  2. Partition Key + Sort Key (SK): A composite key consisting of two attributes. The partition key determines the partition, and the sort key specifies the order within the partition.

In both cases, the primary key must ensure uniqueness.

Example:

{
  "PK": "USER#12345",
  "SK": "USER_INFO"
}

Recommended Primary Key Design

The Partition Key + Sort Key format is recommended. For example, when storing user data, set the PK to USER#<user_id> and the SK to USER_INFO.

  1. The first part of the PK represents a constant indicating the data type (e.g., USER), and the second part represents the user ID.
  2. The SK represents the type or context of the data using constants or dynamic values.

This design ensures uniqueness and enables efficient querying.
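
The key convention above can be captured in a small helper so that the string format lives in one place. The following is a minimal sketch; the names UserKey and buildUserKey are hypothetical and not part of any AWS SDK.

```typescript
// Hypothetical helper for this article; not part of the AWS SDK.
interface UserKey {
  PK: string;
  SK: string;
}

// Builds the composite primary key for a user's profile item.
function buildUserKey(userId: string): UserKey {
  if (userId.length === 0 || userId.indexOf('#') !== -1) {
    // '#' is the separator in this convention, so it must not appear in the ID.
    throw new Error(`Invalid user ID: "${userId}"`);
  }
  return { PK: `USER#${userId}`, SK: 'USER_INFO' };
}

console.log(buildUserKey('12345'));
```

Rejecting the separator character inside the ID keeps two different users from ever producing the same key string.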


2. Data Types

DynamoDB supports the following data types:

  • Scalar types: String (S), Number (N), Binary (B), Boolean (BOOL), Null (NULL)
  • Document types: Map (M), List (L)
  • Set types: String Set (SS), Number Set (NS), Binary Set (BS)

Notes:

  • Flexibility: Document types like Map and List are ideal for storing nested data.
  • Indexing: Only String, Number, and Binary attributes can be used as partition keys or sort keys. Boolean, Null, document, and set types cannot.

Example:

{
  "PK": "USER#12345",
  "SK": "USER_INFO",
  "Name": "Alice",
  "Age": 30,
  "Preferences": {
    "Language": "English",
    "TimeZone": "UTC+9"
  },
  "Tags": ["Developer", "Writer"]
}
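
On the wire, DynamoDB tags each value with its type descriptor (S, N, M, L, and so on); the document client used later in this article performs that conversion automatically. The sketch below illustrates the mapping for an item like the one above. toAttributeValue is a simplified, hypothetical function written for this illustration, not the SDK's marshalling code, and it leaves out binary and set types.

```typescript
type AttributeValue =
  | { S: string }
  | { N: string } // numbers travel as strings
  | { BOOL: boolean }
  | { NULL: true }
  | { L: AttributeValue[] }
  | { M: { [key: string]: AttributeValue } };

// Simplified illustration only; binary and set types are not handled.
function toAttributeValue(value: unknown): AttributeValue {
  if (value === null) return { NULL: true };
  if (typeof value === 'string') return { S: value };
  if (typeof value === 'number') return { N: String(value) };
  if (typeof value === 'boolean') return { BOOL: value };
  if (Array.isArray(value)) return { L: value.map((v) => toAttributeValue(v)) };
  if (typeof value === 'object') {
    const source = value as { [key: string]: unknown };
    const map: { [key: string]: AttributeValue } = {};
    for (const key of Object.keys(source)) {
      map[key] = toAttributeValue(source[key]);
    }
    return { M: map };
  }
  throw new Error(`Unsupported value: ${String(value)}`);
}

console.log(JSON.stringify(toAttributeValue({ Name: 'Alice', Age: 30 })));
```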

3. Table Definition

DynamoDB tables can be defined using CloudFormation or AWS CDK.

CloudFormation Example:

Resources:
  <asset_name>:
    Type: AWS::DynamoDB::Table
    Properties:
      TableName: <table_name>
      AttributeDefinitions:
        - AttributeName: <pk_name>
          AttributeType: <pk_type>
        - AttributeName: <sk_name>
          AttributeType: <sk_type>
      KeySchema:
        - AttributeName: <pk_name>
          KeyType: HASH
        - AttributeName: <sk_name>
          KeyType: RANGE
      BillingMode: PAY_PER_REQUEST

CDK Example:

import * as dynamodb from 'aws-cdk-lib/aws-dynamodb';

const table = new dynamodb.Table(this, '<asset_name>', {
  tableName: '<table_name>',
  partitionKey: { name: '<pk_name>', type: dynamodb.AttributeType.<pk_type> },
  sortKey: { name: '<sk_name>', type: dynamodb.AttributeType.<sk_type> },
  billingMode: dynamodb.BillingMode.PAY_PER_REQUEST,
});

Parameter Explanation:

  • <asset_name>: Resource name in CloudFormation/CDK.
  • <table_name>: DynamoDB table name.
  • <pk_name>: Name of the partition key.
  • <pk_type>: Type of the partition key (S, N, or B in CloudFormation; STRING, NUMBER, or BINARY in CDK).
  • <sk_name>: Name of the sort key.
  • <sk_type>: Type of the sort key (same values as <pk_type>).

4. Database Operations

Initialization

Install the required dependencies:

npm install @aws-sdk/client-dynamodb @aws-sdk/lib-dynamodb

Initialize the DynamoDB client:

// dynamodb.util.ts
import { DynamoDB } from '@aws-sdk/client-dynamodb';
import { DynamoDBDocumentClient } from '@aws-sdk/lib-dynamodb';

const dbClient = new DynamoDB({});

const marshallOptions = {
  convertEmptyValues: false, // Default: false
  removeUndefinedValues: false, // Default: false
  convertClassInstanceToMap: false, // Default: false
};

const unmarshallOptions = {
  wrapNumbers: false, // Default: false
};

const translateConfig = { marshallOptions, unmarshallOptions };

const docClient = DynamoDBDocumentClient.from(dbClient, translateConfig);

export { docClient };

Key Operations

Add Data

Add a new user to the user_table:

import { docClient } from './dynamodb.util';
import { PutCommand } from '@aws-sdk/lib-dynamodb';

const command = new PutCommand({
  TableName: 'user_table',
  Item: {
    pk: 'USER#12345',
    sk: 'USER_INFO',
    name: 'Alice',
    age: 30,
  },
});
const result = await docClient.send(command);
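
By default, a put replaces any existing item that has the same primary key. If the intent is to create a user only when none exists, a condition expression can guard the write. The sketch below only builds the input object for PutCommand; buildCreateUserInput and the NewUser shape are hypothetical names introduced for this example.

```typescript
// Hypothetical input shape for this example.
interface NewUser {
  userId: string;
  name: string;
  age: number;
}

// Builds a PutCommand input that refuses to overwrite an existing user.
function buildCreateUserInput(tableName: string, user: NewUser) {
  return {
    TableName: tableName,
    Item: {
      pk: `USER#${user.userId}`,
      sk: 'USER_INFO',
      name: user.name,
      age: user.age,
    },
    // Reject the write if an item with this primary key already exists.
    ConditionExpression: 'attribute_not_exists(pk)',
  };
}

const createInput = buildCreateUserInput('user_table', {
  userId: '12345',
  name: 'Alice',
  age: 30,
});
console.log(createInput.ConditionExpression);
```

Passing the result to new PutCommand(...) and sending it should fail with a ConditionalCheckFailedException when the item already exists, rather than silently replacing it.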

Delete Data

Delete a user with ID 12345:

import { docClient } from './dynamodb.util';
import { DeleteCommand } from '@aws-sdk/lib-dynamodb';

const command = new DeleteCommand({
  TableName: 'user_table',
  Key: {
    pk: 'USER#12345',
    sk: 'USER_INFO',
  },
});
const result = await docClient.send(command);

Update Data

Update the name attribute of a user with ID 12345:

import { docClient } from './dynamodb.util';
import { UpdateCommand } from '@aws-sdk/lib-dynamodb';

const command = new UpdateCommand({
  TableName: 'user_table',
  Key: {
    pk: 'USER#12345',
    sk: 'USER_INFO',
  },
  UpdateExpression: 'set #name = :name',
  ExpressionAttributeNames: {
    '#name': 'name',
  },
  ExpressionAttributeValues: {
    ':name': 'Bob',
  },
});
const result = await docClient.send(command);
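
When several attributes change at once, writing the expression strings by hand becomes error-prone. One option is to generate UpdateExpression, ExpressionAttributeNames, and ExpressionAttributeValues from a plain object. buildSetExpression below is a hypothetical helper sketched for this article, not an SDK function, and it handles only set clauses.

```typescript
interface SetExpression {
  UpdateExpression: string;
  ExpressionAttributeNames: { [placeholder: string]: string };
  ExpressionAttributeValues: { [placeholder: string]: unknown };
}

// Hypothetical helper: turns { name: 'Bob' } into a "set" update expression.
function buildSetExpression(changes: { [attribute: string]: unknown }): SetExpression {
  const attributes = Object.keys(changes);
  if (attributes.length === 0) {
    throw new Error('At least one attribute is required');
  }
  const names: { [placeholder: string]: string } = {};
  const values: { [placeholder: string]: unknown } = {};
  const clauses = attributes.map((attribute, index) => {
    // Index-based placeholders sidestep reserved words such as "name".
    names[`#a${index}`] = attribute;
    values[`:v${index}`] = changes[attribute];
    return `#a${index} = :v${index}`;
  });
  return {
    UpdateExpression: `set ${clauses.join(', ')}`,
    ExpressionAttributeNames: names,
    ExpressionAttributeValues: values,
  };
}

console.log(buildSetExpression({ name: 'Bob', age: 31 }).UpdateExpression);
// set #a0 = :v0, #a1 = :v1
```

The returned object can be spread into an UpdateCommand input alongside TableName and Key.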

Retrieve Single Item

Retrieve a user with ID 12345:

import { docClient } from './dynamodb.util';
import { GetCommand } from '@aws-sdk/lib-dynamodb';

const command = new GetCommand({
  TableName: 'user_table',
  Key: {
    pk: 'USER#12345',
    sk: 'USER_INFO',
  },
});
const result = await docClient.send(command);

Retrieve Multiple Items

Retrieve users whose age is between 20 and 30:

import { docClient } from './dynamodb.util';
import { ScanCommand } from '@aws-sdk/lib-dynamodb';

const command = new ScanCommand({
  TableName: 'user_table',
  FilterExpression: '#sk = :sk and #age between :start and :end',
  ExpressionAttributeNames: {
    '#sk': 'sk',
    '#age': 'age',
  },
  ExpressionAttributeValues: {
    ':sk': 'USER_INFO', // must match the sk value stored on user items
    ':start': 20,
    ':end': 30,
  },
});
const result = await docClient.send(command);

Note: A scan reads every item in the table and applies the filter afterward, so it is slower and more expensive than an indexed query. Use a GSI (Global Secondary Index) whenever possible.
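
A single Scan (or Query) call returns at most 1 MB of data, so a complete result may span several pages. The response's LastEvaluatedKey is passed back as ExclusiveStartKey until it is absent. The sketch below shows that loop against a minimal interface; ScanPage, fetchPage, and scanAll are hypothetical names, and in real code fetchPage would wrap docClient.send(new ScanCommand(...)) with the given ExclusiveStartKey.

```typescript
type Key = { [attribute: string]: unknown };

// Minimal stand-in for the parts of a Scan response used here.
interface ScanPage<T> {
  Items?: T[];
  LastEvaluatedKey?: Key;
}

// Collects every page by following LastEvaluatedKey until it is absent.
async function scanAll<T>(
  fetchPage: (exclusiveStartKey?: Key) => Promise<ScanPage<T>>,
): Promise<T[]> {
  const collected: T[] = [];
  let startKey: Key | undefined = undefined;
  do {
    const page: ScanPage<T> = await fetchPage(startKey);
    for (const item of page.Items || []) {
      collected.push(item);
    }
    startKey = page.LastEvaluatedKey;
  } while (startKey !== undefined);
  return collected;
}
```

Accepting a function rather than a client keeps the loop independent of the SDK, which also makes it straightforward to exercise with canned pages.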

Update Multiple Items with Transactions

DynamoDB supports ACID transactions, allowing multiple operations to succeed or fail together.

The following example updates the name attributes of two users at the same time:

import { docClient } from './dynamodb.util';
import { TransactWriteCommand } from '@aws-sdk/lib-dynamodb';

const command = new TransactWriteCommand({
  TransactItems: [
    {
      Update: {
        TableName: 'user_table',
        Key: {
          pk: 'USER#12345',
          sk: 'USER_INFO',
        },
        UpdateExpression: 'set #name = :name',
        ExpressionAttributeNames: {
          '#name': 'name',
        },
        ExpressionAttributeValues: {
          ':name': 'Bob',
        },
      },
    },
    {
      Update: {
        TableName: 'user_table',
        Key: {
          pk: 'USER#56789',
          sk: 'USER_INFO',
        },
        UpdateExpression: 'set #name = :name',
        ExpressionAttributeNames: {
          '#name': 'name',
        },
        ExpressionAttributeValues: {
          ':name': 'Lisa',
        },
      },
    },
  ],
});
const result = await docClient.send(command);
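
A transaction accepts a bounded number of actions and rejects requests that target the same item more than once, so checking both before sending can give a clearer error. buildRenameTransaction below is a hypothetical sketch that produces a TransactItems array in the shape used above. The limit of 100 is an assumption based on the quota documented at the time of writing and should be verified against current AWS documentation.

```typescript
interface Rename {
  userId: string;
  newName: string;
}

// Assumed quota; verify against current AWS documentation before relying on it.
const MAX_TRANSACT_ITEMS = 100;

// Hypothetical helper: builds one Update action per user, all-or-nothing.
function buildRenameTransaction(tableName: string, renames: Rename[]) {
  if (renames.length === 0 || renames.length > MAX_TRANSACT_ITEMS) {
    throw new Error(`Expected 1 to ${MAX_TRANSACT_ITEMS} renames, got ${renames.length}`);
  }
  const seen: { [userId: string]: boolean } = {};
  const transactItems = renames.map((rename) => {
    if (seen[rename.userId]) {
      // A single transaction cannot operate on the same item twice.
      throw new Error(`Duplicate user in transaction: ${rename.userId}`);
    }
    seen[rename.userId] = true;
    return {
      Update: {
        TableName: tableName,
        Key: { pk: `USER#${rename.userId}`, sk: 'USER_INFO' },
        UpdateExpression: 'set #name = :name',
        ExpressionAttributeNames: { '#name': 'name' },
        ExpressionAttributeValues: { ':name': rename.newName },
      },
    };
  });
  return { TransactItems: transactItems };
}
```

The result can be passed directly to new TransactWriteCommand(...).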

5. Global Secondary Index (GSI)

Overview of GSI

A global secondary index (GSI) lets you query a table by a key other than the table's own partition key and sort key, so access patterns that would otherwise require a scan can be served by a query.

For example, to retrieve users aged between 20 and 30 from user_table, you can query a GSI instead of performing a full table scan.
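
Conceptually, a GSI behaves like a second copy of the table's items organized under a different key. The in-memory sketch below models user items re-keyed by sk and ordered by age. It illustrates the idea only and does not describe how DynamoDB implements indexes; UserItem and queryByAgeRange are hypothetical names.

```typescript
interface UserItem {
  pk: string;
  sk: string;
  name: string;
  age?: number; // items without the index sort key are left out of the index
}

// In-memory model of querying an index keyed by (sk, age).
function queryByAgeRange(
  items: UserItem[],
  indexPartition: string,
  minAge: number,
  maxAge: number,
): UserItem[] {
  return items
    // Only items that carry both index key attributes appear in the index.
    .filter((item) => item.sk === indexPartition && item.age !== undefined)
    // "between" is inclusive on both ends.
    .filter((item) => (item.age as number) >= minAge && (item.age as number) <= maxAge)
    // Results are ordered by the index sort key, ascending by default.
    .sort((a, b) => (a.age as number) - (b.age as number));
}

const users: UserItem[] = [
  { pk: 'USER#1', sk: 'USER_INFO', name: 'Alice', age: 30 },
  { pk: 'USER#2', sk: 'USER_INFO', name: 'Bob', age: 19 },
  { pk: 'USER#3', sk: 'USER_INFO', name: 'Carol', age: 24 },
  { pk: 'USER#4', sk: 'USER_INFO', name: 'Dan' },
];
console.log(queryByAgeRange(users, 'USER_INFO', 20, 30).map((u) => u.name));
// Carol, then Alice
```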

Defining a GSI

Example using CloudFormation:

Resources:
  WorkTable:
    Type: AWS::DynamoDB::Table
    Properties:
      TableName: user_table
      AttributeDefinitions:
        - AttributeName: pk
          AttributeType: S
        - AttributeName: sk
          AttributeType: S
        - AttributeName: age # add age attribute
          AttributeType: N # type: number
      KeySchema:
        - AttributeName: pk
          KeyType: HASH
        - AttributeName: sk
          KeyType: RANGE
      BillingMode: PAY_PER_REQUEST
      GlobalSecondaryIndexes:
        - IndexName: UserAgeIndex # GSI name
          KeySchema:
            - AttributeName: sk # GSI pk
              KeyType: HASH
            - AttributeName: age # GSI sk
              KeyType: RANGE
          Projection:
            ProjectionType: ALL
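
For readers using CDK rather than CloudFormation, roughly the same table and index can be declared as follows. This is a sketch under the assumption that aws-cdk-lib v2 is in use; the stack class name UserTableStack is hypothetical, while the table and index names mirror the template above.

```typescript
import { Stack, StackProps } from 'aws-cdk-lib';
import * as dynamodb from 'aws-cdk-lib/aws-dynamodb';
import { Construct } from 'constructs';

// Hypothetical stack name; table and index names follow the template above.
export class UserTableStack extends Stack {
  constructor(scope: Construct, id: string, props?: StackProps) {
    super(scope, id, props);

    const table = new dynamodb.Table(this, 'WorkTable', {
      tableName: 'user_table',
      partitionKey: { name: 'pk', type: dynamodb.AttributeType.STRING },
      sortKey: { name: 'sk', type: dynamodb.AttributeType.STRING },
      billingMode: dynamodb.BillingMode.PAY_PER_REQUEST,
    });

    // The GSI re-keys items by sk (partition) and age (sort).
    table.addGlobalSecondaryIndex({
      indexName: 'UserAgeIndex',
      partitionKey: { name: 'sk', type: dynamodb.AttributeType.STRING },
      sortKey: { name: 'age', type: dynamodb.AttributeType.NUMBER },
      projectionType: dynamodb.ProjectionType.ALL,
    });
  }
}
```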

Query Example Using GSI

Here’s an example of running a query using the above GSI (UserAgeIndex):

import { docClient } from './dynamodb.util';
import { QueryCommand } from '@aws-sdk/lib-dynamodb';

const command = new QueryCommand({
  TableName: 'user_table',
  IndexName: 'UserAgeIndex',
  KeyConditionExpression: '#sk = :sk and #age between :start and :end',
  ExpressionAttributeNames: {
    '#sk': 'sk',
    '#age': 'age',
  },
  ExpressionAttributeValues: {
    ':sk': 'USER_INFO', // must match the sk value stored on user items
    ':start': 20,
    ':end': 30,
  },
  ScanIndexForward: false, // false: descending order, true: ascending order (default is true)
});
const result = await docClient.send(command);

6. Best Practices for Table Design

DynamoDB table design differs from relational design: instead of normalizing entities first, you model the table around the access patterns it must serve. Below is an example:

Data Structure:

  • Departments: Department ID, Department Name
  • Employees: Employee ID, Employee Name, Email
  • Attendance Records: Date, Check-in Time, Check-out Time, Break Time

Search Requirements:

  • Retrieve a list of departments.
  • Retrieve employee information by Employee ID.
  • Retrieve employees belonging to a specific department using Department ID.
  • Retrieve monthly attendance records of an employee using Employee ID and month/year.

In relational databases, the design would look like this:

[Figure: relational database (RDS) table design]

For DynamoDB, the design would look like this:

[Figure: DynamoDB table design]
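
The key layout implied by the queries in this section can be written down as a set of key builders. The pk and sk values for employees follow those queries. The department sort key (DEPT#<department_id>) and the day-level attendance format (WORK#<yyyymmdd>) are assumptions made for illustration, as are all function names.

```typescript
interface ItemKey {
  pk: string;
  sk: string;
}

// All departments share one partition so they can be listed with a single query.
function departmentKey(departmentId: string): ItemKey {
  return { pk: 'DEPARTMENT', sk: `DEPT#${departmentId}` }; // sk format is an assumption
}

// One profile item per employee.
function employeeKey(employeeId: string): ItemKey {
  return { pk: `Employee#${employeeId}`, sk: 'INFO' };
}

// Attendance records live in the employee's partition, sorted by date.
function attendanceKey(employeeId: string, yyyymmdd: string): ItemKey {
  if (!/^\d{8}$/.test(yyyymmdd)) {
    throw new Error(`Expected a date in yyyymmdd form, got "${yyyymmdd}"`);
  }
  return { pk: `Employee#${employeeId}`, sk: `WORK#${yyyymmdd}` }; // day suffix is an assumption
}

console.log(attendanceKey('E001', '20250117'));
```

Placing attendance records under the employee's partition is what allows a month of records to be fetched with one key condition on the sort key prefix.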

Retrieve a List of Departments

To retrieve all department entries from the EmployeeTable, use a query where the partition key pk is set to DEPARTMENT:

import { docClient } from './dynamodb.util';
import { QueryCommand } from '@aws-sdk/lib-dynamodb';

const command = new QueryCommand({
  TableName: 'EmployeeTable',
  KeyConditionExpression: 'pk = :pk',
  ExpressionAttributeValues: {
    ':pk': 'DEPARTMENT',
  },
});
const result = await docClient.send(command);

Retrieve Employee Information by ID

To retrieve specific employee information, set pk to Employee#<employee_id> and sk to INFO:

import { docClient } from './dynamodb.util';
import { GetCommand } from '@aws-sdk/lib-dynamodb';

const command = new GetCommand({
  TableName: 'EmployeeTable',
  Key: {
    pk: 'Employee#<employee_id>',
    sk: 'INFO',
  },
});
const result = await docClient.send(command);

Retrieve Employees by Department ID

Using a GSI (DepartmentIndex), retrieve a list of employees belonging to a specific department by querying the departmentId:

import { docClient } from './dynamodb.util';
import { QueryCommand } from '@aws-sdk/lib-dynamodb';

const command = new QueryCommand({
  TableName: 'EmployeeTable',
  IndexName: 'DepartmentIndex',
  KeyConditionExpression: 'departmentId = :departmentId',
  ExpressionAttributeValues: {
    ':departmentId': '<department_id>',
  },
});
const result = await docClient.send(command);

Retrieve Monthly Attendance Records of an Employee

Retrieve attendance records by setting pk to Employee#<employee_id> and matching the sk against the prefix WORK#<yyyymm>:

import { docClient } from './dynamodb.util';
import { QueryCommand } from '@aws-sdk/lib-dynamodb';

const command = new QueryCommand({
  TableName: 'EmployeeTable',
  KeyConditionExpression: 'pk = :pk and begins_with(sk, :sk)',
  ExpressionAttributeValues: {
    ':pk': 'Employee#<employee_id>',
    ':sk': 'WORK#202501', // Attendance records for January 2025
  },
});
const result = await docClient.send(command);
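
The month prefix can be derived from a year and month rather than typed by hand. attendancePrefix is a hypothetical helper; it assumes the WORK#<yyyymm> format shown in the query above.

```typescript
// Hypothetical helper: builds the sort key prefix for one month of attendance.
function attendancePrefix(year: number, month: number): string {
  const isWhole = (n: number) => Math.floor(n) === n;
  if (!isWhole(year) || year < 1000 || year > 9999) {
    throw new Error(`Invalid year: ${year}`);
  }
  if (!isWhole(month) || month < 1 || month > 12) {
    throw new Error(`Invalid month: ${month}`);
  }
  // Zero-pad so that January ("01") is not a prefix of October to December.
  const paddedMonth = month < 10 ? `0${month}` : String(month);
  return `WORK#${year}${paddedMonth}`;
}

console.log(attendancePrefix(2025, 1)); // WORK#202501
```

The zero-padding matters because begins_with compares strings: an unpadded prefix such as WORK#20251 would also match sort keys for months 10 through 12.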

Best Practices

  • Define Access Patterns Clearly: DynamoDB design revolves around “how data will be accessed.”
  • Single-Table Design: Store different types of data in one table using partition and sort keys.
  • Use Indexes: Leverage Global Secondary Indexes (GSI) and Local Secondary Indexes (LSI) for flexible queries.
  • Minimize Scans: Use keys or indexes to query data efficiently.
  • Utilize Transactions: Ensure data consistency with DynamoDB’s ACID transactions.

With these principles, you can maximize DynamoDB’s potential for scalable and efficient data operations.
