SQLExplain
How to Design a Database Schema for an Online Learning Platform in PostgreSQL
How to Design a Database Schema for an Online Learning Platform
Introduction
Platforms like Udemy, Coursera, and Teachable are deceptively complex from a data modelling perspective. On the surface you have courses and students — but behind that are structured lesson hierarchies with sections and sort orders, per-lesson progress tracking, quiz engines with scoring and attempt limits, assignment submissions with instructor feedback, and certificate generation once a student reaches 100% completion. Get the schema wrong and you end up either over-normalising (a separate table per role, per content type) or under-normalising (progress buried inside an enrollment JSON blob).
In this guide we'll build a complete PostgreSQL schema for an online learning platform from scratch. Along the way we'll cover:
- Why a single
userstable with arolecolumn beats separatestudentsandinstructorstables - How to model the course → section → lesson hierarchy without adding a
sectionstable - The difference between
lesson_progress(per-lesson) andenrollments.progress_pct(aggregate) and why you need both - How
DISTINCT ONgives you the most recent quiz attempt per student in one clean query - Five production queries covering progress dashboards, quiz results, instructor revenue, and certificate eligibility
Step 1: Identify Your Entities
An online learning platform needs to manage content (what's being taught), access (who's enrolled), activity (what progress has been made), and outcomes (grades, certificates).
| Table | Description |
|---|---|
users | Students, instructors, and admins — single table with a role column |
courses | Course listings with category, difficulty, price, and publish status |
lessons | Individual lessons, ordered within named sections inside a course |
enrollments | Student–course relationship tracking status and aggregate progress |
lesson_progress | Per-lesson completion record for each enrollment |
quizzes | Quiz definitions attached to a course, with passing score and attempt limit |
quiz_questions | Individual questions with type and correct answer |
quiz_attempts | Student quiz submissions recording score and pass/fail |
assignments | Written or project-based tasks with a due date and max score |
assignment_submissions | Student submissions with grade and instructor feedback |
reviews | Star ratings and comments — one per student per course |
certificates | Issued on course completion, with a URL to the generated credential |
Step 2: Define the Relationships
- An instructor (user with role
INSTRUCTOR) creates many courses; each course has many lessons and may have many quizzes and assignments - A student (role
STUDENT) enrolls in many courses; each enrollment tracks overallprogress_pct - Each enrollment has many lesson_progress rows — one per lesson — recording whether that lesson is complete
- A quiz belongs to a course and has many quiz_questions; a student can make multiple quiz_attempts up to
max_attempts - An assignment belongs to a course; a student has at most one assignment_submission per assignment (enforced by a
UNIQUEconstraint) - A review is written by a student for a course — one per student per course (
UNIQUEconstraint) - A certificate is issued per enrollment once the student reaches full completion;
UNIQUEonenrollment_idensures it's issued only once
DBA Tip: A single
userstable with arolecolumn is almost always the right choice over separatestudentsandinstructorstables. Users can change roles (a student becomes an instructor), and you avoid duplicated contact/auth columns. If role-specific data grows significantly, extend with ainstructor_profilesorstudent_profilessatellite table rather than splitting the core identity table.
Step 3: Build the Schema
Key design choices throughout:
UUIDprimary keys viagen_random_uuid()— safe for distributed deployments and public-facing API IDsVARCHAR + CHECKinstead of ENUM forrole,content_type,status— easy to add new values without a schema migrationlesson_progressis insert-only — a row is created when a student completes a lesson;is_completeddefaults toTRUE. The enrollment'sprogress_pctis the derived aggregatequiz_questions.correct_answer TEXT— stores the option key for MCQ ('A','B') or the expected text for short-answer; complex scoring lives in the application layerUNIQUE(assignment_id, user_id)on submissions andUNIQUE(user_id, course_id)on reviews — both enforced at DB level, not just application level
-- Enable UUID generation
CREATE EXTENSION IF NOT EXISTS pgcrypto;
-- ============================================================
-- USERS (students, instructors, admins in one table)
-- ============================================================
CREATE TABLE users (
user_id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
full_name VARCHAR(150) NOT NULL,
email VARCHAR(255) UNIQUE NOT NULL,
password_hash TEXT NOT NULL,
role VARCHAR(20) NOT NULL DEFAULT 'STUDENT'
CHECK (role IN ('STUDENT', 'INSTRUCTOR', 'ADMIN')),
bio TEXT,
avatar_url TEXT,
created_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
);
-- ============================================================
-- COURSES
-- ============================================================
CREATE TABLE courses (
course_id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
instructor_id UUID NOT NULL REFERENCES users(user_id),
title VARCHAR(200) NOT NULL,
description TEXT,
category VARCHAR(100), -- e.g. 'Data Science', 'Web Development'
difficulty VARCHAR(20) NOT NULL DEFAULT 'Beginner'
CHECK (difficulty IN ('Beginner', 'Intermediate', 'Advanced')),
price NUMERIC(10, 2) NOT NULL DEFAULT 0,
thumbnail_url TEXT,
is_published BOOLEAN NOT NULL DEFAULT FALSE,
created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
updated_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
);
-- ============================================================
-- LESSONS (ordered within named sections)
-- ============================================================
CREATE TABLE lessons (
lesson_id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
course_id UUID NOT NULL REFERENCES courses(course_id) ON DELETE CASCADE,
section_name VARCHAR(150), -- logical grouping, e.g. 'Module 1: Foundations'
title VARCHAR(200) NOT NULL,
content_type VARCHAR(20) NOT NULL DEFAULT 'Video'
CHECK (content_type IN ('Video', 'Article', 'Audio', 'Interactive')),
content_url TEXT, -- video URL, article body, etc.
duration_min INT, -- NULL for articles/interactive lessons
sort_order INT NOT NULL DEFAULT 0, -- global order within the course
is_preview BOOLEAN NOT NULL DEFAULT FALSE, -- free preview without enrollment
created_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
);
-- ============================================================
-- ENROLLMENTS (student ↔ course, with aggregate progress)
-- ============================================================
CREATE TABLE enrollments (
enrollment_id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
user_id UUID NOT NULL REFERENCES users(user_id) ON DELETE CASCADE,
course_id UUID NOT NULL REFERENCES courses(course_id),
status VARCHAR(20) NOT NULL DEFAULT 'ACTIVE'
CHECK (status IN ('ACTIVE', 'COMPLETED', 'DROPPED', 'EXPIRED')),
progress_pct NUMERIC(5, 2) NOT NULL DEFAULT 0
CHECK (progress_pct BETWEEN 0 AND 100),
enrolled_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
completed_at TIMESTAMPTZ,
UNIQUE (user_id, course_id) -- one enrollment per student per course
);
-- ============================================================
-- LESSON PROGRESS (granular per-lesson completion)
-- ============================================================
CREATE TABLE lesson_progress (
progress_id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
enrollment_id UUID NOT NULL REFERENCES enrollments(enrollment_id) ON DELETE CASCADE,
lesson_id UUID NOT NULL REFERENCES lessons(lesson_id) ON DELETE CASCADE,
is_completed BOOLEAN NOT NULL DEFAULT TRUE,
completed_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
UNIQUE (enrollment_id, lesson_id) -- one completion record per lesson per enrollment
);
-- ============================================================
-- QUIZZES
-- ============================================================
CREATE TABLE quizzes (
quiz_id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
course_id UUID NOT NULL REFERENCES courses(course_id) ON DELETE CASCADE,
title VARCHAR(200) NOT NULL,
description TEXT,
passing_score_pct NUMERIC(5, 2) NOT NULL DEFAULT 70
CHECK (passing_score_pct BETWEEN 0 AND 100),
max_attempts INT NOT NULL DEFAULT 3, -- -1 = unlimited
time_limit_min INT, -- NULL = no time limit
created_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
);
-- ============================================================
-- QUIZ QUESTIONS
-- ============================================================
CREATE TABLE quiz_questions (
question_id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
quiz_id UUID NOT NULL REFERENCES quizzes(quiz_id) ON DELETE CASCADE,
question_text TEXT NOT NULL,
question_type VARCHAR(20) NOT NULL DEFAULT 'MCQ'
CHECK (question_type IN ('MCQ', 'True/False', 'Short Answer')),
option_a TEXT, -- MCQ options; NULL for short-answer questions
option_b TEXT,
option_c TEXT,
option_d TEXT,
correct_answer TEXT NOT NULL, -- e.g. 'A', 'True', or expected short text
points INT NOT NULL DEFAULT 1,
sort_order INT NOT NULL DEFAULT 0
);
-- ============================================================
-- QUIZ ATTEMPTS
-- ============================================================
CREATE TABLE quiz_attempts (
attempt_id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
quiz_id UUID NOT NULL REFERENCES quizzes(quiz_id),
user_id UUID NOT NULL REFERENCES users(user_id) ON DELETE CASCADE,
score_pct NUMERIC(5, 2) NOT NULL,
passed BOOLEAN NOT NULL,
started_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
submitted_at TIMESTAMPTZ
);
-- ============================================================
-- ASSIGNMENTS
-- ============================================================
CREATE TABLE assignments (
assignment_id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
course_id UUID NOT NULL REFERENCES courses(course_id) ON DELETE CASCADE,
title VARCHAR(200) NOT NULL,
instructions TEXT,
due_date TIMESTAMPTZ,
max_score INT NOT NULL DEFAULT 100,
created_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
);
-- ============================================================
-- ASSIGNMENT SUBMISSIONS (one per student per assignment)
-- ============================================================
CREATE TABLE assignment_submissions (
submission_id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
assignment_id UUID NOT NULL REFERENCES assignments(assignment_id) ON DELETE CASCADE,
user_id UUID NOT NULL REFERENCES users(user_id) ON DELETE CASCADE,
submission_url TEXT, -- link to file, repo, or written response
grade NUMERIC(6, 2), -- NULL until graded
feedback TEXT, -- instructor comments
submitted_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
graded_at TIMESTAMPTZ,
UNIQUE (assignment_id, user_id) -- one submission per student
);
-- ============================================================
-- REVIEWS (one per student per course)
-- ============================================================
CREATE TABLE reviews (
review_id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
user_id UUID NOT NULL REFERENCES users(user_id) ON DELETE CASCADE,
course_id UUID NOT NULL REFERENCES courses(course_id),
rating SMALLINT NOT NULL CHECK (rating BETWEEN 1 AND 5),
comment TEXT,
reviewed_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
UNIQUE (user_id, course_id) -- one review per student per course
);
-- ============================================================
-- CERTIFICATES (issued once on completion)
-- ============================================================
CREATE TABLE certificates (
certificate_id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
enrollment_id UUID UNIQUE NOT NULL REFERENCES enrollments(enrollment_id),
user_id UUID NOT NULL REFERENCES users(user_id) ON DELETE CASCADE,
course_id UUID NOT NULL REFERENCES courses(course_id),
certificate_url TEXT, -- URL to generated PDF or image credential
issued_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
);
-- ============================================================
-- INDEXES
-- ============================================================
CREATE INDEX idx_courses_instructor ON courses(instructor_id);
CREATE INDEX idx_courses_category ON courses(category, is_published);
CREATE INDEX idx_lessons_course_order ON lessons(course_id, sort_order);
CREATE INDEX idx_enrollments_user ON enrollments(user_id);
CREATE INDEX idx_enrollments_course ON enrollments(course_id);
CREATE INDEX idx_lesson_progress_enroll ON lesson_progress(enrollment_id);
CREATE INDEX idx_quiz_attempts_user_quiz ON quiz_attempts(user_id, quiz_id);
CREATE INDEX idx_reviews_course ON reviews(course_id);
CREATE INDEX idx_certificates_user ON certificates(user_id);
Step 4: ER Diagram
The schema organises cleanly around three concerns — content, access, and outcomes:
Scroll or pinch to zoom · drag to pan · double-click to zoom in
The enrollments table is the load-bearing centre of the student experience — it links a student to a course and anchors both lesson_progress rows (granular) and the certificate (final outcome).
Step 5: Example Queries
Query 1 — Top-Rated Courses by Category
Ranks courses within each category using a window function, so you can show "Best in Data Science" and "Best in Web Development" in a single query.
SELECT
category,
title,
instructor_name,
avg_rating,
total_reviews,
RANK() OVER (
PARTITION BY category
ORDER BY avg_rating DESC, total_reviews DESC
) AS rank_in_category
FROM (
SELECT
c.course_id,
c.category,
c.title,
u.full_name AS instructor_name,
ROUND(AVG(r.rating), 2) AS avg_rating,
COUNT(r.review_id) AS total_reviews
FROM courses c
JOIN users u ON c.instructor_id = u.user_id
LEFT JOIN reviews r ON c.course_id = r.course_id
WHERE c.is_published = TRUE
GROUP BY c.course_id, c.category, c.title, u.full_name
HAVING COUNT(r.review_id) >= 3
) ranked
ORDER BY category, rank_in_category;
Query 2 — Student Course Progress Breakdown
Shows exactly how many lessons a student has completed vs the total, alongside their aggregate progress percentage and time since enrollment.
SELECT
c.title AS course_title,
e.status,
e.progress_pct,
COUNT(l.lesson_id) AS total_lessons,
COUNT(lp.progress_id) AS completed_lessons,
ROUND(
COUNT(lp.progress_id) * 100.0
/ NULLIF(COUNT(l.lesson_id), 0), 1
) AS calculated_pct,
EXTRACT(DAY FROM NOW() - e.enrolled_at)::INT AS days_since_enrollment
FROM enrollments e
JOIN courses c ON e.course_id = c.course_id
JOIN lessons l ON c.course_id = l.course_id
LEFT JOIN lesson_progress lp
ON e.enrollment_id = lp.enrollment_id
AND l.lesson_id = lp.lesson_id
WHERE e.user_id = 'your-student-uuid-here'
GROUP BY c.title, e.status, e.progress_pct, e.enrolled_at
ORDER BY e.enrolled_at DESC;
Analyst Tip: Notice
calculated_pct(computed fromlesson_progress) vsprogress_pct(stored onenrollments). The stored value is kept for fast list-page queries. If they ever diverge,calculated_pctis the source of truth — use it to audit and repairprogress_pctin a maintenance job.
Query 3 — Latest Quiz Attempt Per Student
DISTINCT ON is a PostgreSQL gem — it returns the first row per group after ordering, giving you each student's most recent attempt in a single scan without a self-join or subquery.
SELECT DISTINCT ON (qa.user_id, qa.quiz_id)
u.full_name,
u.email,
qz.title AS quiz_title,
qa.score_pct,
qa.passed,
qa.submitted_at AS last_attempt_at,
COUNT(*) OVER (
PARTITION BY qa.user_id, qa.quiz_id
) AS total_attempts
FROM quiz_attempts qa
JOIN users u ON qa.user_id = u.user_id
JOIN quizzes qz ON qa.quiz_id = qz.quiz_id
WHERE qz.course_id = 'your-course-uuid-here'
ORDER BY qa.user_id, qa.quiz_id, qa.submitted_at DESC;
Query 4 — Instructor Revenue Stats
Joins enrollments to courses to aggregate total earnings per course. FILTER on status breaks down active vs dropped enrollments without needing a subquery.
SELECT
c.title AS course_title,
c.price,
COUNT(e.enrollment_id) AS total_enrollments,
COUNT(e.enrollment_id) FILTER (WHERE e.status = 'ACTIVE') AS active_students,
COUNT(e.enrollment_id) FILTER (WHERE e.status = 'COMPLETED') AS completions,
ROUND(
COUNT(e.enrollment_id) FILTER (WHERE e.status = 'COMPLETED')
* 100.0 / NULLIF(COUNT(e.enrollment_id), 0), 1
) AS completion_rate_pct,
ROUND(COUNT(e.enrollment_id) * c.price, 2) AS gross_revenue
FROM courses c
LEFT JOIN enrollments e ON c.course_id = e.course_id
WHERE c.instructor_id = 'your-instructor-uuid-here'
AND c.is_published = TRUE
GROUP BY c.course_id, c.title, c.price
ORDER BY gross_revenue DESC;
Query 5 — Students Eligible for a Certificate
Finds students who have completed every lesson in a course AND passed at least one quiz attempt, but haven't yet received a certificate. These are the rows your certificate-generation job should process.
SELECT
e.enrollment_id,
u.user_id,
u.full_name,
u.email,
c.title AS course_title,
e.progress_pct
FROM enrollments e
JOIN users u ON e.user_id = u.user_id
JOIN courses c ON e.course_id = c.course_id
WHERE e.status = 'ACTIVE'
AND e.progress_pct = 100
-- All lessons completed
AND NOT EXISTS (
SELECT 1
FROM lessons l
LEFT JOIN lesson_progress lp
ON lp.lesson_id = l.lesson_id
AND lp.enrollment_id = e.enrollment_id
WHERE l.course_id = c.course_id
AND lp.progress_id IS NULL -- lesson has no completion record
)
-- At least one passing quiz attempt
AND EXISTS (
SELECT 1
FROM quiz_attempts qa
JOIN quizzes qz ON qa.quiz_id = qz.quiz_id
WHERE qa.user_id = u.user_id
AND qz.course_id = c.course_id
AND qa.passed = TRUE
)
-- Certificate not yet issued
AND NOT EXISTS (
SELECT 1
FROM certificates cert
WHERE cert.enrollment_id = e.enrollment_id
)
ORDER BY e.enrollment_id;
Extensibility
| Feature | Extension |
|---|---|
| Discussion forums | Add forum_threads and forum_replies tables linked to a course_id or lesson_id |
| Live sessions | Add a live_sessions table with a scheduled time, meeting URL, and a session_attendees junction |
| Coupons & discounts | Add a coupons table (code, discount_type, amount, expiry) and reference it in enrollments |
| Learning paths | Add a learning_paths table grouping ordered courses, with a learning_path_enrollments junction |
| Instructor payouts | Add a payouts table recording transfer amounts and dates; derive payout amounts from enrollment aggregates |
| Course sections as a table | If section metadata grows (section-level quizzes, section thumbnails), promote section_name into a proper course_sections table that lessons FK into |
DBA Tip:
lesson_progresswill be your highest-insert table at scale — one row per lesson per student. Consider partitioning it byenrollment_idrange or archiving completed enrollments to a cold-storage table. For real-time progress bars, a Redis counter perenrollment_idupdated on each lesson completion avoids hitting PostgreSQL on every page load.
Summary
We've built a complete online learning platform schema in PostgreSQL across 12 tables covering:
- A single
userstable with arolecolumn that cleanly handles students, instructors, and admins without duplication - A two-level progress model — granular
lesson_progressrows per lesson, plus a fast aggregateprogress_pctonenrollmentsfor dashboards - A quiz engine with configurable attempt limits, multiple question types, and per-attempt score tracking using
DISTINCT ONto find the latest result efficiently - DB-enforced uniqueness on assignment submissions and reviews, so data integrity doesn't depend solely on application logic
- Certificate eligibility driven entirely from the database — completeness checks via
NOT EXISTS, pass checks viaEXISTS, all in a single query your background job can poll
The schema gives you a working foundation you can extend incrementally — forums, live sessions, coupons, and learning paths all slot in without restructuring the core tables.
Try this schema in a live sandbox
Create a free SQLExplain account and land directly in a sandbox pre-loaded with the Online Learning Schema from this post — no password, no setup.
