SQL - What Is Data, What Is a Database, and How DBMS Makes It All Work

Welcome to this tutorial series on SQL and databases. Before we jump into writing queries or clicking around in tools, we need to get our foundations rock solid. That starts with one simple question that sounds basic but matters a lot in everything we do with databases - what is data?

Prefer watching instead of reading? You can watch the full walkthrough below, or keep scrolling to read the complete article.

What is data?

In simple words, data is just facts about any object in focus. Your name, age, height, weight - all of that is data about you. The number of steps you walked today is data. A picture on your phone is data. A PDF invoice from last month is data. Even a tiny log entry that says 2025-08-29 10:15:07 - user signed in is also data. Data shows up as text, numbers, dates, images, audio, video, and even sensor streams you never see directly.

The key thing to keep in mind is that raw data by itself can be messy and random. A folder stuffed with photos named IMG_3938 and IMG_3939 is still data, but it might not be helpful until you bring some order to it. Once you structure that data - say by tagging photos with a date, location, and people in the image - you move from random bits to something you can search and use in a meaningful way.

You will also hear a few related words that are helpful:

Information - data that has been processed to be useful. A single body weight number is data. A monthly trend showing your weight going down is information.
Metadata - data about data. A picture might have width, height, camera model, GPS location. That is metadata.
Structured data - data that follows a clear schema, like rows and columns in a table.
Semi-structured data - data that has some structure, like JSON or XML, but not a fixed table format.
Unstructured data - data without a predefined model, like free-text documents, raw images, or audio files.

Think of data like ingredients in your kitchen. Flour, eggs, sugar, milk - that is data. A recipe that turns those ingredients into a cake is you applying structure and process. Databases are like your pantry plus a neat labeling system, and SQL is the set of tools and steps you use to get exactly the result you want.

What is a database?

A database is a systematic collection of data. Not a random pile. Not a chaotic list. Systematic means someone took the time to decide how data should be organized, stored, and retrieved. When data is structured in a database, you can search it faster, keep it consistent, and share it safely with many users and apps without everyone stepping on each other.

Databases give you a few superpowers:

Organization - you store related data together and label it with a schema so it is easy to find.
Consistency - you set rules that keep data valid. No age of -50, no email without an at symbol.
Concurrency - many people can access and change data at the same time without chaos.
Durability - your data persists even if the app crashes or the power goes out.
Security - not everyone should see everything. A database can enforce that.

People sometimes ask: why not just use files or spreadsheets? You can for small stuff, and sometimes that is fine. But as soon as you have more than one user, or you need rules, or you want complex search, or you want to avoid duplicate entries blowing up your reports, a database is the smarter choice. A database keeps your data in one source of truth and makes it reliable.

What is a DBMS - Database Management System

A Database Management System, or DBMS, is the software layer that sits between your data and you. It is a collection of programs that lets you access the database, manipulate data, and present that data the way you need it. It also controls who can see what and who can change what. Without a DBMS, a database would just be files on disk. With a DBMS, you get a full toolset to talk to your data safely and efficiently.

What does a DBMS actually do behind the scenes?

Query processing - turns your SQL into a plan it can execute efficiently.
Storage management - decides how data is laid out on disk and in memory.
Transaction management - makes sure your changes are all-or-nothing and consistent.
Concurrency control - lets many users read and write at the same time without conflicts.
Security and access control - manages users, roles, and permissions.
Backup and recovery - protects you when things go wrong.

Let me ground this with real examples, because that is where DBMS power becomes obvious.

Example 1 - The online telephone directory

Imagine an online directory that lets you search by name, city, or phone number. Behind the scenes, a DBMS stores millions of records like FirstName, LastName, City, Zip, Phone. When you type Patel in the search box, the system runs a query that looks for last names that match Patel, sorts results by city if needed, and shows you a clean list. The DBMS keeps indexes on last name and zip code so searches are fast. If two people edit the same contact, the DBMS handles that safely so your changes do not overwrite someone else. If the power goes out mid-edit, the DBMS rolls back partial changes so the data never sits half broken.

Example 2 - Your electricity provider

Your electricity provider manages billing, payments, outage reports, meter readings, and service requests. Every bill is tied to a customer record, an address, and usage data from your meter. The DBMS enforces rules like a bill amount cannot be negative or a payment must be linked to an existing bill. It keeps historical data so they can check last year’s usage compared to this year. When a region reports an outage, the DBMS helps the system find all affected customers fast. Every reading and payment is stored in transactions so nothing gets lost if a server restarts during a busy day.

Example 3 - A social network like Facebook

A social network stores members, their friends, posts, comments, messages, reactions, photos, ads, and more. That is a lot of interconnected data. Think about your feed. It pulls posts from your friends, groups you are in, and pages you follow. It joins that data with your reactions and comments, then filters out things you hid. The DBMS makes it possible to read and write at massive scale while keeping relationships intact. When you hit Like, that is a new row in a table. When you unfriend someone, a row that links two people gets removed. The system uses indexes, caching, and tuned queries so everything feels snappy.

We could list examples all day - inventory systems, booking platforms, health records, online learning portals. If there is data that many people need to read and write with rules and history, there is a DBMS doing the heavy lifting.

DBMS did not show up yesterday - a quick history

Database management is not new. The first generation of DBMS tech showed up in the 1960s. Charles W. Bachman created the Integrated Data Store, often called IDS, and he received the Turing Award for his work. Back then, databases focused on performance and stored pointers that connected records directly. It was fast for specific patterns but hard to change because the structure was tightly wired into the application code.

Soon after, we got the network and hierarchical models. IBM IMS is a famous hierarchical system that still runs mission critical workloads. In 1970, Edgar F. Codd proposed the relational model, which brought the idea of storing data in tables with rows and columns and querying with a high level language. That led to SQL - Structured Query Language - and to the systems we most commonly use today like MySQL, Oracle Database, Microsoft SQL Server, and PostgreSQL.

Over time, database tech evolved a lot. We got better indexing strategies, transactions with ACID guarantees, replication, partitioning, and tooling that makes day to day work easier. The expected features grew too - bigger data sizes, more users, and stricter security. But at the core there is still the same mission: store data in a structured way and make it easy and safe to get it out again.

Types of DBMS - the family tree

The DBMS family evolved over decades. If you imagine a timeline, you would see different models rising and falling based on the problems they solved best. Here is a text version of the classic diagram people show in slides:


1960s: Hierarchical model
        |
        +-- 1970s: Network model
                    |
                    +-- 1970s onward: Relational model
                                   |
                                   +-- 1990s onward: Object-relational features

From hierarchical, to network, to relational, with object-relational features added later as needs grew.

Now let us look at each model in plain language with real world examples.

Hierarchical DBMS - parent child trees

A hierarchical DBMS stores data in a tree. Think of it like folders inside folders on your computer. Each parent can have many children, but a child has exactly one parent. You navigate by following branches from the root to the leaf where a record lives. This makes some queries very fast because the path is clear, but it makes many-to-many relationships difficult or impossible without duplicating data.

A practical example is the Windows Registry. It is still around today and in older versions like Windows XP it was easy to visualize. Configuration settings are stored as keys and subkeys, forming a tree with values attached at various levels. For settings that naturally fit a single path - like Computer - HKEY_LOCAL_MACHINE - Software - Vendor - App - Version - this structure is perfect. The structure is simple, predictable, and fast to navigate with the right key path in hand.

Where this model struggles is when an item logically belongs to more than one parent. Imagine a user belonging to multiple departments. In a tree, the user can live in only one place, so you either duplicate the user record or you adjust the design with cross references that do not feel native to the model. That is one reason relational databases became popular for business data.

Network DBMS - many to many with pointers

The network model supports many-to-many relationships directly. Instead of a strict parent child hierarchy, records are connected through sets and pointers. This gives you flexibility and speed for certain kinds of traversals, but it also makes the structure more complex. Developers have to understand the web of links in order to query the data effectively.

RDM Server is a classic example of a system that implements the network model. An industry example where network databases made sense is a bill of materials for manufacturing. A component can be part of many products, and a product can have many components. The links between parts are first class and the queries often walk those links. In systems like this, pointer-based navigation can be fast and predictable, which is why these models stuck around in some niches.

That said, as business apps grew and teams needed more ad hoc querying, a simpler model for describing data won hearts and minds. Enter the relational model.

Relational DBMS - tables, rows, and columns

A relational DBMS stores data in tables, also called relations. Each table has columns with defined data types and constraints, and each row represents a record. You describe relationships between tables with keys. You write queries in SQL to join tables and get exactly the data you need. This model maps well to how people think about structured data and it is the most widely used DBMS type today.

A common misunderstanding is that relational databases do not support many-to-many relationships. They absolutely do - you model them with a join table. For example, if members can join many groups and groups can have many members, you have a Members table, a Groups table, and a MemberGroup table that holds pairs of member_id and group_id. That table makes the relationship explicit and easy to query.

Relational systems usually support a rich set of data types like integers, decimals, variable length strings, dates, timestamps, and sometimes JSON. They also enforce rules through constraints. A primary key uniquely identifies a row. A foreign key ensures a value points to a valid row in another table. Not null, unique, and check constraints keep data clean. These features help a lot as databases grow.

Popular relational DBMS software includes MySQL, Oracle Database, and Microsoft SQL Server. Each has its own strengths, licensing, and tooling, but they all let you work with data using SQL and they all follow the core relational principles. This is the model we will spend the most time with in this series.

Object oriented relational DBMS - richer types and behaviors

Over time, developers wanted to store more complex data directly, not just simple rows and columns. Object oriented relational DBMS - often called object-relational - adds support for richer types and behaviors while keeping the table structure. You can store objects or complex types, define functions that act on those types, and sometimes inherit types in limited ways. At the same time you still get SQL, joins, constraints, and transactions.

PostgreSQL is a great example here. It supports arrays, JSON and JSONB, ranges, hstore for key value data, and custom types. You can write functions and operators that work with those types. If you need to store a user profile that includes a list of skills, preferences, or a flexible set of settings, you can model that cleanly without breaking out into a separate system. This blend of traditional relational structure with modern data needs is why PostgreSQL is widely loved.

A quick note so the terms do not get confused: there are also pure object databases that store objects directly without tables. That is not what we are talking about here. Object-relational means relational at the core with object-like features layered in.

So what is SQL - Structured Query Language

SQL is the standard language for working with relational databases. You will hear people pronounce it as S-Q-L or as Sequel. Both are common. With SQL, you can insert data, search data, update data, and delete data. That is the CRUD cycle - Create, Read, Update, Delete. SQL goes beyond CRUD though. You can define tables and constraints, control access, create indexes, and tune queries so they run faster.

Relational databases like MySQL, Oracle, Microsoft SQL Server, Sybase, and PostgreSQL all use SQL. The core syntax is very similar across systems because there is an ANSI and ISO standard behind it. But each database also adds its own flavor - extra functions, data types, and proprietary extensions. For example, limiting rows in MySQL uses LIMIT 10, while SQL Server uses SELECT TOP 10. String concatenation might use different operators. Date functions vary slightly. The good news is once you learn standard SQL, picking up the differences is straightforward.

A first query

Here is a simple SQL statement based on the example from the video transcript:

SELECT * 
FROM members
WHERE age > 30;

This says: pick all columns from the members table for rows where the age is greater than 30. You can swap the star for specific columns to keep results lean, like SELECT id, first_name, last_name FROM members WHERE age > 30.

DDL, DML, DCL, TCL - what those acronyms mean

DDL - Data Definition Language. Commands like CREATE TABLE and ALTER TABLE define your schema.
DML - Data Manipulation Language. Commands like INSERT, UPDATE, DELETE change the actual data.
DCL - Data Control Language. Commands like GRANT and REVOKE set permissions.
TCL - Transaction Control Language. Commands like COMMIT and ROLLBACK manage transactions.

CRUD in action

Let us create a tiny members table, then do basic CRUD to make it real.

-- Create a members table
CREATE TABLE members (
  id          INT PRIMARY KEY,
  first_name  VARCHAR(50) NOT NULL,
  last_name   VARCHAR(50) NOT NULL,
  age         INT CHECK (age >= 0),
  email       VARCHAR(255) UNIQUE
);

-- Insert a row
INSERT INTO members (id, first_name, last_name, age, email)
VALUES (1, 'Asha', 'Patel', 34, 'asha.patel@example.com');

-- Read data
SELECT id, first_name, last_name, age
FROM members
WHERE age > 30;

-- Update a row
UPDATE members
SET age = 35
WHERE id = 1;

-- Delete a row
DELETE FROM members
WHERE id = 1;

Notice how readable that is. SQL is declarative, which means you describe what you want, not how to do it step by step. The DBMS figures out the best plan to get your result. That is a huge win for developer productivity.

Why the database approach beats flat files for serious work

Back in the day, many apps stored data as flat files. Think CSVs on disk or custom binary files. That still works for certain small tasks. But the database approach has strong advantages when you go beyond toy examples.

Less duplication - with a relational design, you normalize data to avoid storing the same thing in multiple places. That cuts down on conflicts and errors.
Strong rules - constraints and keys protect data quality so you do not discover corrupt records a year later.
Concurrency - many users can read and write at the same time without corrupting files.
Better querying - SQL lets you filter, sort, group, and join on demand. You do not have to write custom code for every new question.
Security - users can see only the tables and rows they are allowed to see.
Backups and recovery - DBMS tools make it easier to back up data and restore quickly.
Transactions - you get all-or-nothing changes. No half written updates if a process fails.

That said, there is no shame in starting with a spreadsheet for a tiny project. Just know where the limits are. As soon as multiple users are adding and changing data, as soon as your logic needs rules, and as soon as you care about history and reliability, it is time for a real database.

SQL dialects - same language with regional accents

The core SQL grammar is consistent. You will always find SELECT, INSERT, UPDATE, DELETE, JOIN, and basic functions. Where vendors differ is in extras. Here are a few examples so you are not surprised when you switch systems:

Limiting rows - MySQL and PostgreSQL use LIMIT and OFFSET. SQL Server uses TOP and OFFSET FETCH.
String concat - MySQL and PostgreSQL use || or CONCAT, SQL Server uses +.
Upserts - PostgreSQL has INSERT ... ON CONFLICT, MySQL has INSERT ... ON DUPLICATE KEY UPDATE, SQL Server uses MERGE or custom patterns.
Date math - all vendors support it, but function names and intervals vary.
JSON support - PostgreSQL has strong JSON types and operators, MySQL supports JSON with its own functions, SQL Server has JSON support through functions as well.

If you stick to standard features when possible, your queries will be easier to move between systems. When you need vendor specific features for speed or convenience, document them clearly so the next person knows what is going on.

Putting it together - from data to database to DBMS to SQL

Let us wrap the story so far:

Data is the raw fact - a name, a number, a timestamp, a file.
A database is a structured collection of that data so it is orderly and searchable.
A DBMS is the software that lets you access, change, and protect the database.
SQL is the language you use to talk to the DBMS when the model is relational or object-relational.

With those foundations clear, the rest of SQL becomes way easier. When you write a SELECT, you know you are asking the DBMS to read rows from your tables. When you add a foreign key, you are telling the DBMS to keep a relationship honest. When you plan an index, you are asking the DBMS to arrange data so your common queries run faster.

More examples and mental models you can trust

The library analogy

Picture a library. Books are data. Shelves and sections are the database structure. The catalog system is your schema. The librarian is the DBMS that knows where everything lives and keeps track of who checked out what. SQL is how you ask the librarian for a book - by title, by author, by subject. If you just piled books on the floor, you would still have data but no reliable way to find the right book on a busy day.

The contact list analogy

Your phone’s contact list is a mini database. Each contact is a row. Fields like name, phone, email are columns. When you search "Sam", the phone runs a query over those columns and shows matching rows. When one app updates a contact and another app reads it at the same time, the system is handling concurrent access under the hood. That is the spirit of a DBMS on a tiny scale.

A quick many-to-many design

Let us design members and groups. This is a classic pattern you will use constantly.

CREATE TABLE members (
  id SERIAL PRIMARY KEY,
  first_name VARCHAR(50) NOT NULL,
  last_name  VARCHAR(50) NOT NULL
);

CREATE TABLE groups (
  id SERIAL PRIMARY KEY,
  name VARCHAR(100) NOT NULL UNIQUE
);

CREATE TABLE member_group (
  member_id INT NOT NULL REFERENCES members(id),
  group_id  INT NOT NULL REFERENCES groups(id),
  joined_at TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP,
  PRIMARY KEY (member_id, group_id)
);

That member_group table is the magic. It is a simple bridge that turns many-to-many into two one-to-many links. Your queries then join across it to get all groups for a member or all members for a group.

What comes next in this series

Now that you have a clear picture of data, databases, DBMS types, and SQL, we are ready to play with real queries. In the next parts, we will create tables, write SELECTs that answer real questions, and build up from simple filters to joins, grouping, subqueries, and indexes. We will also look at how to keep data clean with keys and constraints.

If you want to jump ahead or save links for later, here are related topics you will probably enjoy:

Quick recap: DBMS stands for Database Management System. There are four classic types you should know - hierarchical, network, relational, and object oriented relational. The most widely used today is the relational model, which stores data in tables and uses SQL as the standard query language. SQL lets you insert, search, update, and delete records, and it gives you the tools to define and protect your schema. Compared to flat files, databases make storing and working with data safer, cleaner, and easier to scale.

Keep this page handy, and when you are ready, move on to building your first schema and writing queries that answer real business questions. See you in the next chapter.