SQL은 Structured Query Language의 약자로 High LV 프로그래밍 언어이다. 실제로 사람이 충분히 이해가 가능하다.
SQL은 크게 DDL(Data Definition Language)와 DML(Data Manipulation Language)로 나뉘는데 쉽게 말해 DDL은 Table의 생성, 삭제, 변경등을 하는 것이고 DML은 Data inserty/delete 등을 하는 것이다.
Table, Column을 만들 때 Data 값을 모르면 NULL값을 하기도 한다.
Database(DB)는 set of Relations(Tables)라고 생각하면 된다.
Multi Table에 Query를 날릴 땐 Join문을 쓴다.
같은 결과를 나타내는 쿼리의 종류는 많으니 어떻게 조합하는지는 User에게 달렸다.
기본적으로 SQL의 SFW Query라고 하는 구조는 Select / From / Where 구문이다.
SQL command 는 대소문자 구분을 하지 않고, 작은 따옴표만을 사용한다.
이제 간단하게 SQL문 실습을 좀 해보자~
PosrgreSQL Database에 기본적인 Table Schema와 SQL문 날리는 것 연습을 해보자
실습에 쓰일 DB의 Schema는 아래와 같다.
Table / Schema 만드는 법은 위 첨부된 PDF를 참고하면 된다.
1. Cartesian Product
Retrieve all possible comb of employees and pjt
Select * From Employees E, Projects p;
말 그대로, E / P Table을 조합해서 반환할 수 있는 모든 Row의 조합이 반환된다.
2. Inner Join
Retrieve the employee names and their departmet names
SELECT E.name, D.department_name
FROM Employees E
INNER JOIN Departments D ON E.department_id = D.department_id;
or
select name,department_name
from employees as e, departments as d
where e.department_id = d.department_id;
Department / Employee의 PF/FK 관계를 이용해서 Join Table을 만들고, 거기서 이름정보를 가져왔다.
아니면 Where 절로도 같은 결과를 만들어낼 수 있다. 이렇듯 한 결과를 만들어내는 쿼리의 종류는 다양할 수 있다는 것.
3. Outer Join
Outer Join의 특색을 살리기 위해 Table에 NULL값을 좀 추가하였다.
INSERT INTO Departments (department_name, location)
VALUES ('Legal', 'Boston');
INSERT INTO Employees (name, department_id, salary, hire_date)
VALUES ('Frank', NULL, 70000, '2022-01-01');
Retrieve all departments, including those with no employees and display the employee names where available.
( Left Outer Join )
SELECT D.department_name, E.name AS employee_name
FROM Departments D
LEFT JOIN Employees E ON D.department_id = E.department_id;
Retrieve all employees, including those not assigned to any department and display the department names where available.
( Right Outer Join )
SELECT D.department_name, E.name AS employee_name
FROM Departments D
RIGHT JOIN Employees E ON D.department_id = E.department_id;
Retrieve pairs of employees who work in the same department.
( Self Join )
select e1.name, e2.name
from employees as e1 join employees as e2 on e1.department_id = e2.department_id
where e1.employee_id <> e2.employee_id;
select e1.name, e2.name
from employees as e1 join employees as e2 on e1.department_id = e2.department_id
where e1.employee_id < e2.employee_id;
이렇게 Where 조건을 바꾸면 중복값을 줄일 수 있다.
Self - Join이라는게 뭐 별거 없고, 그냥 같은 Table끼리 join 시켜서 Data를 출력해내는 것이다.
4. Aggregation and Group by
Retrieve the average salary of employees for each department name
SELECT D.department_name, AVG(E.salary) AS avg_salaray
FROM Employees E
INNER JOIN Departments D ON E.department_id = D.department_id
GROUP BY D.department_name;
여기서 무턱대고,
select AVG(salary) as avg,department_name
from employees,departments
group by department_name;
이렇게 하면 안 된다.
왜냐면 from 절에 emp,dept 이렇게 하면 모든 경우의 수가 다 나오는 Table이 만들어져서
AVG 값이 모두 같은 값이 나오게 된다.
그니까, from 절은 결국 어떤 Table에서~ Data를 추출할 것이냐를 뜻하기에 이것을 Join문으로 표현해주고 Aggregation function을 써야한다.
5. Having
Retrieve the average salary of employees for each department where the total salary of all employees exceeds 100,000.
SELECT D.department_name, AVG(E.salary) AS avg_salary
FROM Employees E
INNER JOIN Departments D ON E.department_id = D.department_id
GROUP BY department_name
HAVING SUM(E.salary) > 100000;
✅ WHERE vs HAVING 차이
필터링 시점 | GROUP BY 전에 적용 (원본 데이터 필터링) | GROUP BY 후에 적용 (집계 결과 필터링) |
Aggregation 사용 | ❌ Aggregation 함수 사용 불가 | ⭕ Aggregation 함수 사용 가능 |
사용 예제 | WHERE salary > 5000 | HAVING SUM(salary) > 100000 |
좀 어거지로 만들어 본, Having / Where 사용예시
SELECT avg(e.salary) as AVG, d.department_name
FROM employees as e JOIN departments as d ON e.department_id = d.department_id
WHERE e.name <>'Eve'
GROUP BY d.department_name
HAVING sum(e.department_id) <> 2 and d.department_name <> 'HR';
6. Order By
Retrieve employees in ascending order of their salaries
select e.name,e.salary
from employees as e
order by e.salary;
Retrieve the department with the highest average salary, where the total salary of all employees in that department exceeds 100,000.
select d.department_name,avg(e.salary) as avg_sal
from employees as e join departments as d on e.department_id = d.department_id
group by d.department_name
having sum(e.salary) > 100000
order by avg_sal desc limit 1;
사실 이 정도만 해도 어지간한 SQL문은 다 푼다.
7. View
A reusable virtual table that does not store data physically but provides a result set based on a query
CREATE VIEW DepartmentSalaryView AS
SELECT D.department_name, SUM(E.salary) AS total_salary
FROM Employees E
INNER JOIN Departments D ON E.department_id = D.department_id
GROUP BY D.department_name;
A view that stores the query results physically, improving performance for large data sets but requiring manual refresh to update the data.
CREATE MATERIALIZED VIEW DepartmentEmployeeCount AS
SELECT D.department_name, COUNT(E.employee_id) AS employee_count
FROM Employees E
INNER JOIN Departments D ON E.department_id = D.department_id
GROUP BY D.department_name;
REFRESH MATERIALIZED VIEW DepartmentEmployeeCount;
그냥 Select * from view 방식으로 조회가 가능하다. 위에 것들은!
8. With
A temporary result set that simplifies complex queries by allowing intermediate data processing within
a single query.
WITH DepartmentSalary AS (
SELECT D.department_name, SUM(E.salary) AS total_salary
FROM Employees E
INNER JOIN Departments D ON E.department_id = D.department_id
GROUP BY D.department_name
)
SELECT department_name, total_salary
FROM DepartmentSalary
WHERE total_salary > 100000;
With은 해당 쿼리에서만 임시적으로 View처럼 동작한다.
✅ 1. WITH와 View 비교
지속성 | 일시적 (Temporary) → 해당 쿼리에서만 사용 가능 | 영구적 (Permanent) → DB에 저장됨 |
사용 가능 범위 | 해당 SQL문에서만 사용 가능 | 다른 SQL문에서도 사용 가능 |
구현 목적 | 복잡한 쿼리를 가독성 있게 작성 | 자주 사용하는 쿼리를 저장해 재사용 |
사용법 | WITH CTE AS (...) SELECT ... FROM CTE; | CREATE VIEW view_name AS SELECT ...; |
Retrieve employees who earn more than the average salary in their department
SELECT name, salary
FROM Employees E
WHERE salary > (SELECT AVG(E2.salary)
FROM Employees E2
WHERE E.department_id = E2.department_id);
Retrieve employees who belong to departments located in either San Francisco or New York
SELECT name
FROM Employees
WHERE department_id IN (SELECT department_id
FROM Departments
WHERE location IN ('San Francisco', 'New York'));
이렇게 복잡한 쿼리는 Sub-Query개념을 이용해서 풀기도 한다.
Retreive the employees in the 'Engineering' department
SELECT name
FROM Employees E
WHERE EXISTS (SELECT 1
FROM Departments D
WHERE D.department_id = E.department_id
AND D.department_name = 'Engineering');
요것은 Where + Exist로 Filtering 하는 Case. Subquery 쓰는 것과 결과는 같다.
'Data Science > SQL & Database Management System' 카테고리의 다른 글
Buffer Pool Design[4] (0) | 2025.02.10 |
---|---|
Database Storage[3] (0) | 2025.02.09 |
Introduction to DBMS[1] (0) | 2025.02.09 |
Ubuntu와 친해지기 (0) | 2024.11.11 |
4. Relational Algebra and Calculus (0) | 2024.04.24 |