Computer Vision

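Computer vision is the field of AI concerned with enabling computers to interpret images and video. Typical applications include image classification, object detection, segmentation, and face recognition.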

Key Tasks

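Task	Description	Representative models
Image classification	Predicts which class an image belongs to	ResNet, EfficientNet
Object detection	Locates and classifies objects within an image	YOLO, Faster R-CNN
Segmentation	Classifies every pixel	U-Net, SegFormer
Generation	Synthesizes new images	DALL-E, Stable Diffusion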
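
The difference between classifying a whole image and classifying every pixel can be made concrete with a small sketch. It is not taken from any of the models in the table: the score values, the three class indices, and the helper names `argmax`, `classify`, and `segment` are all assumptions made for illustration.

```python
def argmax(scores):
    """Return the index of the largest value in a list of scores."""
    return max(range(len(scores)), key=lambda i: scores[i])


def classify(image_scores):
    """Image classification: one score vector for the whole image, one label out."""
    return argmax(image_scores)


def segment(pixel_scores):
    """Segmentation: one score vector per pixel, one label per pixel out."""
    return [[argmax(px) for px in row] for row in pixel_scores]


# Hypothetical scores for 3 classes (0 = background, 1 = cat, 2 = dog).
image_scores = [0.1, 0.7, 0.2]

# Hypothetical per-pixel scores for a tiny 2x2 image.
pixel_scores = [
    [[0.9, 0.05, 0.05], [0.2, 0.7, 0.1]],
    [[0.1, 0.8, 0.1], [0.3, 0.3, 0.4]],
]

label = classify(image_scores)  # a single class index
mask = segment(pixel_scores)    # a 2x2 grid of class indices
print(label)
print(mask)
```

A real model would produce these scores from the image itself; only the final step of turning scores into labels is shown here.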

Basic OpenCV Example

python

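import cv2

# Read the image (OpenCV loads it in BGR channel order)
img = cv2.imread('image.jpg')
if img is None:  # imread returns None instead of raising on failure
    raise FileNotFoundError('image.jpg could not be read')
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

# Edge detection (Canny) with hysteresis thresholds 100 and 200
edges = cv2.Canny(gray, 100, 200)

# Face detection (Haar cascade)
face_cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + 'haarcascade_frontalface_default.xml'
)
faces = face_cascade.detectMultiScale(gray, scaleFactor=1.3, minNeighbors=5)
for (x, y, w, h) in faces:
    # Draw a blue (BGR) box, 2 px thick, around each face
    cv2.rectangle(img, (x, y), (x + w, y + h), (255, 0, 0), 2)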
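
To show what the `cv2.COLOR_BGR2GRAY` conversion above computes, here is a minimal pure-Python sketch of the weighted sum OpenCV documents for RGB-to-gray conversion (Y = 0.299 R + 0.587 G + 0.114 B). The function name `bgr_to_gray` and the sample pixels are assumptions for illustration; the rounding here is a simplification and may differ from OpenCV's internal arithmetic by one gray level.

```python
def bgr_to_gray(pixel):
    """Convert one (B, G, R) pixel to a single gray value in 0..255."""
    b, g, r = pixel
    # Luma weights from the OpenCV color-conversion documentation.
    return round(0.114 * b + 0.587 * g + 0.299 * r)


# Hypothetical sample pixels in BGR order, as cv2.imread would return them.
blue = (255, 0, 0)
green = (0, 255, 0)
red = (0, 0, 255)

gray_values = [bgr_to_gray(p) for p in (blue, green, red)]
print(gray_values)
```

Green contributes the most to the gray value and blue the least, which is why the channel order (BGR rather than RGB) matters when doing the conversion by hand.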

Object Detection with YOLO

python

from ultralytics import YOLO

model = YOLO('yolov8n.pt')  # pretrained model
results = model('image.jpg')
for r in results:
    print(r.boxes.cls)   # class indices of the detected objects
    print(r.boxes.conf)  # confidence scores
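
The class indices and confidences printed above are usually post-processed before use. The sketch below shows one way to do that, with plain Python lists standing in for `r.boxes.cls` and `r.boxes.conf` so that it runs without a model. The helper `filter_detections`, the 0.5 threshold, and the sample values are assumptions for illustration; with Ultralytics, the index-to-name mapping would typically come from `r.names`.

```python
def filter_detections(class_ids, confidences, names, min_conf=0.5):
    """Keep detections at or above min_conf and map class indices to names."""
    kept = []
    for cls_id, conf in zip(class_ids, confidences):
        if conf >= min_conf:
            # Class indices arrive as floats, so cast before the lookup.
            kept.append((names[int(cls_id)], conf))
    return kept


# Hypothetical stand-ins for r.names, r.boxes.cls and r.boxes.conf.
names = {0: "person", 16: "dog"}
class_ids = [0.0, 16.0, 0.0]
confidences = [0.91, 0.42, 0.67]

detections = filter_detections(class_ids, confidences, names)
print(detections)
```

The threshold trades missed objects against false positives, so it is worth tuning per application rather than fixing at 0.5.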

References

•Gonzalez & Woods. Digital Image Processing, 4th Ed.
•Official YOLO documentation: docs.ultralytics.com
