Large Language Model Meta AI (LLaMA) | AI Insight Note

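LLaMA (Large Language Model Meta AI) is a series of openly released (open-weight) large language models that Meta AI first published in February 2023. Unlike proprietary models that are reachable only through a hosted API, its weights can be downloaded: LLaMA 1 was released for research use, and LLaMA 2 onward also permits commercial use under Meta's community license. The series became a foundation of the open LLM ecosystem.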

Features by version

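Version	Parameters	Features
LLaMA 1 (Feb 2023)	7B-65B	Released for research use
LLaMA 2 (Jul 2023)	7B-70B	Commercial use permitted, Chat variants included
LLaMA 3 (Apr 2024)	8B-70B	Major performance gains, 8K context
LLaMA 3.1 (Jul 2024)	8B-405B	405B model, 128K context, stronger multilingual support
LLaMA 3.2 (Sep 2024)	1B-90B	Multimodal (11B/90B vision), lightweight models (1B/3B)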
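The parameter counts in the table translate fairly directly into memory requirements. The following is a rough sketch, assuming weights only (no activations or KV cache) and the bytes-per-parameter values listed in the code; the helper name `weight_memory_gb` is made up for this note.

```python
# Bytes needed to store one parameter at each precision (assumed values).
BYTES_PER_PARAM = {"float32": 4, "bfloat16": 2, "int8": 1, "int4": 0.5}

def weight_memory_gb(params_billion, dtype="bfloat16"):
    """Approximate memory for the weights alone, in decimal GB (1 GB = 1e9 bytes).

    Billions of parameters times bytes per parameter gives gigabytes directly.
    """
    return params_billion * BYTES_PER_PARAM[dtype]

# Sizes taken from the version table: 1B, 8B, 70B, 405B.
for size in (1, 8, 70, 405):
    print(f"{size}B: {weight_memory_gb(size):.0f} GB in bfloat16, "
          f"{weight_memory_gb(size, 'int4'):.1f} GB at 4 bits")
```

By this estimate an 8B model needs about 16 GB in bfloat16, while the 405B model needs about 810 GB, which is why the smaller and quantized variants are the ones typically run locally.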

Core techniques

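1. RMSNorm: RMS-based layer normalization, applied before each sub-layer (instead of LayerNorm)
2. SwiGLU: gated activation function in the feed-forward layers (instead of ReLU/GELU)
3. RoPE: rotary positional embeddings
4. GQA: grouped-query attention (the larger LLaMA 2 models such as 70B, and all sizes from LLaMA 3 onward)
5. KV cache optimization: keys and values of earlier tokens are cached and reused during generation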
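The techniques listed above can be sketched in a few lines each. This is a minimal pure-Python illustration on single vectors, not Meta's implementation: real models apply the same formulas to batched tensors, the function names are invented for this note, the `eps` and `base` defaults are assumptions, and RoPE implementations differ in which dimensions they pair up (adjacent pairs are used here). The configuration at the end (32 layers, 32 query heads, 8 KV heads, head dimension 128) is assumed to match Llama 3 8B.

```python
import math

def rms_norm(x, weight, eps=1e-6):
    """RMSNorm: divide by the root mean square of x, then apply a learned gain."""
    rms = math.sqrt(sum(v * v for v in x) / len(x) + eps)
    return [w * v / rms for v, w in zip(x, weight)]

def silu(v):
    """SiLU (swish) activation: v * sigmoid(v)."""
    return v / (1.0 + math.exp(-v))

def matvec(matrix, x):
    """Multiply a row-major matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, x)) for row in matrix]

def swiglu_ffn(x, w_gate, w_up, w_down):
    """SwiGLU feed-forward: w_down @ (silu(w_gate @ x) * (w_up @ x))."""
    gate = [silu(v) for v in matvec(w_gate, x)]
    up = matvec(w_up, x)
    return matvec(w_down, [g * u for g, u in zip(gate, up)])

def rope(x, position, base=10000.0):
    """RoPE: rotate each pair (x[i], x[i+1]) by the angle position * base**(-i/d)."""
    d = len(x)
    out = list(x)
    for i in range(0, d, 2):
        theta = position * base ** (-i / d)
        c, s = math.cos(theta), math.sin(theta)
        out[i] = x[i] * c - x[i + 1] * s
        out[i + 1] = x[i] * s + x[i + 1] * c
    return out

def kv_head_for(query_head, n_q_heads, n_kv_heads):
    """GQA: each consecutive group of query heads shares one key/value head."""
    group_size = n_q_heads // n_kv_heads
    return query_head // group_size

def kv_cache_bytes_per_token(n_layers, n_kv_heads, head_dim, bytes_per_value=2):
    """Keys and values (the factor 2) are cached for every layer and KV head."""
    return 2 * n_layers * n_kv_heads * head_dim * bytes_per_value

# Assumed configuration: 32 layers, head_dim 128, 8 KV heads (GQA) vs 32 (full MHA).
gqa = kv_cache_bytes_per_token(32, 8, 128)
mha = kv_cache_bytes_per_token(32, 32, 128)
print(gqa // 1024, "KiB per token with GQA vs", mha // 1024, "KiB with full multi-head attention")
```

Because RoPE rotates queries and keys by position-dependent angles, their dot product depends only on the distance between the two positions. GQA in turn reduces the KV cache in proportion to the number of KV heads, which is what makes long contexts more affordable at inference time.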

Running locally with Ollama

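bash

# Download the model, then start an interactive session
ollama pull llama3.2
ollama run llama3.2

python

# Python API (requires the ollama package: pip install ollama)
import ollama

response = ollama.chat(
    model='llama3.2',
    messages=[{'role': 'user', 'content': 'What is the capital of South Korea?'}]
)
print(response['message']['content'])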
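Each `ollama.chat` call receives the whole `messages` list, so a multi-turn conversation means resending the history on every call. A minimal sketch of that pattern follows; `ollama_send` and `chat_turn` are hypothetical helper names for this note, and the `send` parameter is there only so the logic can be exercised without a running Ollama server.

```python
def ollama_send(messages, model="llama3.2"):
    """Send the full history to a local Ollama server and return the reply text."""
    import ollama  # imported lazily so chat_turn can be tested without the package
    response = ollama.chat(model=model, messages=messages)
    return response["message"]["content"]

def chat_turn(history, user_text, send=ollama_send):
    """Append the user message, get a reply via `send`, and record it in history."""
    history.append({"role": "user", "content": user_text})
    reply = send(history)
    history.append({"role": "assistant", "content": reply})
    return reply

# Usage (needs a running Ollama server with llama3.2 pulled):
# history = []
# print(chat_turn(history, "What is the capital of South Korea?"))
# print(chat_turn(history, "How many people live there?"))
```

Keeping the history in a plain list makes it straightforward to trim old turns once the conversation approaches the model's context limit.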

Using with Hugging Face

python

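from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

# Gated repository: accept Meta's license on Hugging Face and log in first
model_id = "meta-llama/Meta-Llama-3-8B-Instruct"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

messages = [{"role": "user", "content": "Write a Fibonacci sequence in Python"}]
# add_generation_prompt=True appends the assistant header so the model starts its answer
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
outputs = model.generate(input_ids, max_new_tokens=512)
# Decode only the newly generated tokens and drop special tokens
print(tokenizer.decode(outputs[0][input_ids.shape[-1]:], skip_special_tokens=True))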
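To make it clearer what `apply_chat_template` does before tokenization, here is a rough sketch of the prompt string that the Llama 3 Instruct chat format produces. The function name `format_llama3_prompt` is hypothetical, and the tokenizer's built-in template remains the authoritative source; this version only approximates it for plain text messages.

```python
def format_llama3_prompt(messages, add_generation_prompt=True):
    """Approximate the Llama 3 Instruct chat format as a plain string."""
    prompt = "<|begin_of_text|>"
    for message in messages:
        # Each turn: a role header, a blank line, the content, then an end-of-turn token.
        prompt += (
            f"<|start_header_id|>{message['role']}<|end_header_id|>\n\n"
            f"{message['content'].strip()}<|eot_id|>"
        )
    if add_generation_prompt:
        # An open assistant header tells the model that it should answer next.
        prompt += "<|start_header_id|>assistant<|end_header_id|>\n\n"
    return prompt

print(format_llama3_prompt([{"role": "user", "content": "Write a Fibonacci sequence in Python"}]))
```

Without the trailing assistant header, the model has no signal that the user's turn is over, which is why `apply_chat_template` offers an `add_generation_prompt` argument for generation.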

Related notes

Frontier AI Models
Agentic AI
AutoGPT