ad_official.log

control diffusion

Sat, 09 May 2026 08:13:56 GMT

normalization

Option 1

Divide by 1 / 1 / 0.2.

Waymo Open Motion Dataset (WOMD)

Wed, 04 Mar 2026 05:01:48 GMT

Before we start: what is the Waymo Open Motion Dataset (WOMD)?

Abbreviated as WOMD
Logs collected in six cities
Training 486,995, validation 44,097, test 44,920 scenarios
Each scenario includes a cropped 150 m by 150 m HD roadgraph

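nuPlan vs WOMD

Item	WOMD v1.2.1	nuPlan (as of 2025-06)
Total driving time	about 574 hours (103 k segments × 20 s)	about 1,282 hours (15 k logs × 300 s average)
Number of cities	6: San Francisco, Phoenix, LA, Seattle, Detroit, Mountain View	4: Las Vegas, Boston, Pittsburgh, Singapore

Sensors and annotations	WOMD v1.2.1	nuPlan
LiDAR	1 mid-range + 4 short-range → point-cloud embeddings provided (raw point clouds not provided)	same configuration + 128 h of raw point clouds (10 Hz) provided
Cameras	8 cameras, 360° video → image embeddings only	6–7 cameras, raw multi-view video (partially provided)
Object classes	3 (vehicle, pedestrian, cyclist)	9 + traffic-light state and lane IDs
Map information	high-resolution lane graph (lane centerlines, boundaries)	lane graph + mission route (start, destination), intersection tags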

Waymo Sim Agents Challenge

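Which dataset split is used for evaluation?

The scenarios in the WOMD validation and test splits (validation 44,097, test 44,920)
- All of these tens of thousands of scenarios are run, and
- the reported score is the simple average of the per-scenario scores
The simulation must finish within the vehicle control loop (50 to 100 ms).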

How to run my trained model

input
- past 1 s of log
- HD map
output
- For multiple agents, generate future trajectories in closed loop, one step at a time, for 80 steps (8 s at 0.1 s intervals).
  - Repeat this 32 times
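The rollout procedure above can be sketched as follows. This is only an illustration under stated assumptions, not the challenge's reference code: `constant_velocity_policy` is a hypothetical stand-in for a learned model, the state layout `[x, y, heading]` is assumed, and the HD map input is left out to keep the sketch short.

```python
import numpy as np

NUM_ROLLOUTS = 32   # each scenario is simulated 32 times
NUM_STEPS = 80      # 80 steps x 0.1 s = 8 s


def constant_velocity_policy(history: np.ndarray,
                             rng: np.random.Generator) -> np.ndarray:
    """Hypothetical stand-in for a learned model.

    history: (num_agents, T, 3) array of [x, y, heading], oldest first.
    Returns the next state of every agent, shape (num_agents, 3).
    """
    last, prev = history[:, -1], history[:, -2]
    step = last - prev                                # change over the last 0.1 s
    noise = rng.normal(scale=0.01, size=step.shape)   # makes the rollouts differ
    return last + step + noise


def simulate(past: np.ndarray, seed: int = 0) -> np.ndarray:
    """Run 32 closed-loop rollouts; returns (32, num_agents, 80, 3)."""
    rng = np.random.default_rng(seed)
    rollouts = []
    for _ in range(NUM_ROLLOUTS):
        history = past.copy()
        for _ in range(NUM_STEPS):
            next_state = constant_velocity_policy(history, rng)
            # Closed loop: the model's own output becomes part of its next input.
            history = np.concatenate([history, next_state[:, None, :]], axis=1)
        rollouts.append(history[:, -NUM_STEPS:])
    return np.stack(rollouts)


# Assumed layout: 1 s of past log at 10 Hz plus the current frame = 11 frames.
past = np.zeros((4, 11, 3))
past[:, :, 0] = np.arange(11) * 1.0   # every agent moves +1 m per step in x
futures = simulate(past)
print(futures.shape)  # (32, 4, 80, 3)
```

The inner loop is what makes the rollout closed-loop: each step is predicted from a history that already contains the previous predictions, so errors can accumulate over the 8 s horizon.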

Hardware or FPS requirements?

 There are no restrictions on the type or number of GPUs and CPUs used for training and inference.

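2. Quantitative evaluation metrics

2.1. Realism Meta Metric (RMM)

Evaluates how similar the data produced by our trained model is to real traffic data.
- How well does it reproduce the distribution of real traffic data?
The goal is to match the data distribution, not to build a vehicle that drives safely and fast.
- Example: a shortcut that forces collisions to zero can actually increase the distribution mismatch and lower the Map and Kinematic scores.

How Realism Meta is measured

Realism Meta is a metric that aggregates 28 items into one overall score.
The 28 items fall into the following 3 categories:
Kinematic (8 items)
- 'How did the vehicle move', e.g. speed, acceleration, angular velocity
Interactive (9 items)
- 'How did the vehicles interact with each other', e.g. collision count, TTC
Map-based (11 items)
- 'How well were the road-layout rules followed', e.g. lane departure, off-road driving
Each of the 28 items is given a score ($\ell_i$), and the scores are combined in a weighted ($w_i$) sum.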

$$ \text{Meta} = -\sum_{i=1}^{28} w_i\,\ell_i, \quad \ell_i = -\sum_{b} p_{\text{log}}(b)\,\log p_{\text{sim}}(b) $$

How $\ell_i$ is actually computed is the hard part; it is explained below.

Explaining $\ell_i$ for the simplest case, linear speed
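A minimal sketch of $\ell_i$ for linear speed, following the cross-entropy form in the formula above. The bin layout, the smoothing constant `eps`, and the helper names `linear_speed` and `histogram_cross_entropy` are assumptions made for illustration; the official evaluation may bin and smooth differently.

```python
import numpy as np


def linear_speed(xy: np.ndarray, dt: float = 0.1) -> np.ndarray:
    """Speed in m/s from (T, 2) positions sampled every dt seconds."""
    return np.linalg.norm(np.diff(xy, axis=0), axis=1) / dt


def histogram_cross_entropy(log_values, sim_values, bins, eps=1e-6) -> float:
    """l = -sum_b p_log(b) * log p_sim(b) over shared histogram bins."""
    log_counts, _ = np.histogram(log_values, bins=bins)
    sim_counts, _ = np.histogram(sim_values, bins=bins)
    p_log = log_counts / log_counts.sum()
    # Smooth the simulated histogram so that empty bins do not give log(0).
    p_sim = (sim_counts + eps) / (sim_counts + eps).sum()
    return float(-np.sum(p_log * np.log(p_sim)))


rng = np.random.default_rng(0)
t = np.arange(81)[:, None] * 0.1
logged_xy = np.hstack([10.5 * t, np.zeros_like(t)])      # logged agent: 10.5 m/s
good_sim_xy = logged_xy + rng.normal(scale=0.005, size=logged_xy.shape)
bad_sim_xy = np.hstack([3.5 * t, np.zeros_like(t)])      # simulated agent: 3.5 m/s

bins = np.linspace(0.0, 30.0, 31)   # 1 m/s wide bins (assumed layout)
log_speed = linear_speed(logged_xy)
ell_good = histogram_cross_entropy(log_speed, linear_speed(good_sim_xy), bins)
ell_bad = histogram_cross_entropy(log_speed, linear_speed(bad_sim_xy), bins)
print(ell_good < ell_bad)
```

A simulation whose speed histogram overlaps the logged one gets a small $\ell_i$. A simulation that puts almost no probability mass where the logged speeds fall gets a large one.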

All evaluation items

Group	#	Name (short)	Intuitive description	Unit
Kinematic	1	Linear Speed	instantaneous speed $|\mathbf{v}|$	m/s
	2	Linear Accel	$|\dot{\mathbf{v}}|$	m/s²
	3	Linear Jerk	$|\ddot{\mathbf{v}}|$	m/s³
	4	Angular Speed	yaw rate $|\omega|$	rad/s
	5	Angular Accel	$|\dot{\omega}|$	rad/s²
	6	Angular Jerk	$|\ddot{\omega}|$	rad/s³
	7	Heading Change Rate	one-step heading difference $\Delta\theta$	rad
	8	Heading Accel	$\Delta^2\theta$	rad
Interactive	9	Nearest‑Dist	distance to the nearest object	m
	10	Collisions	collision or not (0/1)	-
	11	TTC	Time‑to‑Collision	s
	12	Inv‑TTC	$1/\text{TTC}$, auxiliary item	1/s
	13	Relative Speed	speed difference to the lead vehicle	m/s
	14	Overtake Count	number of overtakes	#
	15	Cut‑in Count	number of lane cut-ins	#
	16	Follow‑Headway	time gap to the lead vehicle	s
	17	Gap Closing Rate	rate at which the distance shrinks	m/s
Map‑based	18	Road‑Edge Dist	distance to the road edge	m
	19	Off‑road Flag	off-road or not	-
	20	Centerline Dist	distance to the lane centerline	m
	21	Lane‑Change Count	number of lane changes	#
	22	Traffic‑Light Stop	whether the stop line is respected	-
	23	Stop‑Sign Compliance	whether stop signs are obeyed	-
	24	Speed‑Limit Violation	share of time over the speed limit	%
	25	Wrong‑Way Flag	wrong-way driving or not	-
	26	Intersection‑Encroach	encroachment distance into the intersection center	m
	27	Turn‑Heading Error	turning-angle deviation	rad
	28	Trajectory‑Smoothness	change in trajectory curvature	1/m

(The 28 evaluation items)

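The actual implementation exists in two forms: an initial version using 9 'core' metrics, and V1 (2023.12), which expands to the 28 items above.
The 9 core items (linear speed, linear acceleration, angular speed, angular acceleration, nearest distance, collision, TTC, distance to road edge, off-road) are common to all versions.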

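2.2. Reference: minADE (minimum Average Displacement Error)

What does it measure?

minADE comes out well (that is, low) as long as at least one well-matching candidate is included, so it measures coverage.
It is reportedly a metric with limitations.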
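A small sketch of how minADE is typically computed, to make the coverage property concrete. The function name `min_ade` and the toy trajectories are illustrative assumptions.

```python
import numpy as np


def min_ade(candidates: np.ndarray, ground_truth: np.ndarray) -> float:
    """candidates: (K, T, 2), ground_truth: (T, 2); returns minADE in metres."""
    # Per-step Euclidean error of every candidate: shape (K, T)
    errors = np.linalg.norm(candidates - ground_truth[None], axis=-1)
    ade_per_candidate = errors.mean(axis=1)   # average over time: shape (K,)
    return float(ade_per_candidate.min())     # keep only the best candidate


gt = np.stack([np.arange(5.0), np.zeros(5)], axis=1)   # straight line along x
close = gt + np.array([0.0, 0.1])                       # 0.1 m lateral offset
far = gt + np.array([0.0, 5.0])                         # 5 m lateral offset

with_one_good = min_ade(np.stack([far, far, close]), gt)
all_bad = min_ade(np.stack([far, far, far]), gt)
print(with_one_good, all_bad)
```

Two of the three candidates in the first call are 5 m off, yet the result is about 0.1 m because only the best candidate counts. This is why the metric rewards coverage and says nothing about how realistic the remaining candidates are.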

dasfdsaf

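Sat, 24 Jan 2026 04:57:18 GMT

<Your point above> 2) Ego past (ego_agent_past): usually OK (depends on an assumption) ✅/⚠️ ego_agent_past

The ego is assumed to "always exist", so it is normally assumed to have no invalid points at all.

work() does not do any additional "invalid-frame cleanup" for ego_agent_past. ➡️ This is OK under the assumption that nuPlan always provides the past with normal length and normal values. ➡️ If in some scenario the past is too short and time steps that "do not actually exist" have to be filled with 0, the current work() does not guarantee that separately (judge by whether such cases actually occur).

Therefore, I am adding "logic that fills time steps that 'do not actually exist' with 0 when the past is too short in some scenario". ego_agent_past stores data in past-to-present order. So in the dataset the only possible pattern is invalid points in the past followed by valid points. (Invalid points between valid points never occur in the first place.)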
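The "invalid first, then valid" pattern described above can be illustrated with a standalone sketch. It is independent of the project code that follows: `left_pad_past_with_zeros` is a hypothetical helper, and the (T, 11) layout is assumed from the description.

```python
import numpy as np


def left_pad_past_with_zeros(past: np.ndarray, expected_len: int) -> np.ndarray:
    """Left-pad a (T, D) past array with zero rows up to expected_len.

    Rows are ordered oldest -> newest, so missing history can only sit at
    the start; the valid frames stay contiguous at the end.
    """
    if past.ndim != 2:
        raise ValueError(f"expected a (T, D) array, got {past.shape}")
    num_frames, dim = past.shape
    if num_frames >= expected_len:
        return past[-expected_len:].copy()    # keep the newest frames
    out = np.zeros((expected_len, dim), dtype=past.dtype)
    out[expected_len - num_frames:] = past    # valid frames go to the end
    return out


short_past = np.ones((3, 11), dtype=np.float32)   # only 3 of 21 frames exist
padded = left_pad_past_with_zeros(short_past, expected_len=21)
print(padded.shape)             # (21, 11)
print(int(padded[:18].sum()))   # 0
```

Because the zeros are only ever placed before the first valid frame, this padding cannot create an invalid frame between two valid ones.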

[Request to GPT] Could you implement <your point above> directly in the code below? The best coding achieves the goal above completely while keeping the changes minimal.

import numpy as np
from tqdm import tqdm
import matplotlib
import time

matplotlib.use('Agg')  # do not use a GUI backend (saves memory)
import matplotlib.pyplot as plt
import contextlib

from nuplan.planning.scenario_builder.nuplan_db.nuplan_scenario import NuPlanScenario
from nuplan.common.maps.nuplan_map.nuplan_map import NuPlanMap
from nuplan.planning.simulation.history.simulation_history_buffer import SimulationHistoryBuffer
import copy
from typing import Deque
from nuplan.planning.simulation.observation.observation_type import Observation
import os
import torch
from nuplan.common.maps.maps_datatypes import TrafficLightStatusData
from nuplan.common.actor_state.tracked_objects import TrackedObjects
from typing import Dict, Tuple, Union, List, Optional
from nuplan.common.actor_state.state_representation import Point2D
from nuplan.common.actor_state.ego_state import EgoState
import draw_machine

# matplotlib settings
plt.rcParams['figure.max_open_warning'] = 0  # disable the warning message
matplotlib.rcParams['figure.max_open_warning'] = 0
from nuplan.planning.training.preprocessing.feature_builders.vector_builder_utils import (
    MapObjectPolylines, LaneSegmentTrafficLightData)
from diffusion_planner.data_process.roadblock_utils import route_roadblock_correction
from diffusion_planner.data_process.agent_process import (
    build_ego_past_feature,
    build_neighbor_past_feature,
    build_static_feature,
    sampled_tracked_objects_to_array_list,
    sampled_ego_objects_to_array_list,
    sampled_static_objects_to_array_list,
    agent_future_all_process,
)
from diffusion_planner.data_process.map_process import get_neighbor_vector_set_map, map_process
from diffusion_planner.data_process.ego_process import get_ego_past_array_from_scenario, get_ego_future_array_from_scenario, calculate_additional_ego_states
from diffusion_planner.data_process.utils import convert_data_dict_to_device_tensors, get_npc_route_roadblock_ids, get_neighbor_track_tokens

# [ADDED] for saving statistics
import json
from nuplan.common.actor_state.tracked_objects_types import TrackedObjectType  # for type checks
from diffusion_planner.data_process.road_safety_process import extract_stop_sign_points, extract_crosswalk_points

class DataProcessor(object):

def __init__(self, config):

    self._save_dir = getattr(config, "save_path", None)
    self.config = config
    self.past_time_horizon = 2  # [seconds]
    self.num_past_poses = 10 * self.past_time_horizon
    self.future_time_horizon = 8  # [seconds]
    self.num_future_poses = 10 * self.future_time_horizon
    self.set_coord_as_center = config.set_coord_as_center
    self.caching_max_agent_num = config.caching_max_agent_num
    self.max_agent_num = config.max_agent_num
    self._use_filter_radius, self._filter_radius = self._read_filter_radius_settings_from_config(
        config)
    self.caching_max_static_num = config.caching_max_static_num
    self.max_static_num = config.max_static_num
    # [CHANGED] new per-type caps: pedestrians / bicycles
    self.max_pedestrians = None  #getattr(config, "max_pedestrians", 7)  #128)
    self.max_bicycles = None  #getattr(config, "max_bicycles", 3)  #64)
    self.all_car_token_to_rr_ids: Optional[Dict[str,
                                                Optional[List[str]]]] = None
    self.init_cur_fut_agents_world_8_list: Optional[List[np.ndarray]] = None
    self._map_elements = [
        'LANE', 'LEFT_BOUNDARY', 'RIGHT_BOUNDARY', 'ROUTE_LANES'
    ]  # name of map features to be extracted.
    self._caching_max_map_elements = {
        'LANE': config.caching_max_lane_num,
        'LEFT_BOUNDARY': config.caching_max_lane_num,
        'RIGHT_BOUNDARY': config.caching_max_lane_num,
        'ROUTE_LANES': config.caching_max_lane_num
    }  # maximum number of elements to extract per feature layer.
    self._max_map_elements = {
        'LANE': config.max_lane_num,
        'LEFT_BOUNDARY': config.max_lane_num,
        'RIGHT_BOUNDARY': config.max_lane_num,
        'ROUTE_LANES': config.max_lane_num
    }  # maximum number of elements to extract per feature layer.
    self._map_points_num = {
        'LANE': config.lane_len,
        'LEFT_BOUNDARY': config.lane_len,
        'RIGHT_BOUNDARY': config.lane_len,
        'ROUTE_LANES': config.lane_len
    }  # maximum number of points per feature to extract per feature layer.

@staticmethod
def _build_origin_world_pose(
        ego_cur_pose_np: np.
    ndarray,  # shape: (3,) = [x_world, y_world, yaw_world]
) -> np.ndarray:
    """Build (x, y, cos, sin) describing where the origin of the current sample's ego frame lies in the world frame.

    Most of the values we store are expressed in the ego frame,
    i.e. the frame whose origin (0, 0) is the current ego position.

    To restore those values to the world frame later,
    we must know where that origin was in the world frame.

    This function produces that information in the following form.

    - origin_world_pose = [x_world, y_world, cos(yaw_world), sin(yaw_world)]
      shape: (4,)

    (x_world, y_world, yaw_world) are taken from the input `ego_cur_pose_np`.
    `ego_cur_pose_np` is the value the code already uses as the reference point (origin) of the ego-frame transform,
    so storing it as-is matches exactly the reference point used when the data was created.

    Args:
        ego_cur_pose_np (np.ndarray):
            shape: (3,)
            - [x_world, y_world, yaw_world]
            - current ego position/heading (radians) in the world frame

    Returns:
        np.ndarray:
            shape: (4,)
            - [x_world, y_world, cos(yaw_world), sin(yaw_world)]
            - dtype: float32
    """
    if not isinstance(ego_cur_pose_np, np.ndarray):
        raise TypeError(
            f"`ego_cur_pose_np` must be an np.ndarray. got {type(ego_cur_pose_np)}"
        )
    if ego_cur_pose_np.shape != (3,):
        raise ValueError(
            f"`ego_cur_pose_np` must have shape (3,). got {ego_cur_pose_np.shape}"
        )

    x_world: float = float(ego_cur_pose_np[0])
    y_world: float = float(ego_cur_pose_np[1])
    yaw_world: float = float(ego_cur_pose_np[2])

    # origin_world_pose: shape (4,) = [x, y, cos(yaw), sin(yaw)]
    origin_world_pose: np.ndarray = np.array(
        [x_world, y_world,
         np.cos(yaw_world),
         np.sin(yaw_world)],
        dtype=np.float32,
    )
    return origin_world_pose

@staticmethod
def _slice_neighbor_cur_fut_horizon_11dim(
    neighbor_cur_fut_all_gt_11_dim: np.ndarray,  # shape: (N, T_all, 11)
    iteration: int,
    future_len: int,
) -> np.ndarray:
    """From a neighbor's full (current-to-future) sequence, extract a fixed-length (current + future) window anchored at a given iteration.

    Why this function is needed
    ---------------------------
    observation_adapter() builds the (current-to-future) data for the whole scenario up front,
    and at every step slices future_len frames forward from the iteration position.

    However, when applying the rule "no invalid point may lie between two valid points",
    only the **101 frames (= past_len + future_len) needed at the current step** must be considered.
    (If frames from the far future are mixed in, they can cause the earlier span to be filled incorrectly.)

    So this function:
    - takes neighbor_cur_fut_all_gt_11_dim (N, T_all, 11),
    - extracts [iteration ... iteration + future_len] (1 + future_len frames in total)
      and returns them as (N, 1+future_len, 11).
    - Frames outside the available range are zero-padded.

    Args:
        neighbor_cur_fut_all_gt_11_dim (np.ndarray):
            shape: (N, T_all, 11)
            - N: number of agents
            - T_all: total time length of the scenario (current frame included)
            - 11: [x, y, cos, sin, vx, vy, width, length, onehot(3)]
        iteration (int):
            current step index (0-based).
            This value is assumed to be the position of the "current frame".
        future_len (int):
            number of frames to use after the current one (current frame excluded).
            e.g. 80 gives an output length of 81 (= 1 current + 80 future)

    Returns:
        np.ndarray:
            shape: (N, 1 + future_len, 11)
            - index 0: current frame
            - index 1..future_len: future frames
            - missing frames are filled with 0
    """
    if neighbor_cur_fut_all_gt_11_dim.ndim != 3 or neighbor_cur_fut_all_gt_11_dim.shape[
            -1] != 11:
        raise ValueError(
            f"`neighbor_cur_fut_all_gt_11_dim` must have shape (N, T_all, 11). "
            f"got {neighbor_cur_fut_all_gt_11_dim.shape}")
    if iteration < 0:
        raise ValueError(f"`iteration` must be >= 0. got {iteration}")
    if future_len < 0:
        raise ValueError(f"`future_len` must be >= 0. got {future_len}")

    N: int = int(neighbor_cur_fut_all_gt_11_dim.shape[0])
    T_all: int = int(neighbor_cur_fut_all_gt_11_dim.shape[1])
    out_len: int = int(1 + future_len)

    # out: (N, 1+future_len, 11)
    out: np.ndarray = np.zeros((N, out_len, 11),
                               dtype=neighbor_cur_fut_all_gt_11_dim.dtype)

    start: int = int(iteration)
    end: int = int(iteration + out_len)

    if start >= T_all:
        return out

    copy_end: int = int(min(T_all, end))
    copy_len: int = int(copy_end - start)  # number of frames that can be copied

    out[:, :copy_len, :] = neighbor_cur_fut_all_gt_11_dim[:,
                                                          start:copy_end, :]
    return out

def _build_neighbor_future_gt_from_past_and_cur_fut_11dim(
    self,
    neighbor_agents_past: np.ndarray,  # shape: (N, Tp, 11)
    neighbor_cur_fut_gt_11_dim: np.
    ndarray,  # shape: (N, 1+Tf, 11)  (0번이 현재)
) -> Tuple[np.ndarray, np.ndarray, np.ndarray]:
    """Merge a neighbor's past with its (current + future), clean them up to satisfy the rules, and build the final outputs.

    Rules that must hold (per agent)
    --------------------------------
    1) If the current point (last element of past) is invalid:
       - all 101 frames (past, current, future) must be invalid (= all 0).

    2) If the current point is valid:
       - within the 101 frames there must be no invalid point between two valid points.
       - That is, if the valid frames are interrupted in the middle,
         the gap is filled with linearly interpolated x/y/cos/sin/vx/vy values
         so that the valid span becomes one contiguous block.
       - Everything outside the valid span (prefix/suffix) is left as 0.

    Args:
        neighbor_agents_past (np.ndarray):
            shape: (N, Tp, 11)
            - Tp = time_len (e.g. 21)
            - the last index (Tp-1) is the "current frame"
        neighbor_cur_fut_gt_11_dim (np.ndarray):
            shape: (N, 1+Tf, 11)
            - index 0 is the "current frame"
            - indices 1..Tf are the "future"
            - Tf = future_len (e.g. 80)

    Returns:
        Tuple[np.ndarray, np.ndarray, np.ndarray]:
            - fixed_neighbor_agents_past: shape (N, Tp, 11)
            - neighbor_future_gt_11_dim:  shape (N, Tf, 11)  (current frame excluded)
            - neighbor_future_gt_3_dim:   shape (N, Tf, 3)   [x, y, yaw]
    """
    fixed_neighbor_agents_past, neighbor_future_gt_11_dim = self._merge_and_interpolate_neighbor_11dim(
        neighbor_agents_past=neighbor_agents_past,
        neighbor_cur_fut_gt_11_dim=neighbor_cur_fut_gt_11_dim,
    )

    # 11 -> 3 (yaw is computed from cos/sin)
    neighbor_future_gt_3_dim: np.ndarray = self._traj11_to_traj3_yaw(
        neighbor_future_gt_11_dim)

    return fixed_neighbor_agents_past, neighbor_future_gt_11_dim, neighbor_future_gt_3_dim

def _enforce_no_invalid_between_valid_in_ego_past(
    self,
    ego_agent_past: np.ndarray,  # shape: (Tp, 11)
    *,
    eps: float = 1e-8,
) -> np.ndarray:
    """Clean up ego_agent_past (past to current) on its own so that no "valid-invalid-valid" pattern occurs.

    observation_adapter() does not output the ego future GT,
    so the rule is enforced using only the past (past to current).

    Rules
    -----
    - When the current frame (= last element of past) is valid:
      · no invalid point (0) may sit between two valid points within past.
      · So the first and last valid frame positions in past are located,
        and any empty frames between them get x/y/cos/sin/vx/vy filled by linear interpolation.
      · Everything outside the valid span is left as 0.

    - When the current frame is invalid:
      · to be safe, the whole past is set to 0.
        (The ego normally always exists; this guards against abnormal cases.)

    Args:
        ego_agent_past (np.ndarray):
            shape: (Tp, 11)
            Tp is usually 21 (2 s of past + current)
        eps (float):
            threshold below which a value is treated as "0"

    Returns:
        np.ndarray:
            shape: (Tp, 11)
            ego_agent_past cleaned up to satisfy the rules (a new array)
    """
    if ego_agent_past.ndim != 2 or ego_agent_past.shape[-1] != 11:
        raise ValueError(
            f"`ego_agent_past` must have shape (Tp, 11). got {ego_agent_past.shape}"
        )

    Tp: int = int(ego_agent_past.shape[0])
    if Tp == 0:
        return ego_agent_past

    traj: np.ndarray = ego_agent_past.astype(np.float32,
                                             copy=True)  # (Tp, 11)
    current_index: int = Tp - 1

    # valid_mask_1d: (Tp,)
    valid_mask_1d: np.ndarray = (np.abs(traj[:, :8]) > eps).any(axis=1)

    # if the current frame is invalid, zero the whole past
    if not bool(valid_mask_1d[current_index]):
        return np.zeros_like(traj)

    valid_idx: np.ndarray = np.nonzero(valid_mask_1d)[0]
    if valid_idx.size == 0:
        return np.zeros_like(traj)

    region_mask: np.ndarray = np.zeros((Tp,), dtype=bool)

    if valid_idx.size == 1:
        region_mask[int(valid_idx[0])] = True
    else:
        first_valid: int = int(valid_idx[0])
        last_valid: int = int(valid_idx[-1])
        region_mask[first_valid:last_valid + 1] = True

        # if there are gaps in the middle, fill x/y/cos/sin/vx/vy (0~5)
        if last_valid - first_valid + 1 > valid_idx.size:
            xs: np.ndarray = valid_idx.astype(np.float64)  # (K,)
            seg_idx: np.ndarray = np.arange(first_valid,
                                            last_valid + 1,
                                            dtype=np.float64)

            for dim_idx in range(6):  # 0~5
                ys: np.ndarray = traj[valid_idx, dim_idx].astype(np.float64,
                                                                 copy=False)
                interp_vals: np.ndarray = np.interp(seg_idx, xs, ys)
                traj[first_valid:last_valid + 1,
                     dim_idx] = interp_vals.astype(np.float32, copy=False)

    # type/size use the "current frame" values as representative
    type_vec: np.ndarray = traj[current_index,
                                8:11].astype(np.float32, copy=False)  # (3,)
    rep_size: np.ndarray = traj[current_index,
                                6:8].astype(np.float32, copy=False)  # (2,)

    # one-hot only inside the valid span
    traj[:, 8:11] = 0.0
    traj[region_mask, 8:11] = type_vec

    # everything outside the valid span is fully 0
    traj[~region_mask, :] = 0.0

    # fill width/length + clean up cos/sin
    traj_b: np.ndarray = traj[None, :, :]  # (1, Tp, 11)
    region_b: np.ndarray = region_mask[None, :]  # (1, Tp)
    rep_size_b: np.ndarray = rep_size[None, :]  # (1, 2)

    traj_b = self._fill_width_length_with_representative_size(
        traj_11=traj_b,
        valid_mask=region_b,
        rep_size=rep_size_b,
    )
    traj_b = self._normalize_cos_sin_in_traj_11(
        traj_11=traj_b,
        valid_mask=region_b,
    )
    return traj_b[0]  # (Tp, 11)

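def _get_map_query_radius_m(self) -> float:
    """Return the radius (m) used when querying the map and road facilities (stop signs, crosswalks, etc.).

    This radius is different in nature from the option that turns filtering on/off (use_filter_radius).

    - use_filter_radius controls whether agents/static objects are cut off by distance.
    - The map and road facilities, however, would return far too much data without a radius,
      which can cause speed/memory problems.

    So the map/road-facility query radius is kept separate and always uses
    the filter_radius value as-is, regardless of use_filter_radius.

    Returns:
        float:
            map/road-facility query radius in meters.
    """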
    return float(self._filter_radius)

@staticmethod
def _read_filter_radius_settings_from_config(
        config: object) -> Tuple[bool, float]:
    """Read the filtering settings (use_filter_radius, filter_radius) from config.

    This class is forced to work only when the two values below "exist explicitly" in config.
      - use_filter_radius (bool)
      - filter_radius (float, meter)

    If either one is missing, an error is raised immediately and execution stops,
    so that a wrong configuration never silently falls back to some other default.

    Args:
        config: argument object returned by args_util.get_args() (usually argparse.Namespace)

    Returns:
        Tuple[bool, float]:
            - use_filter_radius: bool
            - filter_radius_m: float (meter)

    Raises:
        RuntimeError: when use_filter_radius or filter_radius is missing from config
    """
    missing: List[str] = []
    if not hasattr(config, "use_filter_radius"):
        missing.append("use_filter_radius")
    if not hasattr(config, "filter_radius"):
        missing.append("filter_radius")

    if len(missing) > 0:
        raise RuntimeError(
            "Required settings are missing from config. "
            "`--use_filter_radius` and `--filter_radius` must be declared in args_util.py. "
            f"missing={missing}")

    use_filter_radius = bool(getattr(config, "use_filter_radius"))
    filter_radius_m = float(getattr(config, "filter_radius"))
    return use_filter_radius, filter_radius_m

def _get_effective_filter_radius_m(self) -> Optional[float]:
    """Return the radius that is actually used under the current setting (use_filter_radius).

    - use_filter_radius=True: returns filter_radius (m).
    - use_filter_radius=False: returns None so that the caller skips filtering.

    Returns:
        Optional[float]:
            - float: radius in meters
            - None: no filtering
    """
    if bool(self._use_filter_radius):
        return float(self._filter_radius)
    return None

@staticmethod
def _adjust_ego_future_outputs_to_center_frame(
    ego_state: EgoState,
    ego_future_gt_3_dim: np.ndarray,  # shape: (T, 3)
    ego_future_gt_11_dim: np.ndarray,  # shape: (T, 11)
    *,
    set_coord_as_center: bool,
) -> Tuple[np.ndarray, np.ndarray]:
    """When the ego future trajectory output is relative to the rear axle, shift only x, y so it is relative to the center.

    Note (the key point of this change)
    -----------------------------------
    - Invalid frames (frames whose first 8 dims are all 0) must stay at (0,0,0,0,...).
    - So even when set_coord_as_center=True, the x, y shift (offset) is not applied to invalid frames.
      (It is applied to valid frames only.)

    Args:
        ego_state (EgoState):
            current ego state.
        ego_future_gt_3_dim (np.ndarray):
            shape: (T, 3)
            ego future trajectory as [x, y, yaw] (ego local frame).
        ego_future_gt_11_dim (np.ndarray):
            shape: (T, 11)
            ego future trajectory as [x, y, cos, sin, vx, vy, width, length, onehot(3)].
        set_coord_as_center (bool):
            if True, shift to the center reference; if False, return the inputs unchanged.

    Returns:
        Tuple[np.ndarray, np.ndarray]:
            (ego_future_gt_3_dim, ego_future_gt_11_dim)
            - both input arrays are modified **in place** and returned as-is.
    """
    if not set_coord_as_center:
        return ego_future_gt_3_dim, ego_future_gt_11_dim

    # nothing to do if the future length is 0
    if ego_future_gt_11_dim.size == 0:
        return ego_future_gt_3_dim, ego_future_gt_11_dim

    # (1) rear_axle -> center displacement in world coordinates
    dx_world: float = float(ego_state.center.x - ego_state.rear_axle.x)
    dy_world: float = float(ego_state.center.y - ego_state.rear_axle.y)

    # (2) rotate into local coordinates of the current heading (world -> ego heading frame)
    heading: float = float(ego_state.rear_axle.heading)
    c: float = float(np.cos(heading))
    s: float = float(np.sin(heading))

    # local = R^T * world,  R = [[c, -s],[s, c]]
    offset_x_local: float = dx_world * c + dy_world * s
    offset_y_local: float = -dx_world * s + dy_world * c

    # ✅ (key point) valid/invalid mask
    # - invalid frame: all 8 of [x, y, cos, sin, vx, vy, width, length] are 0
    eps: float = 1e-8
    # valid_mask: shape (T,)
    valid_mask: np.ndarray = (np.abs(ego_future_gt_11_dim[:, :8]) > eps).any(axis=1)

    # apply the origin shift to valid frames only
    ego_future_gt_3_dim[valid_mask, 0] -= offset_x_local
    ego_future_gt_3_dim[valid_mask, 1] -= offset_y_local
    ego_future_gt_11_dim[valid_mask, 0] -= offset_x_local
    ego_future_gt_11_dim[valid_mask, 1] -= offset_y_local

    return ego_future_gt_3_dim, ego_future_gt_11_dim


@staticmethod
def _normalize_cos_sin_in_traj_11(
    traj_11: np.ndarray,  # shape: (N, T, 11)
    valid_mask: np.ndarray,  # shape: (N, T), True면 유효 프레임
    *,
    eps: float = 1e-6,
    cos_index: int = 2,
    sin_index: int = 3,
) -> np.ndarray:
    """Rescale the (cos, sin) channels to unit length so that they behave like a true cos/sin pair.

    Why this is needed
    ------------------
    If cos/sin are blended linearly as plain numbers while filling intermediate frames,
    (cos, sin) may no longer lie on the unit circle (length 1).
    The "direction information" can then become ambiguous for the model.

    This function applies the following rules.
    1) Only frames with valid_mask=True are processed.
    2) If sqrt(cos^2 + sin^2) is greater than eps,
       cos and sin are divided by that length so the length becomes 1.
    3) If the length is too small (= there is effectively no direction information),
       that frame's cos/sin are set to 0.
    4) Frames with valid_mask=False get cos/sin set to 0.

    Args:
        traj_11 (np.ndarray):
            shape (N, T, 11)
        valid_mask (np.ndarray):
            shape (N, T)
        eps (float):
            small value that guards against division by zero
        cos_index (int):
            index of the cos channel (default 2)
        sin_index (int):
            index of the sin channel (default 3)

    Returns:
        np.ndarray:
            shape (N, T, 11)
            traj_11 is modified in place and returned.
    """
    if traj_11.ndim != 3 or traj_11.shape[-1] != 11:
        raise ValueError(
            f"`traj_11` must have shape (N, T, 11). got {traj_11.shape}")
    if valid_mask.shape != traj_11.shape[:2]:
        raise ValueError(
            f"`valid_mask` must have shape (N, T). got {valid_mask.shape}, expected {traj_11.shape[:2]}"
        )

    cos_v = traj_11[:, :, cos_index]  # (N, T)
    sin_v = traj_11[:, :, sin_index]  # (N, T)

    # norm: (N, T)
    norm = np.sqrt(cos_v * cos_v + sin_v * sin_v)

    # guard against division by zero
    norm_safe = np.where(norm > eps, norm, 1.0)

    cos_unit = cos_v / norm_safe
    sin_unit = sin_v / norm_safe

    # use (cos_unit, sin_unit) only where valid_mask=True and norm>eps
    good = valid_mask & (norm > eps)

    traj_11[:, :, cos_index] = np.where(good, cos_unit,
                                        0.0).astype(traj_11.dtype,
                                                    copy=False)
    traj_11[:, :, sin_index] = np.where(good, sin_unit,
                                        0.0).astype(traj_11.dtype,
                                                    copy=False)
    return traj_11

# [ADDED] statistics utility functions
# =========================
@staticmethod
def _count_valid_neighbors_by_type(
        neighbor_agents_past: np.ndarray,  # (N, Tp, 11)
) -> Tuple[int, int, int]:
    """Count agents by type, judging validity and type from the agent state at the last time step.

    Rules:
      - Validity: an agent is invalid if the first 8 dims (kinematics/size) at the last time step are all 0
        · i.e. valid = any(|state_last[:8]| > eps)
      - Type: last 3 dims (one-hot) = [vehicle, pedestrian, bicycle]
        · a value above the 0.5 threshold is read as 1 (guards against floating-point error)

    Args:
        neighbor_agents_past (np.ndarray):
            - shape: (N, Tp, 11)
            - last dim 11 = [x, y, cos, sin, vx, vy, width, length, onehot_vehicle, onehot_ped, onehot_bike]

    Returns:
        Tuple[int, int, int]: (vehicle_count, pedestrian_count, bicycle_count)

    Raises:
        ValueError: if the input is not 3-dimensional or its last dimension is not 11.
    """
    if neighbor_agents_past.ndim != 3 or neighbor_agents_past.shape[
            -1] != 11:
        raise ValueError(
            f"`neighbor_agents_past` must have shape (N, Tp, 11). "
            f"got {neighbor_agents_past.shape}")

    # use the last time step only
    last: np.ndarray = neighbor_agents_past[:, -1, :]  # (N, 11)

    # validity mask: valid if any of the first 8 dims has |.| > eps
    eps = 1e-8
    valid_mask: np.ndarray = (np.abs(last[:, :8]) > eps).any(axis=1)  # (N,)

    # type one-hot (vehicle, pedestrian, bicycle)
    type_oh: np.ndarray = last[:, 8:11]  # (N, 3)
    veh_mask = type_oh[:, 0] > 0.5
    ped_mask = type_oh[:, 1] > 0.5
    bik_mask = type_oh[:, 2] > 0.5

    vehicle_count = int(np.sum(valid_mask & veh_mask))
    pedestrian_count = int(np.sum(valid_mask & ped_mask))
    bicycle_count = int(np.sum(valid_mask & bik_mask))

    return vehicle_count, pedestrian_count, bicycle_count

@staticmethod
def _compute_lane_speed_stats(
    vector_map_output: Dict[str,
                            np.ndarray],) -> Tuple[float, Optional[float]]:
    """Compute lane-related statistics.

    The denominator is the number of 'valid lanes':
        - lanes in vector_map_output['lanes'] whose first 8 channels have sum(|.|) > 0

    Statistics:
        - share of lanes with a speed limit (%):
            100 * (#(valid ∧ has_speed_limit True)) / (#valid)
        - mean speed limit over the lanes that have one (km/h):
            mean(lanes_speed_limit[valid ∧ True]) * 3.6
            (None is returned if there are none)

    Args:
        vector_map_output: dict returned by map_process(...)

    Returns:
        Tuple[float, Optional[float]]: (ratio_percent, mean_speed_kmh or None)
    """
    lanes: np.ndarray = vector_map_output[
        'lanes']  # (lane_num, lane_len, 12)
    has_speed: np.ndarray = vector_map_output[
        'lanes_has_speed_limit']  # (lane_num, 1) bool
    speed_mps: np.ndarray = vector_map_output[
        'lanes_speed_limit']  # (lane_num, 1) float

    # Valid-lane check:
    #   - a lane whose first 8 channels (x, y, vec, left/right, etc.) are all 0 is treated as padding
    #   - i.e. lanes whose lanes[..., :8] values are all 0 are ignored
    lanes_front8: np.ndarray = lanes[..., :8]  # (lane_num, lane_len, 8)
    lanes_valid_mask = (np.abs(lanes_front8).sum(axis=(1, 2))
                        > 0)  # (lane_num,)
    if lanes_valid_mask.sum() == 0:
        return 0.0, None

    has_speed_mask = (has_speed.reshape(-1).astype(bool)
                     ) & lanes_valid_mask  # (lane_num,)
    ratio_percent = float(100.0 * has_speed_mask.sum() /
                          lanes_valid_mask.sum())

    mean_speed_kmh: Optional[float] = None
    if has_speed_mask.any():
        mean_speed_kmh = float(
            speed_mps.reshape(-1)[has_speed_mask].mean() * 3.6)

    return ratio_percent, mean_speed_kmh

def _save_sample_stats_json(
    self,
    map_name: str,
    token: str,
    stats: Dict[str, Union[int, float, None]],
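) -> None:
    """Save per-sample statistics to `{save_dir}/json_temp/{map_name}_{token}.stats.json`.

    For an atomic save, the data is written to `.tmp` first and then renamed to the final file name.

    Args:
        map_name: map name
        token: scenario token
        stats: dictionary of statistics to save
    """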
    if not self._save_dir:
        return
    os.makedirs(self._save_dir, exist_ok=True)
    json_temp_folder = os.path.join(self._save_dir, "json_temp")
    os.makedirs(json_temp_folder, exist_ok=True)
    out_path = os.path.join(json_temp_folder,
                            f"{map_name}_{token}.stats.json")
    tmp_path = out_path + ".tmp"
    with open(tmp_path, "w") as f:
        json.dump(stats, f, indent=2)
    os.replace(tmp_path, out_path)

def _get_car_token_to_rr_ids(
    self,
    all_car_token_to_rr_ids: Dict[str, Optional[List[str]]],
    neighbor_track_token: List[str]  # len = chosen_agent_num
) -> Dict[str, List[str]]:  # len = chosen_car_num
    car_token_to_rr_ids: Dict[str, Optional[List[str]]] = {}
    for token in neighbor_track_token:
        if token in all_car_token_to_rr_ids:
            car_token_to_rr_ids[token] = all_car_token_to_rr_ids[token]
    return car_token_to_rr_ids

def _get_past_cur_ego_feature(
    self,
    scenario: Optional[NuPlanScenario] = None,
    history_buffer: Optional[SimulationHistoryBuffer] = None,
    *,
    set_coord_as_center: bool = False,
) -> Tuple[EgoState, Point2D, float, np.ndarray, np.ndarray,
           Optional[np.ndarray]]:
    """Extract the ego trajectory in a common format from either a scenario or a history buffer.

    (The middle of the existing docstring is kept; only the following item is a new concept.)
    - If set_coord_as_center=True:
      the reference point of the ego-frame transform is the ego center rather than the rear axle.
      That is, ego_point2d / ego_heading / ego_cur_pose_np are set relative to the center.
    """
    if (scenario is None and history_buffer is None) or \
       (scenario is not None and history_buffer is not None):
        raise ValueError("Exactly one of scenario or history_buffer must be provided.")

    if scenario is not None:
        ego_state: EgoState = scenario.initial_ego_state
    else:
        ego_state = history_buffer.current_state[
            0]  # type: ignore[union-attr]

    # ✅ reference point: rear_axle (default) vs center
    if set_coord_as_center:
        ref = ego_state.center
    else:
        ref = ego_state.rear_axle

    ego_point2d = Point2D(ref.x, ref.y)
    ego_heading: float = float(ref.heading)
    ego_cur_pose_np = np.array([ref.x, ref.y, ref.heading],
                               dtype=np.float64)  # shape: (3,)

    if scenario is not None:
        (past_cur_ego_world_10,
         past_cur_time_np) = get_ego_past_array_from_scenario(
             scenario,
             self.num_past_poses,
             self.past_time_horizon,
         )
    else:
        ego_state_buffer: Deque[EgoState] = history_buffer.ego_state_buffer
        past_cur_ego_world_10 = sampled_ego_objects_to_array_list(
            ego_state_buffer)
        past_cur_time_np = None

    assert past_cur_ego_world_10.shape[0] == self.num_past_poses + 1, \
        f"Expected past_cur_ego_world_10 shape[0] == {self.num_past_poses + 1}, got {past_cur_ego_world_10.shape[0]}"

    return (
        ego_state,
        ego_point2d,
        ego_heading,
        ego_cur_pose_np,
        past_cur_ego_world_10,
        past_cur_time_np,
    )

def _prepare_car_token_to_rr_ids(
    self,
    scenario: NuPlanScenario,
    use_route_lanes: bool = False,
    neighbor_track_token: Optional[List[str]] = None,
) -> Dict[str, List[str]]:
    if use_route_lanes and self.all_car_token_to_rr_ids is None:
        present_tracked_objects: TrackedObjects \
            = scenario.initial_tracked_objects.tracked_objects
        past_tracked_objects: List[TrackedObjects] = [
            tracked_objects.tracked_objects
            for tracked_objects in scenario.get_past_tracked_objects(
                iteration=0,
                time_horizon=self.past_time_horizon,
                num_samples=self.num_past_poses)
        ]
        past_cur_tracked_objects = past_tracked_objects + [
            present_tracked_objects
        ]
        self.all_car_token_to_rr_ids: Dict[
            str, List[str]] = get_npc_route_roadblock_ids(
                scenario,
                past_cur_tracked_objects,
                neighbor_track_token=None)
    elif not use_route_lanes:
        self.all_car_token_to_rr_ids = {}
    # len = chosen_car_num
    car_token_to_rr_ids: Dict[str,
                              List[str]] = self._get_car_token_to_rr_ids(
                                  self.all_car_token_to_rr_ids,
                                  neighbor_track_token)
    return car_token_to_rr_ids

def _get_cur_fut_agents_world_8_list(
    self,
    scenario: NuPlanScenario,
    token_to_id: Dict[str, int],
    do_inference: bool,
):
    if do_inference:
        if self.init_cur_fut_agents_world_8_list is None:
            scenario_duration: float = scenario.duration_s.time_s + self.future_time_horizon
            # 10 samples per second over scenario_duration.
            num_samples = int(scenario_duration * 10.0)
            # self.init_cur_fut_agents_world_8_list: List[np.ndarray]
            #   - length: 1 + num_samples
            #   - shape of each element: (frame_agents_num_t, 8)
            (self.init_cur_fut_agents_world_8_list,
             _) = self._get_future_tracked_objects_array_list(
                 scenario,
                 token_to_id=token_to_id,
                 iteration=0,
                 future_time_horizon=scenario_duration,
                 num_samples=num_samples)

        # Deep-copy the cached list; the selected agents are extracted and
        # converted to the ego frame downstream.
        cur_fut_agents_world_8_list = copy.deepcopy(
            self.init_cur_fut_agents_world_8_list)
    else:
        (cur_fut_agents_world_8_list,
         _) = self._get_future_tracked_objects_array_list(
             scenario, token_to_id=token_to_id, iteration=0)
    return cur_fut_agents_world_8_list

# Used for inference
def observation_adapter(
    self,
    iteration: int,
    history_buffer: SimulationHistoryBuffer,
    traffic_light_data: List[TrafficLightStatusData],
    map_api: NuPlanMap,
    device='cpu',
    scenario: Optional[NuPlanScenario] = None,
    use_route_lanes: bool = False,
    squeeze: bool = False,
) -> Dict[str, torch.Tensor]:

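    # `scenario` defaults to None in the signature, but the future-GT, map,
    # and road-safety steps below all dereference it.
    if scenario is None:
        raise ValueError("observation_adapter requires a scenario.")

    (ego_state, ego_point2d, ego_heading, ego_cur_pose_np,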
     past_cur_ego_world_10, _) = self._get_past_cur_ego_feature(
         history_buffer=history_buffer,
         set_coord_as_center=self.set_coord_as_center,
     )

    # Added: store the world-frame pose of the ego-frame origin.
    # origin_world_pose: shape (4,) = [x_world, y_world, cos(yaw), sin(yaw)]
    origin_world_pose: np.ndarray = self._build_origin_world_pose(
        ego_cur_pose_np)

    ego_agent_past = build_ego_past_feature(
        past_cur_ego_world_10=past_cur_ego_world_10,
        ego_cur_pose_np=ego_cur_pose_np,
    )
    # Past observations including the current
    observation_buffer: Deque[
        Observation] = history_buffer.observation_buffer

    (
        past_cur_agents_world_8_list,
        past_cur_agents_types_list,
        present_static_feat_5,
        static_types_list,
        token_to_id,
        _,
        _,
    ) = self._get_past_cur_agents_feature(
        observation_buffer=observation_buffer)

    (neighbor_agents_past, agents_cur_frame_indices, neighbors_id,
     neighbor_track_token) = build_neighbor_past_feature(
         past_cur_agents_world_8_list=past_cur_agents_world_8_list,
         past_cur_agents_types_list=past_cur_agents_types_list,
         max_agent_num=self.max_agent_num,
         ego_cur_pose_np=ego_cur_pose_np,
         max_pedestrians=self.max_pedestrians,
         max_bicycles=self.max_bicycles,
         token_to_id=token_to_id,
         filter_radius=None,
     )

    ego_time_len = ego_agent_past.shape[0]
    neighbor_time_len = neighbor_agents_past.shape[1]
    assert ego_time_len == neighbor_time_len == self.num_past_poses + 1, \
        f"Expected time length {self.num_past_poses + 1}, got ego {ego_time_len}, neighbor {neighbor_time_len}"

    cur_fut_agents_world_8_list = self._get_cur_fut_agents_world_8_list(
        scenario, token_to_id, do_inference=True)

    # (N, 1 + Tf_all, 11)
    neighbor_cur_fut_all_gt_11_dim = agent_future_all_process(
        ego_cur_pose_np=ego_cur_pose_np,
        cur_fut_agents_world_8_list=cur_fut_agents_world_8_list,
        neighbor_token_id=neighbors_id,
        neighbor_agents_past=neighbor_agents_past,
    )

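    # (Requirement a) Slice only the current frame plus 80 future frames,
    #    relative to the current iteration, and enforce the rules over the
    #    101 frames obtained by joining them with the 21 past frames.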
    neighbor_cur_fut_horizon_gt_11_dim = self._slice_neighbor_cur_fut_horizon_11dim(
        neighbor_cur_fut_all_gt_11_dim=neighbor_cur_fut_all_gt_11_dim,
        iteration=iteration,
        future_len=self.num_future_poses,
    )

    # Apply the rules and build the final neighbor_future_gt_3_dim / neighbor_future_gt_11_dim.
    neighbor_agents_past, neighbor_future_gt_11_dim, neighbor_future_gt_3_dim = \
        self._build_neighbor_future_gt_from_past_and_cur_fut_11dim(
            neighbor_agents_past=neighbor_agents_past,
            neighbor_cur_fut_gt_11_dim=neighbor_cur_fut_horizon_gt_11_dim,
        )

    # (Optional) Keep exporting the full-length future_all as before.
    # - Only the raw conversion is provided here; the 101-frame rules are not applied.
    # - shape: (N, Tf_all, 3)
    # Drop the current frame (index 0).
    neighbor_future_all_gt_11_dim = neighbor_cur_fut_all_gt_11_dim[:, 1:, :]
    neighbor_future_all_gt_3_dim = self._traj11_to_traj3_yaw(
        neighbor_future_all_gt_11_dim)

    static_objects = build_static_feature(
        present_static_feat_5=present_static_feat_5,
        static_types_list=static_types_list,
        max_static_num=self.max_static_num,
        ego_cur_pose_np=ego_cur_pose_np,
        filter_radius=self._get_effective_filter_radius_m(),
    )
    key_to_array = {
        "origin_world_pose": origin_world_pose,  # (4,)
        "ego_agent_past": ego_agent_past,  # (time_len, 11)
        "neighbor_agents_past": neighbor_agents_past,
        # (chosen_agent_num, time_len, 11)

        # (Requirement a) Final future GT with the rules applied
        "neighbor_future_gt_3_dim": neighbor_future_gt_3_dim,
        # (chosen_agent_num, future_len, 3)
        "neighbor_future_gt_11_dim": neighbor_future_gt_11_dim,
        # (chosen_agent_num, future_len, 11)

        # Existing key kept (if needed):
        "neighbor_future_all_gt_3_dim": neighbor_future_all_gt_3_dim,
        # (chosen_agent_num, future_all_len, 3)
        "static_objects": static_objects,  # (chosen_static_num, 10)
    }

    key_to_road_safety = self._get_road_safety_features(
        scenario=scenario,
        ego_cur_pose_np=ego_cur_pose_np,
    )
    key_to_array.update(key_to_road_safety)

    (
        route_roadblock_ids,
        elements_to_obj_polylines,
        elements_to_traffic_light,
        speed_limit_dict,
        lanes_roadblock_id_list,
    ) = self._prepare_map(
        scenario=scenario,
        ego_state=ego_state,
        ego_point2d=ego_point2d,
        ego_heading=ego_heading,
        map_api=map_api,
        traffic_light_data=traffic_light_data,
    )

    car_token_to_rr_ids: Dict[
        str, List[str]] = self._prepare_car_token_to_rr_ids(
            scenario=scenario,
            use_route_lanes=use_route_lanes,
            neighbor_track_token=neighbor_track_token,
        )

    neighbor_agents_current = neighbor_agents_past[:, -1, :]
    map_key_to_array = map_process(
        route_roadblock_ids, car_token_to_rr_ids, neighbor_track_token,
        neighbor_agents_current, ego_cur_pose_np, elements_to_obj_polylines,
        elements_to_traffic_light, speed_limit_dict,
        lanes_roadblock_id_list, self._map_elements, self._max_map_elements,
        self._map_points_num)
    key_to_array.update(map_key_to_array)

    key_to_array = convert_data_dict_to_device_tensors(
        key_to_array, device, squeeze)

    key_to_array["neighbor_track_token"] = neighbor_track_token

    return key_to_array

@staticmethod
def zero_out_random_time_prefix(
        neighbor_agents_past: np.ndarray) -> np.ndarray:
    """Zero out a random-length prefix of the time axis for every agent.

    Draws a random `target` from [0, time_len // 2) and sets
    neighbor_agents_past[:, :target, :8] to 0 on a copy of the input.

    Args:
        neighbor_agents_past (np.ndarray):
            Input array. shape = (num_agents, time_len, 11)

    Returns:
        np.ndarray:
            Copy of the input with the selected time prefix zeroed. Same shape.
    """
    num_agents, time_len, feature_dim = neighbor_agents_past.shape

    # Work on a copy so the caller's array is left untouched
    # (drop the copy to modify the input in place).
    modified_past: np.ndarray = neighbor_agents_past.copy()

    # np.random.randint(0, 0) raises ValueError, so skip when time_len < 2.
    if time_len < 2:
        return modified_past

    # Pick a random target in [0, time_len // 2).
    target: int = int(np.random.randint(0, time_len // 2))

    # Zero only the first 8 features; the type one-hot (8:11) is kept.
    modified_past[:, :target, :8] = 0.0

    return modified_past

def _merge_and_interpolate_ego_11dim(
    self,
    ego_agent_past: np.ndarray,  # shape: (Tp, 11)
    ego_future_gt_11_dim: np.ndarray,  # shape: (Tf, 11)
    *,
    eps: float = 1e-8,
) -> Tuple[np.ndarray, np.ndarray]:
    """Merge ego past/current and future, and fill invalid (0) gaps between valid frames.

    - The ego is normally always present, but the end of the future can be
      zero-padded (e.g. when the scenario is short), so the rule is enforced
      once more to be safe.

    Rules
    -----
    - Processing happens only when the current frame (= last past frame) is valid.
    - Viewing all 101 frames from past to future, if invalid frames sit
      between two valid frames, they are filled with linearly interpolated
      x/y/cos/sin/vx/vy values so that the valid region becomes contiguous.
    - Frames outside the region (prefix/suffix) are left as 0.

    Args:
        ego_agent_past (np.ndarray):
            shape: (Tp, 11)  # Tp=21
        ego_future_gt_11_dim (np.ndarray):
            shape: (Tf, 11)  # Tf=80

    Returns:
        Tuple[np.ndarray, np.ndarray]:
            - new_ego_agent_past: shape (Tp, 11)
            - new_ego_future_11:  shape (Tf, 11)
    """
    if ego_agent_past.ndim != 2 or ego_agent_past.shape[-1] != 11:
        raise ValueError(
            f"`ego_agent_past` must have shape (Tp, 11). got {ego_agent_past.shape}"
        )
    if ego_future_gt_11_dim.ndim != 2 or ego_future_gt_11_dim.shape[
            -1] != 11:
        raise ValueError(
            f"`ego_future_gt_11_dim` must have shape (Tf, 11). got {ego_future_gt_11_dim.shape}"
        )

    Tp: int = int(ego_agent_past.shape[0])
    Tf: int = int(ego_future_gt_11_dim.shape[0])
    if Tp == 0 or Tf == 0:
        return ego_agent_past, ego_future_gt_11_dim

    # full: (Tp+Tf, 11) == (101, 11)
    full: np.ndarray = np.concatenate(
        [ego_agent_past, ego_future_gt_11_dim], axis=0).astype(np.float32,
                                                               copy=True)
    T_full: int = int(full.shape[0])
    current_index: int = Tp - 1

    # A frame is valid if any of its first 8 values is non-zero.
    valid_mask_1d: np.ndarray = (np.abs(full[:, :8])
                                 > eps).any(axis=1)  # (T_full,)

    # Enforce rule (b) only when the current frame is valid.
    if not bool(valid_mask_1d[current_index]):
        return ego_agent_past, ego_future_gt_11_dim

    valid_idx: np.ndarray = np.nonzero(valid_mask_1d)[0]  # (K,)
    if valid_idx.size == 0:
        return ego_agent_past, ego_future_gt_11_dim

    region_mask: np.ndarray = np.zeros((T_full,), dtype=bool)

    if valid_idx.size == 1:
        region_mask[int(valid_idx[0])] = True
    else:
        first_valid: int = int(valid_idx[0])
        last_valid: int = int(valid_idx[-1])
        region_mask[first_valid:last_valid + 1] = True

        # If there are holes in between, fill x/y/cos/sin/vx/vy.
        if last_valid - first_valid + 1 > valid_idx.size:
            xs: np.ndarray = valid_idx.astype(np.float64)
            seg_idx: np.ndarray = np.arange(first_valid,
                                            last_valid + 1,
                                            dtype=np.float64)

            for dim_idx in range(6):
                ys: np.ndarray = full[valid_idx, dim_idx].astype(np.float64,
                                                                 copy=False)
                interp_vals: np.ndarray = np.interp(seg_idx, xs, ys)
                full[first_valid:last_valid + 1,
                     dim_idx] = interp_vals.astype(np.float32, copy=False)

    # Use the current-frame values as the representative type/size.
    type_vec: np.ndarray = ego_agent_past[-1,
                                          8:11].astype(np.float32,
                                                       copy=False)  # (3,)
    rep_size: np.ndarray = ego_agent_past[-1,
                                          6:8].astype(np.float32,
                                                      copy=False)  # (2,)

    full[:, 8:11] = 0.0
    full[region_mask, 8:11] = type_vec

    # Zero everything outside the region.
    full[~region_mask, :] = 0.0

    # Fill width/length and tidy up cos/sin (reshaped to (1, T, 11) to reuse the batched helpers).
    full_b: np.ndarray = full[None, :, :]  # (1, T_full, 11)
    region_b: np.ndarray = region_mask[None, :]  # (1, T_full)
    rep_size_b: np.ndarray = rep_size[None, :]  # (1, 2)

    full_b = self._fill_width_length_with_representative_size(
        traj_11=full_b,
        valid_mask=region_b,
        rep_size=rep_size_b,
    )
    full_b = self._normalize_cos_sin_in_traj_11(
        traj_11=full_b,
        valid_mask=region_b,
    )
    full = full_b[0]  # (T_full, 11)

    new_ego_agent_past: np.ndarray = full[:Tp, :]
    new_ego_future_11: np.ndarray = full[Tp:, :]
    return new_ego_agent_past, new_ego_future_11
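
# Illustrative sketch (not part of the original pipeline): the gap-filling rule
# used by _merge_and_interpolate_ego_11dim, reduced to one feature column.
# `_demo_fill_gaps_1d` is a hypothetical helper name introduced only for this
# example. It assumes, like the code above, that a zero value marks a missing
# frame and that holes between valid frames are filled with np.interp.
import numpy as np


def _demo_fill_gaps_1d(values: np.ndarray, eps: float = 1e-8) -> np.ndarray:
    """Linearly fill invalid (zero) frames lying between valid frames."""
    out = values.astype(np.float64, copy=True)
    valid_idx = np.nonzero(np.abs(out) > eps)[0]
    if valid_idx.size < 2:
        return out  # nothing to interpolate between
    first_valid, last_valid = int(valid_idx[0]), int(valid_idx[-1])
    seg_idx = np.arange(first_valid, last_valid + 1, dtype=np.float64)
    # np.interp reproduces the valid samples and fills the holes linearly.
    out[first_valid:last_valid + 1] = np.interp(
        seg_idx, valid_idx.astype(np.float64), out[valid_idx])
    return out


# Usage: for [0, 1, 0, 0, 4, 0], frames 2 and 3 lie between valid frames 1 and
# 4 and become 2 and 3; the leading and trailing zeros are outside the valid
# region and stay 0.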

@staticmethod
def _traj11_to_traj3_yaw(traj_11: np.ndarray) -> np.ndarray:
    """Convert an 11-dim trajectory into the 3-dim [x, y, yaw] form.

    Args:
        traj_11 (np.ndarray):
            - (T, 11) or (N, T, 11)
            - 11 dims = [x, y, cos, sin, vx, vy, width, length, onehot(3)]

    Returns:
        np.ndarray:
            - (T, 3) or (N, T, 3)
            - 3 dims = [x, y, yaw]
    """
    if traj_11.ndim == 2:
        yaw = np.arctan2(traj_11[:, 3], traj_11[:, 2])
        return np.stack([traj_11[:, 0], traj_11[:, 1], yaw], axis=-1)
    if traj_11.ndim == 3:
        yaw = np.arctan2(traj_11[:, :, 3], traj_11[:, :, 2])
        return np.stack([traj_11[:, :, 0], traj_11[:, :, 1], yaw], axis=-1)
    raise ValueError(
        f"`traj_11` must have shape (T,11) or (N,T,11). got {traj_11.shape}")
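
# Illustrative sketch (not part of the original pipeline): a round-trip check
# for the yaw conversion above. `_demo_yaw_roundtrip` is a hypothetical helper
# name. It writes yaw as (cos, sin) into columns 2 and 3 of an 11-dim frame,
# following the layout documented in _traj11_to_traj3_yaw, and recovers it
# with np.arctan2(sin, cos).
import numpy as np


def _demo_yaw_roundtrip(yaw: np.ndarray) -> np.ndarray:
    """Encode yaw as (cos, sin) in an 11-dim array, then decode it again."""
    traj_11 = np.zeros((yaw.shape[0], 11), dtype=np.float64)
    traj_11[:, 2] = np.cos(yaw)
    traj_11[:, 3] = np.sin(yaw)
    return np.arctan2(traj_11[:, 3], traj_11[:, 2])


# Usage: yaw values inside (-pi, pi) are recovered unchanged. Zero-padded
# frames decode to yaw 0 because np.arctan2(0, 0) is 0, so a padded frame
# cannot be told apart from a real frame with yaw 0 by the yaw column alone.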

def _merge_and_interpolate_neighbor_11dim(
    self,
    neighbor_agents_past: np.ndarray,  # (max_agent_num, Tp, 11)
    neighbor_cur_fut_gt_11_dim: np.ndarray,
    # (max_agent_num, Tf, 11); index 0 is the current frame
) -> Tuple[np.ndarray, np.ndarray]:
    """Join neighbor past/current with current/future and keep agents from vanishing mid-sequence and reappearing.

    Rules guaranteed by this function (per agent)
    ---------------------------------------------
    (a) If the current frame (= last past frame) is invalid:
        - Every frame from past to future is set to 0.
        - This removes the case where an agent is absent now but suddenly
          appears in the future.

    (b) If the current frame is valid:
        - Viewing the whole sequence from past to future, no invalid (0)
          frame is left between two valid frames.
        - Method:
          1) Find the first (first_valid) and last (last_valid) valid index over the whole sequence.
          2) Turn first_valid..last_valid into a contiguous valid region and fill
             the previously empty frames in it with linearly interpolated x/y/cos/sin/vx/vy.
          3) Frames outside the region (prefix/suffix) are left as 0.

    Args:
        neighbor_agents_past (np.ndarray):
            shape: (N, Tp, 11)
        neighbor_cur_fut_gt_11_dim (np.ndarray):
            shape: (N, Tf, 11)
            Index 0 is assumed to be the current frame.

    Returns:
        Tuple[np.ndarray, np.ndarray]:
            - new_neighbor_agents_past: shape (N, Tp, 11)
            - new_neighbor_future_11:  shape (N, Tf-1, 11)  # current frame excluded
    """
    # Basic shape checks
    if neighbor_agents_past.ndim != 3 or neighbor_cur_fut_gt_11_dim.ndim != 3:
        raise ValueError(
            f"`neighbor_agents_past` / `neighbor_cur_fut_gt_11_dim` must have "
            f"shape (max_agent_num, time_len, 11). "
            f"got {neighbor_agents_past.shape}, {neighbor_cur_fut_gt_11_dim.shape}"
        )
    if neighbor_agents_past.shape[-1] != 11 or \
            neighbor_cur_fut_gt_11_dim.shape[-1] != 11:
        raise ValueError(
            f"The last dimension of both inputs must be 11. "
            f"got {neighbor_agents_past.shape[-1]}, {neighbor_cur_fut_gt_11_dim.shape[-1]}"
        )
    if neighbor_agents_past.shape[0] != neighbor_cur_fut_gt_11_dim.shape[0]:
        raise ValueError(
            "neighbor_agents_past and neighbor_cur_fut_gt_11_dim differ in size along the agent axis."
        )

    max_agent_num: int = int(neighbor_agents_past.shape[0])
    past_len: int = int(neighbor_agents_past.shape[1])  # Tp
    fut_len_with_current: int = int(
        neighbor_cur_fut_gt_11_dim.shape[1])  # Tf (including index 0)

    # If there are no future frames at all, return the inputs unchanged.
    if fut_len_with_current == 0:
        return neighbor_agents_past, neighbor_cur_fut_gt_11_dim

    # Always drop the current frame (index 0) exactly once.
    # neighbor_future_wo_current: (N, Tf-1, 11)
    neighbor_future_wo_current: np.ndarray = \
        neighbor_cur_fut_gt_11_dim[:, 1:, :]

    # With zero agents, return right away without interpolation.
    if max_agent_num == 0:
        return neighbor_agents_past, neighbor_future_wo_current

    # full_traj_11: (N, Tp + (Tf-1), 11)
    full_traj_11: np.ndarray = np.concatenate(
        [neighbor_agents_past, neighbor_future_wo_current], axis=1)
    T_full: int = int(full_traj_11.shape[1])
    current_index: int = past_len - 1  # in full_traj, current = last past frame

    # (1) Compute the representative size (default: past-to-current frames only)
    # stable_size: (N, 2) = [width_rep, length_rep]
    stable_size: np.ndarray = self._estimate_stable_neighbor_sizes(
        full_traj_11=full_traj_11,
        past_len=past_len,
        use_future=False,
    )

    # (2) Compute the valid-frame mask (valid if any of the first 8 values is non-zero)
    full_off_p_mask, _ = self._get_agents_past_cur_mask_np(full_traj_11)
    full_valid_mask: np.ndarray = ~full_off_p_mask  # (N, T_full)

    # (3) rule (a): if the current frame is invalid, zero the whole trajectory
    current_valid_mask: np.ndarray = full_valid_mask[:,
                                                     current_index]  # (N,)
    invalid_agents: np.ndarray = ~current_valid_mask

    # Result buffer
    full_traj_interp: np.ndarray = full_traj_11.astype(np.float32,
                                                       copy=True)

    # Per-agent "contiguous valid region" mask
    # region_mask_all: (N, T_full)
    region_mask_all: np.ndarray = np.zeros((max_agent_num, T_full),
                                           dtype=bool)

    # Agents whose current frame is invalid are zeroed out entirely.
    if np.any(invalid_agents):
        full_traj_interp[invalid_agents, :, :] = 0.0
        # region_mask_all stays False for these agents.

    # (4) rule (b): for agents whose current frame is valid, fill the holes between valid frames
    for agent_idx in range(max_agent_num):
        if invalid_agents[agent_idx]:
            continue

        agent_valid_idx: np.ndarray = np.nonzero(
            full_valid_mask[agent_idx])[0]
        if agent_valid_idx.size == 0:
            # A valid current frame implies at least one valid index, so this
            # branch should be unreachable; zero out defensively.
            full_traj_interp[agent_idx, :, :] = 0.0
            continue

        first_valid: int = int(agent_valid_idx[0])
        last_valid: int = int(agent_valid_idx[-1])

        if agent_valid_idx.size == 1:
            # With a single valid frame, a valid-invalid-valid pattern cannot occur, so leave it as is.
            region_mask_all[agent_idx, first_valid] = True
        else:
            # Declare first..last as the contiguous valid region.
            region_mask_all[agent_idx, first_valid:last_valid + 1] = True

            # Fill x/y/cos/sin/vx/vy by linear interpolation only when there are empty frames in between.
            if last_valid - first_valid + 1 > agent_valid_idx.size:
                xs: np.ndarray = agent_valid_idx.astype(np.float64)  # (K,)
                seg_idx: np.ndarray = np.arange(first_valid,
                                                last_valid + 1,
                                                dtype=np.float64)

                for dim_idx in range(6):  # 0~5: [x, y, cos, sin, vx, vy]
                    ys: np.ndarray = full_traj_11[agent_idx,
                                                  agent_valid_idx,
                                                  dim_idx].astype(
                                                      np.float64,
                                                      copy=False)
                    interp_vals: np.ndarray = np.interp(seg_idx, xs, ys)
                    full_traj_interp[agent_idx, first_valid:last_valid + 1,
                                     dim_idx] = interp_vals.astype(
                                         np.float32, copy=False)

        # Fix one type (one-hot) per agent and write it only inside the contiguous valid region.
        type_candidates: np.ndarray = full_traj_11[agent_idx, :,
                                                   8:11]  # (T_full, 3)
        type_valid_mask: np.ndarray = (np.abs(type_candidates).sum(axis=1)
                                       > 0)
        if np.any(type_valid_mask):
            type_vec: np.ndarray = type_candidates[type_valid_mask][
                0].astype(np.float32, copy=False)  # (3,)
        else:
            type_vec = np.zeros((3,), dtype=np.float32)

        full_traj_interp[agent_idx, :, 8:11] = 0.0
        full_traj_interp[agent_idx, region_mask_all[agent_idx],
                         8:11] = type_vec

        # Zero everything outside the contiguous valid region.
        full_traj_interp[agent_idx, ~region_mask_all[agent_idx], :] = 0.0

    # (5) Unify width/length to the representative value, only inside the contiguous valid region
    full_traj_interp = self._fill_width_length_with_representative_size(
        traj_11=full_traj_interp,
        valid_mask=region_mask_all,  # (N, T_full)
        rep_size=stable_size,  # (N, 2)
    )

    # (6) Normalize cos/sin to unit length (contiguous valid region only)
    full_traj_interp = self._normalize_cos_sin_in_traj_11(
        traj_11=full_traj_interp,
        valid_mask=region_mask_all,
    )

    # Split back into past/current and future.
    new_neighbor_agents_past: np.ndarray = full_traj_interp[:, :past_len, :]
    # Future frames only (the current frame is excluded).
    new_neighbor_future_11: np.ndarray = full_traj_interp[:, past_len:, :]

    return new_neighbor_agents_past, new_neighbor_future_11
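
# Illustrative sketch (not part of the original pipeline): rule (a) in
# isolation. `_demo_zero_agents_missing_now` is a hypothetical helper name.
# It assumes the same validity convention as above (a frame is valid when any
# of its first 8 values is non-zero) and zeroes every agent whose frame at
# `current_index` is invalid.
import numpy as np


def _demo_zero_agents_missing_now(full_traj_11: np.ndarray,
                                  current_index: int) -> np.ndarray:
    """Return a copy in which agents invalid at `current_index` are all zeros."""
    valid_mask = (full_traj_11[..., :8] != 0).any(axis=-1)  # (N, T)
    out = full_traj_11.copy()
    out[~valid_mask[:, current_index]] = 0.0
    return out


# Usage: an agent that is absent at the current frame but present in a later
# frame is removed entirely, so it cannot "pop in" during the future horizon.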

@staticmethod
def _get_agents_past_cur_mask_np(
        neighbor_agents_past: np.ndarray,  # (agents_num, time_len, 11)
) -> Tuple[np.ndarray, np.ndarray]:
    """Compute validity masks for an agent past/current sequence, with **NumPy inputs and outputs**.

    Definitions
    -----------
    - Frame-level validity (per point):
      a frame is **valid** if any of the first 8 values of the last axis
      ([x, y, cos, sin, vx, vy, width, length]) is non-zero, and **invalid**
      if all 8 are zero.
    - Agent-level validity (per agent):
      an agent is **invalid** if every one of its frames is invalid.
    - Both returned masks are "off" masks: True marks an invalid entry.

    Args:
        neighbor_agents_past (np.ndarray):
            Agent past/current sequence. shape = (agents_num, time_len, 11)

    Returns:
        Tuple[np.ndarray, np.ndarray]:
            - agents_past_cur_off_p_mask (np.ndarray): shape = (agents_num, time_len), dtype=bool
              Whether each frame is **invalid (True)** (point-level mask).
            - agents_past_cur_off_mask (np.ndarray): shape = (agents_num,), dtype=bool
              Whether the whole agent is **invalid (True)** (agent-level mask).

    Raises:
        ValueError: If the input is not 3-dimensional or its last dimension is smaller than 8.
    """
    if neighbor_agents_past.ndim != 3 or neighbor_agents_past.shape[-1] < 8:
        raise ValueError(
            f"`neighbor_agents_past` must have shape (agents_num, time_len, 11) "
            f"with a last dimension of at least 8. got {neighbor_agents_past.shape}")

    # (agents_num, time_len, 8): True where the value is non-zero
    agents_past_current_is_not_zero = (neighbor_agents_past[..., :8] != 0)

    # (agents_num, time_len): a frame is valid if any of the 8 values is non-zero
    agents_past_current_not_zero_num = agents_past_current_is_not_zero.sum(
        axis=-1)
    agents_past_cur_off_p_mask = (agents_past_current_not_zero_num == 0
                                 )  # invalid (True)

    # (agents_num,): agent-level; invalid (True) when the agent has no valid frame
    agents_past_cur_on_p_mask = ~agents_past_cur_off_p_mask
    agents_past_cur_off_mask = (agents_past_cur_on_p_mask.sum(axis=-1) == 0)

    return agents_past_cur_off_p_mask.astype(
        bool), agents_past_cur_off_mask.astype(bool)
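
# Illustrative sketch (not part of the original pipeline): the two "off" masks
# on a tiny array. `_demo_off_masks` is a hypothetical standalone restatement
# of the logic above, written with any()/all() instead of sum() == 0; it is
# meant to show which entries come out True, not to replace the method.
import numpy as np


def _demo_off_masks(traj_11: np.ndarray):
    """Return (frame-level off mask (N, T), agent-level off mask (N,))."""
    # True where all of the first 8 values are zero, i.e. the frame is invalid.
    off_p_mask = ~(traj_11[..., :8] != 0).any(axis=-1)
    # True where every frame of the agent is invalid.
    off_mask = off_p_mask.all(axis=-1)
    return off_p_mask, off_mask


# Usage: only the first 8 columns decide validity, so an agent whose frames
# carry nothing but a type one-hot (columns 8:11) still counts as invalid.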

def _get_past_cur_agents_feature(
    self,
    scenario: Optional[NuPlanScenario] = None,
    observation_buffer: Optional[Deque[Observation]] = None,
) -> Tuple[
        List[np.ndarray],  # past_cur_agents_world_8_list
        List[List[TrackedObjectType]],  # past_cur_agents_types_list
        np.ndarray,  # present_static_feat_5
        List[TrackedObjectType],  # static_types_list
        Dict[str, int],  # token_to_id
        Optional[TrackedObjects],  # present_tracked_objects
        Optional[List[TrackedObjects]],  # past_cur_tracked_objects
]:
    """Extract past + current agent and static-object information in a common format.

    Two input paths are supported.

    1) Offline preprocessing (scenario-based)
        - From a NuPlanScenario, extracts
          · the dynamic-object (vehicle/pedestrian/bicycle) arrays for the past + current frames
          · the static-object (signs, barriers, etc.) array for the current frame

    2) Online simulation (observation_buffer-based)
        - Extracts the same kind of information from the simulator's
          observation_buffer (consecutive observations).

    Passing both inputs, or neither, raises an error.

    Args:
        scenario (Optional[NuPlanScenario]):
            - nuPlan scenario for offline preprocessing.
            - Used to obtain the past/current TrackedObjects directly.
        observation_buffer (Optional[Deque[Observation]]):
            - Observation buffer during simulation (ordered past → current).
            - Each element is usually of type `DetectionsTracks`,
              which holds `.tracked_objects`.

    Returns:
            - past_cur_agents_world_8_list:
                · length: num_frames
                · each element: float array of shape (frame_agents_num, 8)
                · each row = one agent; columns = ID / velocity / heading / size / position, etc.
            - past_cur_agents_types_list:
                · length: num_frames
                · list of agent types (vehicle/pedestrian/bicycle) for each frame
            - present_static_feat_5:
                · shape: (cur_static_num, 5)
                · [x, y, heading, width, length] (static objects in the current frame)
            - static_types_list:
                · length: cur_static_num
                · list with the type of each static object
            - token_to_id: Dict[str, int]:
                · dictionary mapping agent tokens appearing in the current frame → integer ID
            - present_tracked_objects:
                · valid only on the scenario path (TrackedObjects of the current frame)
                · None on the observation_buffer path
            - past_cur_tracked_objects:
                · valid only on the scenario path (list of past + current TrackedObjects)
                · None on the observation_buffer path
    """
    # Input validation
    if (scenario is None and observation_buffer is None) or \
       (scenario is not None and observation_buffer is not None):
        raise ValueError(
            "Exactly one of scenario or observation_buffer must be provided.")
    # --------------------------------------------------
    # 1) scenario-based (offline preprocessing / used by work())
    # --------------------------------------------------
    if scenario is not None:
        # Dynamic objects in the current frame
        present_tracked_objects = scenario.initial_tracked_objects.tracked_objects

        # Dynamic objects in the past frames
        past_tracked_objects: List[TrackedObjects] = [
            tracked_objects.tracked_objects
            for tracked_objects in scenario.get_past_tracked_objects(
                iteration=0,
                time_horizon=self.past_time_horizon,
                num_samples=self.num_past_poses,
            )
        ]

        # Past + current, concatenated in chronological order
        past_cur_tracked_objects = past_tracked_objects + [
            present_tracked_objects
        ]

        agents_source_seq = past_cur_tracked_objects  # List[TrackedObjects]
        static_source = present_tracked_objects  # TrackedObjects
    else:
        # --------------------------------------------------
        # 2) observation_buffer-based (online inference / observation_adapter)
        # --------------------------------------------------
        assert observation_buffer is not None  # for the type checker

        present_tracked_objects = None
        past_cur_tracked_objects = None

        agents_source_seq = observation_buffer  # Deque[Observation]
        static_source = observation_buffer[-1]  # most recent frame, Observation

    # Shared logic: agent sequence -> per-frame agent arrays/types
    past_cur_agents_world_8_list, past_cur_agents_types_list, token_to_id = \
        sampled_tracked_objects_to_array_list(agents_source_seq)

    # Shared logic: static-object array/types for the current frame
    present_static_feat_5, static_types_list = \
        sampled_static_objects_to_array_list(static_source)

    # Single return point
    return (
        past_cur_agents_world_8_list,  # List[np.ndarray], #  (frame_agents_num, 8)
        past_cur_agents_types_list,  # List[List[TrackedObjectType]],
        present_static_feat_5,  # np.ndarray, (len(static_obj), 5)
        static_types_list,  # List[TrackedObjectType],
        token_to_id,  # Dict[str, int],
        present_tracked_objects,  # Optional[TrackedObjects],
        past_cur_tracked_objects,  # Optional[List[TrackedObjects]]
    )

def _prepare_map(
    self,
    scenario: NuPlanScenario,
    ego_state: EgoState,
    ego_point2d: Point2D,
    ego_heading: float,
    map_api: NuPlanMap,
    traffic_light_data: Optional[List[TrafficLightStatusData]] = None,
) -> Tuple[List[str], Dict[str, MapObjectPolylines], Dict[
        str, LaneSegmentTrafficLightData], Dict[str, np.ndarray],
           List[str]]:
    """Shared utility that prepares all map-related inputs (route / lanes / traffic lights / speed limits) at once.

    Common flow:
      1) Fetch the scenario's route_roadblock_ids and repair disconnected sections.
      2) Extract lane/boundary/traffic-light/speed-limit information around the ego
         with get_neighbor_vector_set_map.
         - Online (inference) path: uses the traffic_light_data passed in from outside
         - Offline (work) path: traffic_light_data is None, so it is looked up internally at iteration=0

    Args:
        scenario: nuPlan scenario object.
        ego_state: Current ego state (rear_axle reference).
        ego_point2d: Ego position (x, y).
        ego_heading: Ego heading (rad).
        map_api: NuPlanMap instance.
        traffic_light_data:
            - observation_adapter path: the traffic-light list for the current time, passed through as is
            - work path: None → looked up at iteration 0 of the scenario
    """
    # 1) Repair the route roadblocks
    route_roadblock_ids = scenario.get_route_roadblock_ids()
    if route_roadblock_ids != ['']:
        route_roadblock_ids = route_roadblock_correction(
            ego_state, map_api, list(route_roadblock_ids))
    else:
        route_roadblock_ids = []

    # 2) Prepare the traffic-light data
    if traffic_light_data is None:
        traffic_light_data = list(
            scenario.get_traffic_light_status_at_iteration(0))

    # 3) Extract lanes/boundaries/traffic lights/speed limits around the ego
    # 1. elements_to_obj_polylines: Dict[str, MapObjectPolylines]
    #    - key: map element name string, e.g. "LANE", "LEFT_BOUNDARY", "RIGHT_BOUNDARY", "CROSSWALK", ...
    #    - value: the set of points forming that element (MapObjectPolylines)
    #        - internal structure: [num_elements, num_points_i, 2]
    # 2. elements_to_traffic_light: Dict[str, LaneSegmentTrafficLightData]
    #    - key: map element name string (currently only "LANE" is used)
    #    - value: traffic-light state for that element (LaneSegmentTrafficLightData)
    #        - internal structure: (num_lanes, 4) one-hot
    # 3. speed_limit_dict: Dict[str, np.ndarray]
    #    - "lane_has_speed_limit": (num_lanes,), bool
    #    - "lane_speed_limit": (num_lanes,), float32
    # 4. lanes_roadblock_id_list: List[str]
    #    - roadblock ID of each lane (length = num_lanes)
    (
        elements_to_obj_polylines,
        elements_to_traffic_light,
        speed_limit_dict,
        lanes_roadblock_id_list,
    ) = get_neighbor_vector_set_map(
        map_api,
        self._map_elements,
        ego_point2d,
        ego_heading,
        self._get_map_query_radius_m(),
        traffic_light_data,
    )

    return (
        route_roadblock_ids,
        elements_to_obj_polylines,
        elements_to_traffic_light,
        speed_limit_dict,
        lanes_roadblock_id_list,
    )

def _get_road_safety_features(
    self,
    scenario: NuPlanScenario,
    ego_cur_pose_np: np.ndarray,
) -> Dict[str, np.ndarray]:
    key_to_road_safety = {}
    stop_sign_points = extract_stop_sign_points(
        scenario,
        ego_cur_pose_np,
        self.config.safety_len,
        self._get_map_query_radius_m(),
    )
    crosswalk_points = extract_crosswalk_points(
        scenario,
        ego_cur_pose_np,
        self.config.safety_len,
        self._get_map_query_radius_m(),
    )
    key_to_road_safety["stop_sign_points"] = stop_sign_points
    key_to_road_safety["crosswalk_points"] = crosswalk_points
    return key_to_road_safety

# Used for data preprocessing
def work(self, scenarios: List[NuPlanScenario]) -> None:
    for scenario in scenarios:
        map_name = scenario._map_name
        scenario_token = scenario.token
        map_api = scenario.map_api

        (ego_state, ego_point2d, ego_heading, ego_cur_pose_np,
         past_cur_ego_world_10,
         past_cur_time_np) = self._get_past_cur_ego_feature(
             scenario=scenario,
             set_coord_as_center=self.set_coord_as_center,
         )
        # ✅ Added: store the world-frame pose of the ego-centric frame's origin
        # origin_world_pose: shape (4,) = [x_world, y_world, cos(yaw), sin(yaw)]
        origin_world_pose: np.ndarray = self._build_origin_world_pose(
            ego_cur_pose_np)

        ego_agent_past = build_ego_past_feature(
            past_cur_ego_world_10=past_cur_ego_world_10,
            ego_cur_pose_np=ego_cur_pose_np,
        )

        # ─────────────────────────────────────────────
        # ✅ (Requirement b) ego: apply the rules on past+future (101 frames),
        #    then rebuild ego_future_gt_3_dim from the rule-applied 11-dim array
        # ─────────────────────────────────────────────
        (_, ego_future_gt_11_dim_raw) = get_ego_future_array_from_scenario(
            scenario, ego_state, self.num_future_poses,
            self.future_time_horizon)

        ego_agent_past, ego_future_gt_11_dim = self._merge_and_interpolate_ego_11dim(
            ego_agent_past=ego_agent_past,
            ego_future_gt_11_dim=ego_future_gt_11_dim_raw,
        )

        # ✅ The 3-dim array must always be rebuilt from the cleaned 11-dim array
        ego_future_gt_3_dim = self._traj11_to_traj3_yaw(
            ego_future_gt_11_dim)

        # ✅ If the center-frame option is set, correct the x, y origin
        ego_future_gt_3_dim, ego_future_gt_11_dim = self._adjust_ego_future_outputs_to_center_frame(
            ego_state=ego_state,
            ego_future_gt_3_dim=ego_future_gt_3_dim,
            ego_future_gt_11_dim=ego_future_gt_11_dim,
            set_coord_as_center=self.set_coord_as_center,
        )

        (
            past_cur_agents_world_8_list,
            past_cur_agents_types_list,
            present_static_feat_5,
            static_types_list,
            token_to_id,
            present_tracked_objects,
            past_cur_tracked_objects,
        ) = self._get_past_cur_agents_feature(scenario=scenario)

        (neighbor_agents_past, agents_cur_frame_indices, neighbors_id,
         neighbor_track_token) = build_neighbor_past_feature(
             past_cur_agents_world_8_list=past_cur_agents_world_8_list,
             past_cur_agents_types_list=past_cur_agents_types_list,
             max_agent_num=self.caching_max_agent_num,
             ego_cur_pose_np=ego_cur_pose_np,
             max_pedestrians=self.max_pedestrians,
             max_bicycles=self.max_bicycles,
             token_to_id=token_to_id,
             filter_radius=self._get_effective_filter_radius_m(),
         )

        ego_time_len = ego_agent_past.shape[0]
        neighbor_time_len = neighbor_agents_past.shape[1]
        assert ego_time_len == neighbor_time_len == self.num_past_poses + 1, \
            f"Expected time length {self.num_past_poses + 1}, got ego {ego_time_len}, neighbor {neighbor_time_len}"

        cur_fut_agents_world_8_list = self._get_cur_fut_agents_world_8_list(
            scenario, token_to_id, do_inference=False)

        neighbor_cur_fut_gt_11_dim = agent_future_all_process(
            ego_cur_pose_np=ego_cur_pose_np,
            cur_fut_agents_world_8_list=cur_fut_agents_world_8_list,
            neighbor_token_id=neighbors_id,
            neighbor_agents_past=neighbor_agents_past,
        )

        # ─────────────────────────────────────────────
        # ✅ (Requirement a) neighbor: apply the rules on past+future (101 frames),
        #    then build the final neighbor_future_gt_3_dim / neighbor_future_gt_11_dim
        # ─────────────────────────────────────────────
        neighbor_agents_past, neighbor_future_gt_11_dim, neighbor_future_gt_3_dim = \
            self._build_neighbor_future_gt_from_past_and_cur_fut_11dim(
                neighbor_agents_past=neighbor_agents_past,
                neighbor_cur_fut_gt_11_dim=neighbor_cur_fut_gt_11_dim,
            )

        static_objects = build_static_feature(
            present_static_feat_5=present_static_feat_5,
            static_types_list=static_types_list,
            max_static_num=self.caching_max_static_num,
            ego_cur_pose_np=ego_cur_pose_np,
            filter_radius=self._get_effective_filter_radius_m(),
        )

        key_to_array = {
            "origin_world_pose": origin_world_pose,  # (4,)
            "ego_agent_past": ego_agent_past,  # (time_len, 11)
            "ego_future_gt_3_dim": ego_future_gt_3_dim,  # (future_len, 3)
            "ego_future_gt_11_dim": ego_future_gt_11_dim,
            # (future_len, 11)
            "neighbor_agents_past": neighbor_agents_past,
            # (chosen_agent_num, time_len, 11)

            # ✅ (Requirement a) final outputs
            "neighbor_future_gt_3_dim": neighbor_future_gt_3_dim,
            # (chosen_agent_num, future_len, 3)
            "neighbor_future_gt_11_dim": neighbor_future_gt_11_dim,
            # (chosen_agent_num, future_len, 11)
            "static_objects": static_objects,  # (chosen_static_num, 10)
        }

        key_to_road_safety = self._get_road_safety_features(
            scenario=scenario,
            ego_cur_pose_np=ego_cur_pose_np,
        )
        key_to_array.update(key_to_road_safety)

        (
            route_roadblock_ids,
            elements_to_obj_polylines,
            elements_to_traffic_light,
            speed_limit_dict,
            lanes_roadblock_id_list,
        ) = self._prepare_map(
            scenario=scenario,
            ego_state=ego_state,
            ego_point2d=ego_point2d,
            ego_heading=ego_heading,
            map_api=map_api,
        )

        car_token_to_rr_ids: Dict[str,
                                  List[str]] = get_npc_route_roadblock_ids(
                                      scenario, past_cur_tracked_objects,
                                      neighbor_track_token)

        neighbor_agents_current = neighbor_agents_past[:, -1, :]

        map_key_to_array = map_process(
            route_roadblock_ids, car_token_to_rr_ids, neighbor_track_token,
            neighbor_agents_current, ego_cur_pose_np,
            elements_to_obj_polylines, elements_to_traffic_light,
            speed_limit_dict, lanes_roadblock_id_list, self._map_elements,
            self._caching_max_map_elements, self._map_points_num)
        key_to_array.update(map_key_to_array)

        chore_data = {
            "map_name": map_name,
            "token": scenario_token,
        }
        key_to_array.update(chore_data)

        if self.config.make_statistics_when_caching:
            veh_cnt, ped_cnt, bic_cnt = self._count_valid_neighbors_by_type(
                neighbor_agents_past=neighbor_agents_past)
            ratio_percent, mean_speed_kmh = self._compute_lane_speed_stats(
                map_key_to_array)
            stats_payload = {
                "vehicle_count":
                    int(veh_cnt),
                "pedestrian_count":
                    int(ped_cnt),
                "bicycle_count":
                    int(bic_cnt),
                "lane_speed_limit_ratio_percent":
                    float(ratio_percent),
                "mean_speed_limit_kmh": (None if mean_speed_kmh is None else
                                         float(mean_speed_kmh)),
            }
            self._save_sample_stats_json(map_name, scenario_token,
                                         stats_payload)

        final_file_name = f"{key_to_array['map_name']}_{key_to_array['token']}"
        ego_agent_past = key_to_array["ego_agent_past"]  # (time_len, 11)
        ego_agent_type = ego_agent_past[:, 8:11]  # (time_len, 3)
        print("ego_agent_type sum all: ",
              ego_agent_type.sum(axis=0))  # (3, )
        ego_future_gt_11_dim = key_to_array[
            "ego_future_gt_11_dim"]  # (future_len, 11)
        ego_agent_type_future = ego_future_gt_11_dim[:, 8:
                                                     11]  # (future_len, 3)
        print("ego_agent_type_future sum all: ",
              ego_agent_type_future.sum(axis=0))  # (3, )

        self.save_to_disk(self._save_dir, final_file_name, key_to_array)

        key_to_array["neighbor_track_token"] = neighbor_track_token

        if self.config.save_image:
            save_dir = os.path.join(self._save_dir, "debug_vis")
            save_path = os.path.join(save_dir, f"{final_file_name}.png")
            os.makedirs(save_dir, exist_ok=True)
            key_to_array["token_to_future_traj_wrt_ego"] = None
            draw_machine.draw_world_model_to_png(key_to_array,
                                                 output_data={},
                                                 save_path=save_path)

@staticmethod
def _estimate_stable_neighbor_sizes(
    full_traj_11: np.ndarray,  # shape: (N, T_full, 11)
    past_len: int,
    *,
    use_future: bool = False,
    eps: float = 1e-3,
    width_max: float = 20.0,
    length_max: float = 60.0,
) -> np.ndarray:  # shape: (N, 2)
    """Build one representative width/length per neighbor agent.

    Background
    ----------
    nuPlan box width/length can jitter slightly from frame to frame.
    An object's physical size does not change over time, so it is more
    stable to derive a single "representative size" from several frames
    and use it along the whole time axis.

    What this function does
    -----------------------
    - For each agent i, collect the width/length observed over several frames.
    - Keep only the "usable" values and pick their median as the representative.
      (The median is less affected than the mean by one or two outliers.)

    "Usable" criteria
    -----------------
    1) The frame is not padding:
       - at least one of [x, y, cos, sin, vx, vy] has absolute value above eps.
    2) width > eps, length > eps
    3) Overly large values are dropped:
       - width <= width_max, length <= length_max

    Time window
    -----------
    - use_future=False:
        Use only past~current (the first past_len frames).
        (Safer when matching an environment where no future is available.)
    - use_future=True:
        Use every frame of full_traj_11 (past~current~future).
        (For fully offline use, when more samples are wanted.)

    No usable value (fallback)
    --------------------------
    - If no width/length passes the criteria above, fall back to the
      width/length of the current frame (the last past~current frame).
      If that is also 0, the result stays 0.

    Args:
        full_traj_11 (np.ndarray):
            shape = (N, T_full, 11)
            [x, y, cos, sin, vx, vy, width, length, onehot(3)]
        past_len (int):
            Length of the past~current span along the time axis.
            full_traj_11[:, :past_len, :] is treated as past~current.
        use_future (bool):
            If True, future frames are included as well.
        eps (float):
            Threshold below which a value is treated as "missing".
        width_max (float):
            Upper bound used to drop implausibly large widths.
        length_max (float):
            Upper bound used to drop implausibly large lengths.

    Returns:
        np.ndarray:
            shape = (N, 2)
            [width_rep, length_rep] per agent (float32).
    """
    if full_traj_11.ndim != 3 or full_traj_11.shape[-1] != 11:
        raise ValueError(
            f"`full_traj_11` must have shape (N, T, 11). got {full_traj_11.shape}"
        )
    if past_len <= 0 or past_len > full_traj_11.shape[1]:
        raise ValueError(
            f"`past_len` must be at least 1 and at most T_full. got past_len={past_len}, T_full={full_traj_11.shape[1]}"
        )

    N: int = int(full_traj_11.shape[0])
    T_full: int = int(full_traj_11.shape[1])

    # Number of frames used to compute the representative value
    T_src: int = T_full if use_future else int(past_len)

    # src_traj: (N, T_src, 11)
    src_traj: np.ndarray = full_traj_11[:, :T_src, :]

    # Non-padding frame mask: (N, T_src)
    #  - True if any of [x, y, cos, sin, vx, vy] has absolute value above eps
    dynamic_valid: np.ndarray = (np.abs(src_traj[:, :, :6])
                                 > eps).any(axis=-1)

    # size: (N, T_src, 2) = [width, length]
    size: np.ndarray = src_traj[:, :, 6:8]

    # Mask of "usable" size values: (N, T_src)
    size_valid: np.ndarray = ((size[:, :, 0] > eps) &
                              (size[:, :, 1] > eps) &
                              (size[:, :, 0] <= float(width_max)) &
                              (size[:, :, 1] <= float(length_max)) &
                              dynamic_valid)

    # fallback: size at the current frame (last frame of past~current)
    # fallback_size: (N, 2)
    fallback_size: np.ndarray = full_traj_11[:, past_len - 1,
                                             6:8].astype(np.float32,
                                                         copy=False)

    # out: (N, 2)
    out: np.ndarray = fallback_size.copy()

    # Compute the representative (median) value per agent
    for i in range(N):
        # valid_vals: (K, 2), K = number of valid samples (variable)
        valid_vals: np.ndarray = size[i][size_valid[i]]
        if valid_vals.shape[0] == 0:
            continue
        # median over the valid samples: (2,)
        out[i] = np.median(valid_vals, axis=0).astype(np.float32,
                                                      copy=False)

    return out.astype(np.float32, copy=False)
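
The median-based size selection above can be tried in isolation. The following is a minimal sketch rather than the method itself: `representative_sizes` and the `demo` array are hypothetical names, and the fallback is simplified to the last frame of the input instead of `past_len - 1`.

```python
import numpy as np


def representative_sizes(traj: np.ndarray, eps: float = 1e-3,
                         width_max: float = 20.0,
                         length_max: float = 60.0) -> np.ndarray:
    """Return one [width, length] per agent as the median over usable frames.

    traj: (N, T, 11) laid out as
          [x, y, cos, sin, vx, vy, width, length, onehot(3)].
    """
    # A frame counts as non-padding if any dynamic channel exceeds eps.
    not_padding = (np.abs(traj[:, :, :6]) > eps).any(axis=-1)  # (N, T)
    size = traj[:, :, 6:8]                                     # (N, T, 2)
    usable = (not_padding
              & (size[:, :, 0] > eps) & (size[:, :, 1] > eps)
              & (size[:, :, 0] <= width_max)
              & (size[:, :, 1] <= length_max))
    # Simplified fallback (assumption): size at the last frame.
    out = traj[:, -1, 6:8].astype(np.float32)
    for i in range(traj.shape[0]):
        vals = size[i][usable[i]]  # (K, 2), K varies per agent
        if vals.shape[0] > 0:
            out[i] = np.median(vals, axis=0)
    return out


# One agent, five frames: frame 2 has an outlier width, frame 4 is padding.
demo = np.zeros((1, 5, 11), dtype=np.float32)
demo[0, :4, 0] = [1.0, 2.0, 3.0, 4.0]    # x, marks frames 0..3 as non-padding
demo[0, :4, 6] = [2.0, 2.1, 35.0, 1.9]   # width (35.0 exceeds width_max)
demo[0, :4, 7] = [4.5, 4.6, 4.4, 4.5]    # length
rep = representative_sizes(demo)
print(rep)
```

Frames 0, 1, and 3 survive the filter, so the outlier width does not influence the result.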

@staticmethod
def _fill_width_length_with_representative_size(
        traj_11: np.ndarray,  # shape: (N, T, 11)
        valid_mask: np.ndarray,  # shape: (N, T)
        rep_size: np.ndarray,  # shape: (N, 2)
) -> np.ndarray:
    """Overwrite the width/length channels (6:8) with the representative size.

    What this function does
    -----------------------
    - First fill the width/length channels of traj_11 with the representative size.
    - Then set width/length to 0 on frames where valid_mask is False.
      (That is, size exists only on valid frames.)

    Args:
        traj_11 (np.ndarray):
            shape = (N, T, 11)
            [x, y, cos, sin, vx, vy, width, length, onehot(3)]
        valid_mask (np.ndarray):
            shape = (N, T)
            True marks a valid frame, False a padded/invalid frame.
        rep_size (np.ndarray):
            shape = (N, 2)
            [width_rep, length_rep] per agent.

    Returns:
        np.ndarray:
            shape = (N, T, 11)
            traj_11 with unified width/length (the input is modified in place and returned).
    """
    if traj_11.ndim != 3 or traj_11.shape[-1] != 11:
        raise ValueError(
            f"`traj_11` must have shape (N, T, 11). got {traj_11.shape}")
    if valid_mask.shape != traj_11.shape[:2]:
        raise ValueError(
            f"`valid_mask` must have shape (N, T). got {valid_mask.shape}, expected {traj_11.shape[:2]}"
        )
    if rep_size.shape != (traj_11.shape[0], 2):
        raise ValueError(
            f"`rep_size` must have shape (N, 2). got {rep_size.shape}, expected {(traj_11.shape[0], 2)}"
        )

    # Fill every frame with the representative size, then zero out invalid frames
    # traj_11[:, :, 6:8]: (N, T, 2)
    traj_11[:, :, 6:8] = rep_size[:, None, :].astype(traj_11.dtype,
                                                     copy=False)
    traj_11[:, :, 6:8] *= valid_mask[:, :, None].astype(traj_11.dtype,
                                                        copy=False)
    return traj_11
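
The fill-then-mask step relies on two NumPy broadcasts. A small sketch with made-up values (the arrays `traj`, `rep_size`, and `valid` here are illustrative, not taken from the pipeline) shows which axes expand:

```python
import numpy as np

traj = np.zeros((2, 3, 11), dtype=np.float32)   # (N=2, T=3, 11)
traj[:, :, 6:8] = 9.9                           # noisy per-frame sizes to overwrite
rep_size = np.array([[2.0, 4.5],
                     [0.8, 0.8]], dtype=np.float32)   # (N, 2)
valid = np.array([[True, True, False],
                  [False, True, True]])               # (N, T)

# (N, 2) -> (N, 1, 2): the same size is broadcast along the time axis.
traj[:, :, 6:8] = rep_size[:, None, :]
# (N, T) -> (N, T, 1): invalid frames get both size channels zeroed.
traj[:, :, 6:8] *= valid[:, :, None].astype(traj.dtype)

print(traj[:, :, 6:8])
```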

def _get_future_tracked_objects_array_list(
    self,
    scenario: NuPlanScenario,
    token_to_id: Dict[str, int],
    iteration: int = 0,
    future_time_horizon: Optional[float] = None,
    num_samples: Optional[int] = None,
) -> Tuple[List[np.ndarray], Dict[str, int]]:
    """Collect the states of all agents from the current step over a given
    horizon, as a list of per-frame arrays.

    Summary
    -------
    1) At the given iteration, fetch
       - the TrackedObjects of the current frame
       - num_samples future TrackedObjects spanning future_time_horizon.

    2) Convert each frame into a (frame_agents_num, 8) array via
       `sampled_tracked_objects_to_array_list`.
       - each row: [track_id, vx, vy, heading, width, length, x, y]
       - frame_agents_num may differ between frames.
       - list order follows time: [current, t+1, t+2, ...].

    3) Along the way, extend the token_to_id dict that maps each
       track_token (string) to a consistent integer ID.

    Args:
        scenario (NuPlanScenario):
            Scenario to read the tracked objects from.
        token_to_id (Dict[str, int]):
            Existing track_token → integer ID mapping; new tokens are added to it.
        iteration (int, optional):
            Index of the reference (current) step, 0-based.
        future_time_horizon (Optional[float], optional):
            How many seconds past the current step to cover.
            If None, self.future_time_horizon is used.
        num_samples (Optional[int], optional):
            How many future frames to sample.
            If None, self.num_future_poses is used.

    Returns:
        Tuple[List[np.ndarray], Dict[str, int]]:
            - cur_fut_agents_world_8_list: List[np.ndarray]
                · length: 1 + num_samples
                · each element shape: (frame_agents_num_t, 8)
                  [track_id, vx, vy, heading, width, length, x, y]
                · list order: [current, t+1, t+2, ...]
            - token_to_id: Dict[str, int]
                · track_token → integer ID mapping over all frames seen.
    """
    present_tracked_objects: TrackedObjects = scenario.get_tracked_objects_at_iteration(
        iteration).tracked_objects

    if future_time_horizon is None:
        future_time_horizon = self.future_time_horizon
    if num_samples is None:
        num_samples = self.num_future_poses

    # TrackedObjects of the future frames
    future_tracked_objects: List[TrackedObjects] = [
        tracked_objects.tracked_objects
        for tracked_objects in scenario.get_future_tracked_objects(
            iteration=iteration,
            time_horizon=future_time_horizon,
            num_samples=num_samples,
        )
    ]

    # Concatenate [current] + [futures] into a single sequence.
    sampled_future_observations: List[TrackedObjects] = [
        present_tracked_objects
    ] + future_tracked_objects

    # cur_fut_agents_world_8_list: List[np.ndarray]
    #   - each element: (frame_agents_num, 8)
    # token_to_id: Dict[str, int]
    (cur_fut_agents_world_8_list, _,
     token_to_id) = sampled_tracked_objects_to_array_list(
         sampled_future_observations, token_to_id)

    return cur_fut_agents_world_8_list, token_to_id

def save_to_disk(self, dir: str, final_file_name: str,
                 data: Dict[str, np.ndarray]) -> None:
    final_path = f"{dir}/{final_file_name}.npz"
    tmp_path = final_path + ".tmp"

    os.makedirs(dir, exist_ok=True)

    try:
        with open(tmp_path, "wb") as f:
            np.savez_compressed(f, **data)
            f.flush()
            os.fsync(f.fileno())
        os.replace(tmp_path, final_path)

    except BaseException:  # ✅ also catches Ctrl+C (KeyboardInterrupt)
        with contextlib.suppress(Exception):
            if os.path.exists(tmp_path):
                os.remove(tmp_path)
        raise
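
The write-to-temp-then-rename pattern used by `save_to_disk` can be exercised on its own. The sketch below is an illustration under assumptions: `atomic_savez` is a hypothetical standalone helper, and the array shape and token value are arbitrary placeholders.

```python
import os
import tempfile

import numpy as np


def atomic_savez(path: str, data: dict) -> None:
    """Write `data` to `path` so readers never observe a partial file."""
    tmp_path = path + ".tmp"
    try:
        with open(tmp_path, "wb") as f:
            np.savez_compressed(f, **data)
            f.flush()
            os.fsync(f.fileno())    # push bytes to disk before renaming
        os.replace(tmp_path, path)  # atomic when both paths share a filesystem
    except BaseException:
        # Clean up the temp file on any failure, including Ctrl+C.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise


with tempfile.TemporaryDirectory() as d:
    target = os.path.join(d, "sample.npz")
    atomic_savez(target, {
        "ego_agent_past": np.zeros((21, 11), dtype=np.float32),  # placeholder shape
        "token": "abc123",                                       # placeholder token
    })
    with np.load(target) as loaded:
        shape = loaded["ego_agent_past"].shape
        token = str(loaded["token"])  # strings come back as 0-d unicode arrays
    leftovers = [n for n in os.listdir(d) if n.endswith(".tmp")]

print(shape, token, leftovers)
```

Because the rename happens only after a successful `fsync`, an interrupted caching run leaves either the previous complete file or no file at that path.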

==============
"""
Module: Agent Data Preprocessing Functions
Description: This module contains functions for agents related data processing.

Categories:
    1. Get list of agent array from raw data
    2. Get agents array for model input
"""
import numpy as np
from typing import Dict, Deque, List, Tuple, Optional, Union

from nuplan.common.actor_state.tracked_objects import TrackedObjects, TrackedObject
from nuplan.planning.training.preprocessing.utils.agents_preprocessing import AgentInternalIndex
from nuplan.common.actor_state.tracked_objects_types import TrackedObjectType
from nuplan.planning.simulation.observation.observation_type import DetectionsTracks, Observation
from nuplan.common.actor_state.ego_state import EgoState
from diffusion_planner.data_process.utils import convert_absolute_quantities_to_relative
from nuplan.common.geometry.convert import numpy_array_to_absolute_velocity

# =====================
# 1. Get list of agent array from raw data
# =====================

def _extract_agent_array(
    tracked_objects: TrackedObjects,
    token_to_id: Dict[str, int],
) -> Tuple[np.ndarray, Dict[str, int], List[TrackedObjectType]]:
    """Extract agent information from a single frame of detections as an array.

    From the objects present in one frame, this keeps only the selected
    types (vehicle, pedestrian, bicycle) and packs them into a
    **numeric array**. Each agent carries:

    - unique ID (integer mapped from track_token)
    - velocity (vx, vy)
    - heading (radian)
    - width/length
    - position (x, y)

    `token_to_id` is updated so that string track_tokens map to reusable
    integer IDs.

    Args:
        tracked_objects (TrackedObjects):
            - All objects detected in the current frame.
        token_to_id (Dict[str, int]):
            - Mapping from track_token (string) to integer ID.
            - A track_token seen for the first time receives a new ID here.

    Returns:
        Tuple[np.ndarray, Dict[str, int], List[TrackedObjectType]]:
            - a_frame_agents_feat_8 (np.ndarray):
                - shape: (frame_agents_num, AgentInternalIndex.dim() = 8)
                - int id, vx, vy, heading, width, length, x, y
                - dtype: float64
                - Each row is one agent, with values in the order
                  defined by `AgentInternalIndex`.
            - token_to_id (Dict[str, int]):
                - Updated track_token → integer ID mapping.
            - agent_types (List[TrackedObjectType]):
                - Object type of each row (agent).

    """
    object_types: List[TrackedObjectType] = [
        TrackedObjectType.VEHICLE,
        TrackedObjectType.PEDESTRIAN,
        TrackedObjectType.BICYCLE,
    ]
    # Keep only agents whose type is in object_types
    agents: List[TrackedObject] = tracked_objects.get_tracked_objects_of_types(
        object_types)
    agent_types: List[TrackedObjectType] = []

    frame_agents_num: int = len(agents)
    # a_frame_agents_feat_8: (frame_agents_num, dim=8)
    a_frame_agents_feat_8: np.ndarray = np.zeros(
        (frame_agents_num, AgentInternalIndex.dim()), dtype=np.float64)

    max_id_number: int = len(token_to_id)

    # Iterate over the agents in this frame
    for idx, agent in enumerate(agents):
        # Map the string track_token to an integer ID
        if agent.track_token not in token_to_id:
            token_to_id[agent.track_token] = max_id_number
            max_id_number += 1
        int_id: int = token_to_id[agent.track_token]

        # Fill in the features of each agent
        a_frame_agents_feat_8[idx,
                              AgentInternalIndex.track_token()] = float(int_id)
        a_frame_agents_feat_8[idx, AgentInternalIndex.vx()] = agent.velocity.x
        a_frame_agents_feat_8[idx, AgentInternalIndex.vy()] = agent.velocity.y
        a_frame_agents_feat_8[
            idx, AgentInternalIndex.heading()] = agent.center.heading
        a_frame_agents_feat_8[idx, AgentInternalIndex.width()] = agent.box.width
        a_frame_agents_feat_8[idx,
                              AgentInternalIndex.length()] = agent.box.length
        a_frame_agents_feat_8[idx, AgentInternalIndex.x()] = agent.center.x
        a_frame_agents_feat_8[idx, AgentInternalIndex.y()] = agent.center.y

        agent_types.append(agent.tracked_object_type)

    return a_frame_agents_feat_8, token_to_id, agent_types
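
The token-to-ID bookkeeping can be seen without any nuPlan objects. This is a minimal sketch: `assign_ids` and the token strings are hypothetical, and it assumes the dict only ever holds the contiguous IDs 0..n-1, which is what makes `len(token_to_id)` a safe next ID.

```python
from typing import Dict, List


def assign_ids(tokens: List[str], token_to_id: Dict[str, int]) -> List[int]:
    """Map each token to a stable integer, extending the dict for new tokens."""
    ids = []
    for token in tokens:
        if token not in token_to_id:
            # Next free ID, valid while existing IDs are 0..n-1.
            token_to_id[token] = len(token_to_id)
        ids.append(token_to_id[token])
    return ids


mapping: Dict[str, int] = {}
frame0 = assign_ids(["car_a", "ped_b"], mapping)
frame1 = assign_ids(["ped_b", "bike_c", "car_a"], mapping)
print(frame0, frame1, mapping)
```

Sharing one dict across frames is what keeps an agent's ID stable even when its row position changes from frame to frame.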

def sampled_tracked_objects_to_array_list(
    tracked_objects_list: List[Union[TrackedObjects, DetectionsTracks]],
    token_to_id: Optional[Dict[str, int]] = None,
) -> Tuple[List[np.ndarray], List[List[TrackedObjectType]], Dict[str, int]]:
    """Convert detections from several time steps into a list of per-frame agent arrays.

    Takes the detections of consecutive time steps (frames) and, for each
    step, converts the agents into a numeric array, collected in a list.

    In short, it turns "how the agents moved over time" into a list of
    per-frame 2-D arrays.
    Time order:
        - index 0 of the input list: the oldest step
        - last index: the most recent step

    Args:
        tracked_objects_list (List[Union[TrackedObjects, DetectionsTracks]]):
            - Detections sorted in time order.
        token_to_id (Optional[Dict[str, int]]):
            - Existing track_token → integer ID mapping to extend.
              If None, a new empty mapping is created.

    Returns:
        Tuple[List[np.ndarray], List[List[TrackedObjectType]], Dict[str, int]]:
            - agents_seq_world_8_list (List[np.ndarray]):
                - length: num_frames
                - each element shape: (frame_agents_num, AgentInternalIndex.dim() = 8)
                - int id, vx, vy, heading, width, length, x, y
                - Agent array of each step. Rows are agents, columns are features.
            - past_cur_agents_types_list (List[List[TrackedObjectType]]):
                - length: num_frames
                - Per-agent types (vehicle/pedestrian/bicycle) at each step.
            - token_to_id (Dict[str, int]):
                - Mapping from every track_token seen in the sequence
                  to its integer ID.

    """

    # agents_seq_world_8_list:
    #   - length: num_frames
    #   - each element: np.ndarray, shape (frame_agents_num, 8)
    agents_seq_world_8_list: List[np.ndarray] = []

    # past_cur_agents_types_list:
    #   - length: num_frames
    #   - each element: List[TrackedObjectType], length = frame_agents_num
    past_cur_agents_types_list: List[List[TrackedObjectType]] = []

    # track_token (string) → int ID
    if token_to_id is None:
        token_to_id = {}
    for timestep_idx in range(len(tracked_objects_list)):
        # Raw detections of the current step
        if isinstance(tracked_objects_list[timestep_idx], DetectionsTracks):
            tracked_objects: TrackedObjects = tracked_objects_list[
                timestep_idx].tracked_objects
        else:
            tracked_objects = tracked_objects_list[
                timestep_idx]  # type: ignore[assignment]

        # a_frame_agents_feat_8: (frame_agents_num, 8)
        #   int id, vx, vy, heading, width, length, x, y
        # token_to_id: Dict[str, int]
        # agent_types: List[TrackedObjectType], length = frame_agents_num
        a_frame_agents_feat_8, token_to_id, agent_types = _extract_agent_array(
            tracked_objects, token_to_id)

        agents_seq_world_8_list.append(a_frame_agents_feat_8)
        past_cur_agents_types_list.append(agent_types)
    """
    agents_seq_world_8_list: List[np.ndarray]
        - each element: (frame_agents_num, 8)  # frame_agents_num differs per frame
    past_cur_agents_types_list: List[List[TrackedObjectType]]
        - each element length: frame_agents_num
    token_to_id: Dict[str, int]
        - track_token string → int ID mapping
    """
    return agents_seq_world_8_list, past_cur_agents_types_list, token_to_id

def _extract_ego_array(track_ego: EgoState) -> np.ndarray:  # (10)
    frame_ego_feature = np.zeros((10,), dtype=np.float64)
    # x, y, heading, vx, vy, width, length, (car, pedestrian, cyclist)

    frame_ego_feature[0] = track_ego.center.x
    frame_ego_feature[1] = track_ego.center.y
    frame_ego_feature[2] = track_ego.center.heading
    # EgoState velocity is expressed in the ego (body) frame, so it has to
    # be rotated into the world frame.
    v_local = track_ego.dynamic_car_state.center_velocity_2d
    he = float(track_ego.center.heading)
    c, s = np.cos(he), np.sin(he)
    vx_w = c * float(v_local.x) - s * float(v_local.y)
    vy_w = s * float(v_local.x) + c * float(v_local.y)
    frame_ego_feature[3] = vx_w
    frame_ego_feature[4] = vy_w
    frame_ego_feature[5] = track_ego.car_footprint.width
    frame_ego_feature[6] = track_ego.car_footprint.length
    frame_ego_feature[7:10] = [1, 0, 0]  # Mark as VEHICLE

    return frame_ego_feature
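
The body-to-world velocity conversion is a plain 2-D rotation by the heading angle. A small sketch with a hypothetical helper `body_to_world_velocity` makes this easy to sanity-check: a vehicle heading along +y (heading π/2) and moving forward at 5 m/s should have a world velocity of roughly (0, 5).

```python
import numpy as np


def body_to_world_velocity(vx_body: float, vy_body: float, heading: float):
    """Rotate a body-frame velocity into the world frame by `heading` (radians)."""
    c, s = np.cos(heading), np.sin(heading)
    vx_world = c * vx_body - s * vy_body
    vy_world = s * vx_body + c * vy_body
    return vx_world, vy_world


vx_w, vy_w = body_to_world_velocity(5.0, 0.0, np.pi / 2)
speed_body = np.hypot(5.0, 0.0)
speed_world = np.hypot(vx_w, vy_w)  # a rotation preserves speed
print(vx_w, vy_w, speed_world)
```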

def sampled_ego_objects_to_array_list(
        ego_state_buffer: Deque[EgoState]) -> np.ndarray:  # (num_frames, 10)
    # x, y, heading, vx, vy, width, length, (car, pedestrian, cyclist)
    all_frame_ego_feature = []

    for past_idx in range(len(ego_state_buffer)):
        # Order: oldest -> most recent
        track_ego: EgoState = ego_state_buffer[past_idx]
        frame_agents_feature = _extract_ego_array(track_ego)
        all_frame_ego_feature.append(frame_agents_feature)
    all_frame_ego_feature = np.stack(all_frame_ego_feature)  # (num_frames, 10)

    return all_frame_ego_feature

def sampled_static_objects_to_array_list(
        present_tracked_objects: TrackedObjects):

    static_object_types = [
        TrackedObjectType.CZONE_SIGN, TrackedObjectType.BARRIER,
        TrackedObjectType.TRAFFIC_CONE, TrackedObjectType.GENERIC_OBJECT
    ]

    # Unwrap DetectionsTracks into the underlying TrackedObjects
    if isinstance(present_tracked_objects, DetectionsTracks):
        present_tracked_objects = present_tracked_objects.tracked_objects

    static_obj: List[
        TrackedObject] = present_tracked_objects.get_tracked_objects_of_types(
            static_object_types)
    static_types_list = []
    # Columns: x, y, heading, width, length
    present_static_feat_5 = np.zeros((len(static_obj), 5), dtype=np.float64)

    for idx, agent in enumerate(static_obj):
        present_static_feat_5[idx, 0] = agent.center.x
        present_static_feat_5[idx, 1] = agent.center.y
        present_static_feat_5[idx, 2] = agent.center.heading
        present_static_feat_5[idx, 3] = agent.box.width
        present_static_feat_5[idx, 4] = agent.box.length
        static_types_list.append(agent.tracked_object_type)

    return present_static_feat_5, static_types_list

# =====================
# 2. Get agents array for model input
# =====================

def _filter_agents_array(
    all_frame_agents_world_8: List[np.ndarray],  # len = num_frames, each element shape: (frame_agents_num, 8)
    reverse: bool = False,
) -> List[np.ndarray]:  # len = num_frames, each element shape: (frame_save_agents_num, 8)
    """Keep only the agents that appear in the reference frame, across all frames.

    Given the per-frame agent lists of one sequence, agents present in the
    first frame (or the last frame when `reverse=True`) are kept in every
    frame, and all other agents are removed.

    In short, the filter says
    "track only the agents that existed at the reference time".

    Each frame array follows the `AgentInternalIndex` order
    (track_id, vx, vy, heading, width, length, x, y).

    **Row order**

    - Reference frame (first or last frame):
      · its rows are visited top to bottom, and every row that passes the
        "exists in the reference frame" check is appended in order.
      · the output array `(frame_save_agents_num, 8)` therefore keeps
        **the original row order of the input reference frame**.
    - Other frames:
      · each input array `(frame_agents_num, 8)` is visited top to bottom,
        and a row is appended if its track_id is among the reference IDs.
      · only filtering is applied; the order of appearance within that
        frame is kept, with no sorting or reordering.

    So in every output element the rows keep
    **their relative order from that frame's input array**, with only the
    rows of agents missing from the reference frame removed.

    Args:
        all_frame_agents_world_8 (List[np.ndarray]):
            - length: num_frames
            - each element shape: (frame_agents_num, 8)
            - Each row is one agent; columns hold ID/velocity/heading/size/position.
        reverse (bool, optional):
            - False:
                · use the first frame (oldest step) as the reference frame.
            - True:
                · use the last frame (most recent step) as the reference frame.

    Returns:
        List[np.ndarray]:
            - length: num_frames
            - each element shape: (frame_save_agents_num, 8)
            - The input list, rewritten in place so that each frame holds only
              agents present in the reference frame. The agent count
              (frame_save_agents_num) may or may not be the same across frames.
            - The row order of each frame is that of its input array, with
              filtering only (no additional sorting).
    """
    # target_frame_agents_feature: (target_frame_agents_num, 8)
    target_frame_agents_feature = all_frame_agents_world_8[
        -1] if reverse else all_frame_agents_world_8[0]
    # target_frame_agents_id: (target_frame_agents_num,)
    target_frame_agents_id = target_frame_agents_feature[:,
                                                         AgentInternalIndex.
                                                         track_token()]
    for time_idx in range(len(all_frame_agents_world_8)):
        frame_exist_agents = []  # len: frame_save_agents_num (variable)
        # frame_agents_feature: (frame_agents_num, 8)
        frame_agents_feature: np.ndarray = all_frame_agents_world_8[
            time_idx]  # (_, 8)
        for agent_idx in range(frame_agents_feature.shape[0]):
            if target_frame_agents_feature.shape[0] > 0:
                agent_id = float(
                    frame_agents_feature[agent_idx,
                                         int(AgentInternalIndex.track_token())])
                is_in_target_frame = bool(
                    (agent_id == target_frame_agents_id).max())
                if is_in_target_frame:
                    frame_exist_agents.append(
                        frame_agents_feature[agent_idx, :].squeeze())

        if len(frame_exist_agents) > 0:
            # (frame_save_agents_num, 8)
            all_frame_agents_world_8[time_idx] = np.stack(frame_exist_agents)
        else:
            # No agent of this frame exists in the reference frame → empty array
            all_frame_agents_world_8[time_idx] = np.empty(
                (0, frame_agents_feature.shape[1]), dtype=np.float32)  # (0, 8)

    return all_frame_agents_world_8
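
The reference-frame filter can be illustrated on toy data. The sketch below is a simplification, not the function above: it uses two-column arrays `[track_id, value]` instead of the 8-column layout, and a vectorised `np.isin` mask in place of the row-by-row loop. The names `frames`, `reference_ids`, and `filtered` are illustrative.

```python
import numpy as np

frames = [
    np.array([[0, 1.0], [1, 2.0], [2, 3.0]]),  # oldest frame: ids 0, 1, 2
    np.array([[2, 3.5], [3, 9.0]]),            # id 3 appears, ids 0 and 1 are missing
    np.array([[1, 2.5], [2, 4.0]]),            # newest frame: ids 1, 2
]

# Equivalent of reverse=True: the newest frame defines which agents to keep.
reference_ids = frames[-1][:, 0]

# Boolean-mask indexing keeps the surviving rows in their original order.
filtered = [f[np.isin(f[:, 0], reference_ids)] for f in frames]

for f in filtered:
    print(f[:, 0])
```

As in the original, the per-frame row counts can still differ after filtering, which is why a separate padding step is needed before stacking the frames.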

def _filter_agents_array_w_id( cur_fut_agents_world_8_list: List[ np.ndarray], # len = num_frames_all, 각 원소: (frame_agents_num_t, 8) neighbor_token_id: np.ndarray, # shape: (chosen_agent_num,) ) -> List[np.ndarray]: """각 프레임의 에이전트 배열에서, 원하는 track_id(neighbor_token_id)에 해당하는 에이전트만 남긴다.

하는 일
--------
- 입력으로 "현재 + 여러 미래 프레임"에 대한 에이전트 상태가 들어온다.
  · cur_fut_agents_world_8_list : List[np.ndarray]
    - 길이: num_frames_all
    shape: (frame_agents_num_t, 8)
    [track_id, vx, vy, heading, width, length, x, y]

- neighbor_token_id 에는 "우리가 계속 추적하고 싶은 에이전트 ID" 들이 들어 있다.
  예: [10, 25, 31] 같은 1차원 배열.

- 각 프레임에 대해:
    1) 해당 프레임의 track_id 열을 보고,
       neighbor_token_id 중 어떤 것들이 있는지 찾는다.
    2) 그 ID 를 가진 행만 남겨서 새 배열(frame_save_agents_num_t, 8)을 만든다.
    3) 한 프레임에 하나도 없다면 (0, 8) 빈 배열을 넣는다.

결과적으로,
- 원래 프레임 수는 그대로 유지하고
- 각 프레임마다 "선택된 에이전트들만 남은 배열" 리스트를 돌려준다.

Args:
    cur_fut_agents_world_8_list:
        - 길이: num_frames_all
        - 각 원소 shape: (frame_agents_num_t, 8)
          [track_id, vx, vy, heading, width, length, x, y] (월드 좌표).
    neighbor_token_id:
        - shape: (chosen_agent_num,)
        - 선택된 에이전트들의 track_id 배열.

Returns:
    List[np.ndarray]:
        - cur_fut_chosen_agents_world_8_list
        - 길이: num_frames_all
        - 각 원소 shape: (frame_save_agents_num_t, 8)
          · frame_save_agents_num_t 는 프레임마다 달라질 수 있다.
          · 선택된 에이전트가 하나도 없으면 (0, 8) 배열.
"""
cur_fut_chosen_agents_world_8_list: List[np.ndarray] = []

# 선택된 토큰이 하나도 없으면, 모든 프레임에 대해 (0, 8) 빈 배열 반환
if neighbor_token_id.size == 0:
    for frame_agents_world_8 in cur_fut_agents_world_8_list:
        cur_fut_chosen_agents_world_8_list.append(
            np.empty((0, frame_agents_world_8.shape[1]),
                     dtype=frame_agents_world_8.dtype))
    return cur_fut_chosen_agents_world_8_list

# 비교를 편하게 하기 위해 float32 로 맞춘다.
neighbor_token_id_f32 = neighbor_token_id.astype(np.float32, copy=False)

for frame_agents_world_8 in cur_fut_agents_world_8_list:
    # frame_agents_world_8: (frame_agents_num_t, 8) 또는 (0, 8)
    if frame_agents_world_8.size == 0:
        cur_fut_chosen_agents_world_8_list.append(
            np.empty((0, frame_agents_world_8.shape[1]),
                     dtype=frame_agents_world_8.dtype))
        continue

    # 현재 프레임의 track_id 열: (frame_agents_num_t,)
    frame_ids = frame_agents_world_8[:, AgentInternalIndex.track_token()]

    # 이 프레임에서 neighbor_token_id 에 속하는 행만 True
    mask = np.isin(
        frame_ids.astype(np.float32, copy=False),
        neighbor_token_id_f32,  # (chosen_agent_num,)
    )  # shape: (frame_agents_num_t,)

    if np.any(mask):
        # 선택된 행만 남긴 배열: (frame_save_agents_num_t, 8)
        cur_fut_chosen_agents_world_8_list.append(
            frame_agents_world_8[mask])
    else:
        cur_fut_chosen_agents_world_8_list.append(
            np.empty((0, frame_agents_world_8.shape[1]),
                     dtype=frame_agents_world_8.dtype))

return cur_fut_chosen_agents_world_8_list
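
The per-frame filtering above comes down to an `np.isin` mask on the track_id column. The following is a minimal sketch on toy data, not the code from these notes: `filter_by_ids` and `TRACK_ID_COL` are hypothetical names, and column 0 is assumed to hold the track_id (the real code takes the column from `AgentInternalIndex.track_token()`).

```python
from typing import List

import numpy as np

TRACK_ID_COL = 0  # assumption for this sketch; the notes use AgentInternalIndex.track_token()


def filter_by_ids(frames: List[np.ndarray], keep_ids: np.ndarray) -> List[np.ndarray]:
    """Keep, in every frame, only the rows whose track_id is in keep_ids."""
    kept = []
    for frame in frames:
        mask = np.isin(frame[:, TRACK_ID_COL], keep_ids)  # (num_rows,) boolean
        kept.append(frame[mask])  # an all-False mask yields a (0, 8) array
    return kept


frame_a = np.zeros((3, 8), dtype=np.float32)
frame_a[:, TRACK_ID_COL] = [10, 11, 25]
frame_b = np.zeros((1, 8), dtype=np.float32)
frame_b[:, TRACK_ID_COL] = [99]

filtered = filter_by_ids([frame_a, frame_b], np.array([10, 25, 31]))
print([f.shape for f in filtered])  # [(2, 8), (0, 8)]
```

One caveat about comparing ids as float32: float32 represents integers exactly only up to 2**24 (16,777,216), so two distinct ids above that value can compare equal. Whether this matters depends on how track tokens are mapped to integers upstream.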

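def _pad_agent_states(
    all_frame_cur_exists_agents: List[np.ndarray],  # len = num_frames, each element shape: (current_agents_num_t, 8)
    reverse: bool,
) -> List[np.ndarray]:  # len = num_frames, each element shape: (current_agents_num, 8)
    """Zero-fill agents that are missing in a frame and align the agent order to the reference frame.

    Rules:
        - The set of agents alive in the reference frame is fixed. The reference
          frame is the first frame of the list, or the last one when `reverse` is True.
        - At each time step:
            · agents observed at that step are filled with their observed values,
            · agents not observed stay [0, 0, ..., 0], except for the track_id
              column, which always carries the reference-frame id.
        - Every frame ends up with the same row count and row order as the reference frame.

    Args:
        all_frame_cur_exists_agents (List[np.ndarray]):
            - length: num_frames
            - each element shape: (current_agents_num_t, 8)
              [track_id, vx, vy, heading, width, length, x, y]
        reverse (bool):
            - If True, the list is reversed so that the last frame is used as the reference.

    Returns:
        List[np.ndarray]:
            - length: num_frames
            - each element shape: (current_agents_num, 8)
              rows follow the reference-frame agent order; unobserved time steps are zero-filled.
    """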
track_id_idx = AgentInternalIndex.track_token()

# 시간 방향 뒤집기 (현재 프레임을 기준 프레임으로 쓰기 위해)
if reverse:
    all_frame_cur_exists_agents = all_frame_cur_exists_agents[::-1]

if len(all_frame_cur_exists_agents) == 0:
    return all_frame_cur_exists_agents

# 기준 프레임: (current_agents_num, 8)
key_frame_agents: np.ndarray = all_frame_cur_exists_agents[0]
current_agents_num: int = int(key_frame_agents.shape[0])
feature_dim: int = int(key_frame_agents.shape[1])

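if current_agents_num == 0:
    # No agents in the reference frame: return the frames unchanged,
    # restoring the original time order if the list was reversed above.
    return (all_frame_cur_exists_agents[::-1]
            if reverse else all_frame_cur_exists_agents)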

# 기준 프레임에서 track_id → 행 인덱스 매핑
key_frame_id_to_row: Dict[int, int] = {}
for cur_agent_idx, agent_id in enumerate(key_frame_agents[:, track_id_idx]):
    key_frame_id_to_row[int(agent_id)] = cur_agent_idx

# 기준 프레임의 id 벡터 (shape: (current_agents_num,))
key_frame_ids: np.ndarray = key_frame_agents[:, track_id_idx].copy()

new_all_frame_cur_exists_agents: List[np.ndarray] = []

for time_idx in range(len(all_frame_cur_exists_agents)):
    # frame_cur_exists_agents: (current_agents_num_t, 8)
    frame_cur_exists_agents: np.ndarray = all_frame_cur_exists_agents[
        time_idx]

    # frame_state: (current_agents_num, 8), 처음엔 전부 0
    frame_state: np.ndarray = np.zeros(
        (current_agents_num, feature_dim),
        dtype=np.float64,
    )

    # track_id 칸은 기준 프레임 id 로 채워 둔다.
    frame_state[:, track_id_idx] = key_frame_ids

    # 현재 프레임에서 실제로 관측된 에이전트만 해당 행 위치에 복사
    for agent_row_idx in range(frame_cur_exists_agents.shape[0]):
        agent_id = int(frame_cur_exists_agents[agent_row_idx, track_id_idx])
        if agent_id not in key_frame_id_to_row:
            # 기준 프레임에 없는 id 는 이웃 후보가 아니므로 무시
            continue
        key_frame_row: int = key_frame_id_to_row[agent_id]
        frame_state[key_frame_row, :] = frame_cur_exists_agents[
            agent_row_idx, :]

    new_all_frame_cur_exists_agents.append(frame_state)

# 시간 방향을 다시 원래대로 복원
if reverse:
    new_all_frame_cur_exists_agents = new_all_frame_cur_exists_agents[::-1]

return new_all_frame_cur_exists_agents
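
The padding rule can be tried on a tiny example. This is a sketch under simplifying assumptions, not the function above: `pad_to_reference` and `ID_COL` are hypothetical names, rows have 3 columns instead of 8, and the last frame is always the reference (the `reverse=True` case).

```python
from typing import List

import numpy as np

ID_COL = 0  # assumption: track_id is stored in column 0


def pad_to_reference(frames: List[np.ndarray]) -> List[np.ndarray]:
    """Align every frame to the row order of the last frame; unobserved agents become zero rows."""
    reference = frames[-1]
    row_of = {int(agent_id): row for row, agent_id in enumerate(reference[:, ID_COL])}
    padded = []
    for frame in frames:
        state = np.zeros(reference.shape, dtype=np.float64)
        state[:, ID_COL] = reference[:, ID_COL]  # padded rows still carry their id
        for agent in frame:
            row = row_of.get(int(agent[ID_COL]))
            if row is not None:  # ids absent from the reference frame are dropped
                state[row] = agent
        padded.append(state)
    return padded


past = np.array([[7.0, 1.5, 0.0], [9.0, 3.0, 0.0]])     # ids 7 and 9 observed
current = np.array([[5.0, 2.0, 0.1], [7.0, 1.0, 0.0]])  # ids 5 and 7 observed (reference)
padded_past, padded_current = pad_to_reference([past, current])
print(padded_past)
```

In the padded past frame, id 5 is a zero row that only carries its id, id 7 keeps its observed values, and id 9 is dropped because it is not in the reference frame.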

def _pad_agent_states_with_zeros_w_id(
    cur_fut_chosen_agents_local_8_list: List[np.ndarray],  # len = num_frames_all, each element shape: (frame_save_agents_num_t, 8)
    neighbor_token_id: np.ndarray,  # shape: (chosen_agent_num,)
    neighbor_types_one_hot: Optional[np.ndarray] = None,  # shape: (chosen_agent_num, 3) or None
) -> np.ndarray:
    """Fill a fixed-size tensor with the trajectories of the selected neighbor tokens over all frames.

    The state of an agent observed at a time step is written into its slot;
    time steps without an observation stay 0. The type one-hot is copied
    along the whole time axis.

    Args:
        cur_fut_chosen_agents_local_8_list (List[np.ndarray]):
            - length: num_frames_all
            - each element shape: (frame_save_agents_num_t, 8)
              [track_id, vx, vy, heading, width, length, x, y] (ego frame)
        neighbor_token_id (np.ndarray):
            - shape: (chosen_agent_num,)
            - track_id array of the neighbor agents; its order defines the slot order.
        neighbor_types_one_hot (Optional[np.ndarray]):
            - shape: (chosen_agent_num, 3)
            - [onehot_vehicle, onehot_ped, onehot_bike]
            - If None, the type one-hot channels are all left at 0.

    Returns:
        np.ndarray:
            - cur_fut_chosen_agents_full_11:
                · shape: (chosen_agent_num, num_frames_all, 11)
                · [x, y, cos(yaw), sin(yaw), v_x, v_y, width, length, one_hot(3)]
    """
track_id_idx = AgentInternalIndex.track_token()
vx_idx = AgentInternalIndex.vx()
vy_idx = AgentInternalIndex.vy()
heading_idx = AgentInternalIndex.heading()
width_idx = AgentInternalIndex.width()
length_idx = AgentInternalIndex.length()
x_idx = AgentInternalIndex.x()
y_idx = AgentInternalIndex.y()

# neighbor_token_id: (chosen_agent_num,)
chosen_agent_num: int = int(neighbor_token_id.shape[0])
# cur_fut_chosen_agents_local_8_list 길이: num_frames_all
cur_future_all_len: int = len(cur_fut_chosen_agents_local_8_list)

# 출력 버퍼 (모두 0으로 초기화)

# cur_fut_chosen_agents_full_11: (chosen_agent_num, num_frames_all, 11)
cur_fut_chosen_agents_full_11: np.ndarray = np.zeros(
    (chosen_agent_num, cur_future_all_len, 11),
    dtype=np.float32,
)

if chosen_agent_num == 0:
    return cur_fut_chosen_agents_full_11

# 타입 one-hot: (chosen_agent_num, 3) → (chosen_agent_num, 1, 3) → 시간축 브로드캐스트
if neighbor_types_one_hot is not None:
    # neighbor_types_one_hot: (chosen_agent_num, 3)
    neighbor_types_one_hot = neighbor_types_one_hot.astype(np.float32,
                                                           copy=False)
    # cur_fut_chosen_agents_full_11[:, :, 8:11]: (chosen_agent_num, num_frames_all, 3)
    cur_fut_chosen_agents_full_11[:, :,
                                  8:] = neighbor_types_one_hot[:, None, :]

# 각 시점 프레임 처리
for timestep_idx, frame_chosen_agents_local_8 in enumerate(
        cur_fut_chosen_agents_local_8_list):
    # frame_chosen_agents_local_8: (frame_chosen_agents_num_t, 8) 또는 (0, 8)
    if frame_chosen_agents_local_8.shape[0] == 0:
        continue

    # 현재 프레임의 track_id 들: (frame_chosen_agents_num_t,)
    frame_ids: np.ndarray = frame_chosen_agents_local_8[:,
                                                        track_id_idx].astype(
                                                            np.float32,
                                                            copy=False)

    # frame_ids 와 neighbor_token_id 의 공통 원소 및 위치
    # inter: 공통 track_id 값들 (길이 M)
    # idx_frame: frame_ids 에서의 인덱스 (shape: (M,))
    # idx_neighbors: neighbor_token_id 에서의 인덱스 (shape: (M,))
    inter, idx_frame, idx_neighbors = np.intersect1d(
        frame_ids,
        neighbor_token_id.astype(np.float32, copy=False),
        assume_unique=False,
        return_indices=True,
    )
    if inter.size == 0:
        continue

    # 선택된 행만 모으기
    # frame_selected: (M, 8)
    frame_selected: np.ndarray = frame_chosen_agents_local_8[idx_frame]

    # 위치/각도: (M,)
    x_local: np.ndarray = frame_selected[:, x_idx].astype(np.float32,
                                                          copy=False)
    y_local: np.ndarray = frame_selected[:, y_idx].astype(np.float32,
                                                          copy=False)
    heading_local: np.ndarray = frame_selected[:, heading_idx].astype(
        np.float32, copy=False)

    # 속도/크기: (M,)
    vx_local: np.ndarray = frame_selected[:, vx_idx].astype(np.float32,
                                                            copy=False)
    vy_local: np.ndarray = frame_selected[:, vy_idx].astype(np.float32,
                                                            copy=False)
    width_local: np.ndarray = frame_selected[:,
                                             width_idx].astype(np.float32,
                                                               copy=False)
    length_local: np.ndarray = frame_selected[:,
                                              length_idx].astype(np.float32,
                                                                 copy=False)

    # Fill channels 0..7 of the 11-dim output:
    # [x, y, cos, sin, vx, vy, width, length, one_hot(3)]
    # Each assignment writes (M,) values into the slots given by idx_neighbors.
    cur_fut_chosen_agents_full_11[idx_neighbors, timestep_idx, 0] = x_local
    cur_fut_chosen_agents_full_11[idx_neighbors, timestep_idx, 1] = y_local
    cur_fut_chosen_agents_full_11[idx_neighbors, timestep_idx,
                                  2] = np.cos(heading_local)
    cur_fut_chosen_agents_full_11[idx_neighbors, timestep_idx,
                                  3] = np.sin(heading_local)
    cur_fut_chosen_agents_full_11[idx_neighbors, timestep_idx, 4] = vx_local
    cur_fut_chosen_agents_full_11[idx_neighbors, timestep_idx, 5] = vy_local
    cur_fut_chosen_agents_full_11[idx_neighbors, timestep_idx,
                                  6] = width_local
    cur_fut_chosen_agents_full_11[idx_neighbors, timestep_idx,
                                  7] = length_local
    # 8:11 (one_hot)은 위에서 시간축 전체에 이미 채워져 있음

return cur_fut_chosen_agents_full_11
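
The slot assignment relies on `np.intersect1d(..., return_indices=True)`, which returns the common values in sorted order together with their positions in both inputs. A minimal sketch with toy ids (all values below are made up for illustration):

```python
import numpy as np

neighbor_ids = np.array([31, 10, 25], dtype=np.float32)  # slot order of the output tensor
frame_ids = np.array([25, 10], dtype=np.float32)         # ids observed in one frame
frame_x = np.array([2.5, 1.0], dtype=np.float32)         # one feature per observed row

common, idx_frame, idx_slot = np.intersect1d(
    frame_ids, neighbor_ids, return_indices=True)

slots = np.zeros(neighbor_ids.shape[0], dtype=np.float32)
slots[idx_slot] = frame_x[idx_frame]  # scatter observed values into their slots
print(common, idx_frame, idx_slot, slots)
```

Here the slot of id 31 stays 0 because it was not observed in this frame, while ids 10 and 25 land in slots 1 and 2 regardless of their row order inside the frame.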

def build_ego_past_feature(
    past_cur_ego_world_10: np.ndarray,  # (num_frames, 10)
    ego_cur_pose_np: np.ndarray,  # (3,)
) -> np.ndarray:  # (num_frames, 11)
    """Convert the ego vehicle's past + current trajectory into the ego-relative frame.

    - The input is the ego trajectory (states at several time steps) in world coordinates.
    - Positions, velocities, etc. at every time step are re-expressed relative to
      the current ego state (`ego_cur_pose_np`).
    - The result is cast to float32 so the model can consume it directly.

    Args:
        past_cur_ego_world_10 (np.ndarray):
            - shape: (num_frames, 10)
            - [x, y, heading, vx, vy, width, length, car, pedestrian, cyclist]
            - ego trajectory in world coordinates.
        ego_cur_pose_np (np.ndarray):
            - shape: (3,)
            - [x_ego, y_ego, yaw_ego], the reference ego state.

    Returns:
        np.ndarray:
            - shape: (num_frames, 11)
            - ego trajectory in the ego-relative frame.
    """
    # past_cur_ego_world_10: (num_frames, 10) -> ego_agent_past: (num_frames, 11)
    ego_agent_past = convert_absolute_quantities_to_relative(
        past_cur_ego_world_10, ego_cur_pose_np)
    assert ego_agent_past.shape[1] == 11
    return ego_agent_past.astype(np.float32)

def _convert_all_frames_agents_to_ego_local(
    all_frame_cur_exists_agents: List[np.ndarray],  # len = num_frames, each element: (current_agents_num_t, 8)
    ego_cur_pose_np: np.ndarray,  # (3,)
) -> List[np.ndarray]:
    """Convert the agent states of each frame to the ego frame, then equalize the per-frame sizes.

    Steps:
        1) Convert the (world) agent states of each time step to the ego frame, one frame at a time.
           - input:  (current_agents_num_t, 8)
           - output: (current_agents_num_t, 8)  (only the coordinate frame changes)
        2) Use `_pad_agent_states` (with reverse=True, so the last frame is the reference) to
           - align every frame to the agent set and order of the reference frame, and
           - zero-fill time steps where an agent is missing.

    Args:
        all_frame_cur_exists_agents (List[np.ndarray]):
            - length: num_frames
            - each element shape: (current_agents_num_t, 8)
              [track_id, vx, vy, heading, width, length, x, y]
              (world coordinates)
        ego_cur_pose_np (np.ndarray):
            - shape: (3,)
            - [x_ego, y_ego, yaw_ego]

    Returns:
        List[np.ndarray]:
            - length: num_frames
            - each element shape: (current_agents_num, 8)
              in the ego frame, following the reference-frame agent order.
              Time steps without an observation are zero-filled.
    """
# 에이전트가 없으면 그대로 반환
if len(all_frame_cur_exists_agents) == 0:
    return all_frame_cur_exists_agents

all_frame_cur_exists_agents_local_world: List[np.ndarray] = []

for frame_agents_world_8 in all_frame_cur_exists_agents:
    # frame_agents_world_8: (current_agents_num_t, 8)
    if frame_agents_world_8.shape[0] == 0:
        # 관측이 아예 없는 프레임은 0 배열 유지
        all_frame_cur_exists_agents_local_world.append(
            np.zeros_like(frame_agents_world_8, dtype=np.float64))
        continue

    # 복사본을 만들어 좌표계를 변환 (shape: (current_agents_num_t, 8))
    frame_agents_world_8_copy = frame_agents_world_8.astype(np.float64,
                                                            copy=True)
    frame_agents_local_8 = convert_absolute_quantities_to_relative(
        frame_agents_world_8_copy,
        ego_cur_pose_np,
        'agent',
    )
    all_frame_cur_exists_agents_local_world.append(frame_agents_local_8)

# 기준 프레임에 맞춰 에이전트 개수/순서를 고정하고, 없는 시점은 0 유지
all_frame_cur_exists_agents_local: List[np.ndarray] = _pad_agent_states(
    all_frame_cur_exists_agents=all_frame_cur_exists_agents_local_world,
    reverse=True,
)

return all_frame_cur_exists_agents_local

def _pack_ego_local_agents(
    all_frame_cur_exists_agents_local: List[np.ndarray],  # len = num_frames, each: (current_agents_num, 8)
    agents_states_dim: int,
) -> np.ndarray:
    """Stack the list of ego-frame agent states into a single 3D tensor.

    Takes the per-frame agent states (ego frame) and rearranges them into
    9-dim features of the form
    - [x, y, cos(heading), sin(heading), vx, vy, width, length, id].

    Args:
        all_frame_cur_exists_agents_local (List[np.ndarray]):
            - length: num_frames
            - each element shape: (current_agents_num, 8)
                - [track_id, vx, vy, heading, width, length, x, y]
        agents_states_dim (int):
            - number of state channels x, y, cos h, sin h, vx, vy, width, length (= 8).

    Returns:
        np.ndarray:
            - shape: (num_frames, current_agents_num, agents_states_dim + 1)
            - the last channel (+1) stores the track_id.
                - [x, y, cos(heading), sin(heading), vx, vy, width, length, id]
    """
# all_frame_np_agents_local: (num_frames, current_agents_num, agents_states_dim + 1)
#   마지막 채널(+1)은 track_id 저장용
num_frames: int = len(all_frame_cur_exists_agents_local)
if num_frames == 0:
    return np.zeros((0, 0, agents_states_dim + 1), dtype=np.float64)

current_agents_num: int = int(all_frame_cur_exists_agents_local[0].shape[0])
# all_frame_np_agents_local: (T, N, 9)
all_frame_np_agents_local: np.ndarray = np.zeros(
    (num_frames, current_agents_num, agents_states_dim + 1),
    dtype=np.float64,
)

track_id_idx = AgentInternalIndex.track_token()
vx_idx = AgentInternalIndex.vx()
vy_idx = AgentInternalIndex.vy()
heading_idx = AgentInternalIndex.heading()
width_idx = AgentInternalIndex.width()
length_idx = AgentInternalIndex.length()
x_idx = AgentInternalIndex.x()
y_idx = AgentInternalIndex.y()

for t in range(num_frames):
    # frame_cur_exists_agents_local: (N, 8)
    frame_cur_exists_agents_local: np.ndarray = all_frame_cur_exists_agents_local[
        t]

    # empty_mask: (N,)
    # - track_id 를 제외한 나머지가 전부 0이면 "관측 없는 패딩 row"로 본다.
    frame_non_id = frame_cur_exists_agents_local.astype(np.float64,
                                                        copy=False).copy()
    frame_non_id[:, track_id_idx] = 0.0
    empty_mask: np.ndarray = (np.abs(frame_non_id).sum(axis=-1) == 0.0)

    # heading: (N,)
    heading = frame_cur_exists_agents_local[:, heading_idx].astype(
        np.float64, copy=False).reshape(-1)
    cos_heading = np.cos(heading)
    sin_heading = np.sin(heading)

    # (N,) 동적 값들
    x_local = frame_cur_exists_agents_local[:, x_idx].astype(
        np.float64, copy=False).reshape(-1)
    y_local = frame_cur_exists_agents_local[:, y_idx].astype(
        np.float64, copy=False).reshape(-1)
    vx_local = frame_cur_exists_agents_local[:, vx_idx].astype(
        np.float64, copy=False).reshape(-1)
    vy_local = frame_cur_exists_agents_local[:, vy_idx].astype(
        np.float64, copy=False).reshape(-1)
    width_local = frame_cur_exists_agents_local[:, width_idx].astype(
        np.float64, copy=False).reshape(-1)
    length_local = frame_cur_exists_agents_local[:, length_idx].astype(
        np.float64, copy=False).reshape(-1)

    # ✅ 패딩 row는 8개 채널 전부 0으로 고정
    cos_heading[empty_mask] = 0.0
    sin_heading[empty_mask] = 0.0
    x_local[empty_mask] = 0.0
    y_local[empty_mask] = 0.0
    vx_local[empty_mask] = 0.0
    vy_local[empty_mask] = 0.0
    width_local[empty_mask] = 0.0
    length_local[empty_mask] = 0.0

    # pack: (T, N, 9)
    all_frame_np_agents_local[t, :, 0] = x_local
    all_frame_np_agents_local[t, :, 1] = y_local
    all_frame_np_agents_local[t, :, 2] = cos_heading
    all_frame_np_agents_local[t, :, 3] = sin_heading
    all_frame_np_agents_local[t, :, 4] = vx_local
    all_frame_np_agents_local[t, :, 5] = vy_local
    all_frame_np_agents_local[t, :, 6] = width_local
    all_frame_np_agents_local[t, :, 7] = length_local
    all_frame_np_agents_local[
        t, :, 8] = frame_cur_exists_agents_local[:, track_id_idx].astype(
            np.float64, copy=False).reshape(-1)

return all_frame_np_agents_local
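
The explicit zeroing of padding rows matters because cos(0) = 1: without it, a zero-padded row would get a non-zero cos(heading) channel and no longer look empty. A small sketch of that effect, assuming the column layout given in the docstrings ([track_id, vx, vy, heading, width, length, x, y]); the constant names are hypothetical:

```python
import numpy as np

# Layout assumed from the docstrings: [track_id, vx, vy, heading, width, length, x, y].
TRACK_ID_COL, HEADING_COL = 0, 3

frame = np.array([
    [5.0, 1.0, 0.0, 0.5, 2.0, 4.5, 10.0, -3.0],  # observed agent
    [7.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0],    # padding row: only the id is set
])

non_id = frame.copy()
non_id[:, TRACK_ID_COL] = 0.0
empty_mask = np.abs(non_id).sum(axis=-1) == 0.0  # True for padding rows

cos_heading = np.cos(frame[:, HEADING_COL])
cos_before = cos_heading.copy()  # cos(0) == 1 on the padding row
cos_heading[empty_mask] = 0.0    # force the padding row back to zero
print(cos_before, cos_heading)
```

Later presence checks that test whether the first 8 channels are all zero depend on this step.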

def build_agents_past_ego_frame_array(
    past_cur_agents_world_8_list: List[np.ndarray],  # len = num_frames, each element: (frame_agents_num, 8)
    ego_cur_pose_np: np.ndarray,  # (3,)
    agents_states_dim: int,
) -> np.ndarray:
    """Build a 3D ego-relative tensor from the past + current agent states.

    Steps:
        1) `_filter_agents_array` keeps only the agents that exist in the current frame.
        2) `_convert_all_frames_agents_to_ego_local` converts each frame to the
           ego-relative frame and then, via `_pad_agent_states`, aligns every frame
           to the current-frame agent order, zero-filling agents that are missing
           in a frame.
        3) `_pack_ego_local_agents` collects
           [x, y, cos(yaw), sin(yaw), vx, vy, width, length, id] into a
           (num_frames, current_agents_num, 9) tensor.

    Args:
        past_cur_agents_world_8_list (List[np.ndarray]):
            - length: num_frames
            - each element shape: (frame_agents_num, 8)
            - [track_id, vx, vy, heading, width, length, x, y]
        ego_cur_pose_np (np.ndarray):
            - shape: (3,), [x_ego, y_ego, yaw_ego]
        agents_states_dim (int):
            - number of state channels x, y, cos h, sin h, vx, vy, width, length (= 8).

    Returns:
        np.ndarray
            - all_frame_np_agents_local (np.ndarray):
                · shape: (num_frames, current_agents_num, agents_states_dim + 1 = 9)
                · [x, y, cos(heading), sin(heading), vx, vy, width, length, id]
    """
# 1) Keep only the agents that exist in the current (last) frame
# all_frame_cur_exists_agents: len = num_frames,
#   each element shape: (frame_save_agents_num, 8)
all_frame_cur_exists_agents: List[np.ndarray] = _filter_agents_array(
    past_cur_agents_world_8_list, reverse=True)

if all_frame_cur_exists_agents[-1].shape[0] == 0:
    # Return a zero array when there are no agents in the scene.
    # Use agents_states_dim + 1 channels so the shape matches the non-empty
    # branch (the extra channel is track_id) and passes the `>= 9` channel
    # check in `_compute_valid_sorted_indices`.
    # all_frame_np_agents_local: (num_frames, 0, agents_states_dim + 1)
    all_frame_np_agents_local = np.zeros(
        (len(all_frame_cur_exists_agents), 0, agents_states_dim + 1))
else:
    # 2) (world→ego) 프레임별 에이전트 상태 변환
    """ all_frame_cur_exists_agents_local
    List[np.ndarray]:
        - 길이: num_frames
        - 각 원소 shape: (current_agents_num, 8) # current_agents_num 고정
            = [track_id, vx, vy, heading, width, length, x, y]
    """
    all_frame_cur_exists_agents_local = _convert_all_frames_agents_to_ego_local(
        all_frame_cur_exists_agents=all_frame_cur_exists_agents,
        ego_cur_pose_np=ego_cur_pose_np,
    )

    # 3) (프레임 × 에이전트 × 특성) 3D 텐서로 패킹
    # (num_frames, current_agents_num, agents_states_dim + 1 = 9)
    #  [x, y, cos(heading), sin(heading), vx, vy, width, length, id]
    all_frame_np_agents_local = _pack_ego_local_agents(
        all_frame_cur_exists_agents_local=all_frame_cur_exists_agents_local,
        agents_states_dim=agents_states_dim,
    )

return all_frame_np_agents_local

def _build_present_static_feature_6(
    present_static_feat_5: np.ndarray,  # (cur_static_num, 5)
    ego_cur_pose_np: np.ndarray,  # (3,)
) -> np.ndarray:
    """Expand the static objects of the current frame into a 6-dim ego-relative representation.

    - input:  [x, y, heading, width, length] (world coordinates)
    - output: [x, y, cos(heading), sin(heading), width, length] (ego-relative)

    Args:
        present_static_feat_5 (np.ndarray):
            - shape: (cur_static_num, 5)
        ego_cur_pose_np (np.ndarray):
            - shape: (3,), [x_ego, y_ego, yaw_ego]

    Returns:
        np.ndarray:
            - shape: (cur_static_num, 6)
            - [x, y, cos(heading), sin(heading), width, length]
    """
    # present_static_feature_6: (cur_static_num, 6)
    present_static_feature_6 = np.zeros((present_static_feat_5.shape[0], 6))
    if present_static_feat_5.shape[0] != 0:
        # present_static_feature_local: (cur_static_num, 5)
        present_static_feature_local = convert_absolute_quantities_to_relative(
            present_static_feat_5, ego_cur_pose_np, 'static')
        heading_local = present_static_feature_local[:, 2]

        present_static_feature_6[:, 0] = present_static_feature_local[:, 0]  # x
        present_static_feature_6[:, 1] = present_static_feature_local[:, 1]  # y
        present_static_feature_6[:, 2] = np.cos(heading_local)
        present_static_feature_6[:, 3] = np.sin(heading_local)
        present_static_feature_6[:, 4] = present_static_feature_local[:, 3]  # width
        present_static_feature_6[:, 5] = present_static_feature_local[:, 4]  # length

    return present_static_feature_6

from typing import List, Optional, Tuple  # safe to drop if these are already imported at the top of the file

def _filter_out_neighbors_not_present_at_current(
    neighbor_agents_past: np.ndarray,  # shape: (K, T, 11)
    agents_cur_frame_indices: np.ndarray,  # shape: (K,)
    neighbors_id: np.ndarray,  # shape: (K,)
    *,
    eps: float = 1e-8,
) -> Tuple[np.ndarray, np.ndarray, np.ndarray]:
    """Keep only the neighbor agents that actually exist in the current frame.

    Here the "current frame" is the last time index (T-1) of neighbor_agents_past.

    Criterion for "exists in the current frame"
    -------------------------------------------
    - neighbor_agents_past[k, -1, :8] = [x, y, cos, sin, vx, vy, width, length]
      If all 8 values are 0 (or within eps of 0), neighbor k is considered
      absent from the current frame.
    - Note: the type one-hot (last 3 channels) is copied along the whole time
      axis, so it can be non-zero even when the dynamic state is all 0.
      The check therefore uses only the first 8 channels.

    Why this function is needed
    ---------------------------
    In rare edge cases (for example 0-padding mixed in upstream), a neighbor
    whose current frame is all 0 can end up in the selection. Such a neighbor
    could then remain as an "all zero over time" entry in later stages.
    These neighbors should not appear in the result tensor at all, so they
    are removed along the agent axis here.

    Args:
        neighbor_agents_past (np.ndarray):
            shape: (K, T, 11)
            - K: number of selected neighbors
            - T: number of past + current frames
        agents_cur_frame_indices (np.ndarray):
            shape: (K,)
            - original indices of the selected neighbors in the current frame
        neighbors_id (np.ndarray):
            shape: (K,)
            - track_id of the selected neighbors
        eps (float):
            threshold below which an absolute value is treated as 0

    Returns:
        Tuple[np.ndarray, np.ndarray, np.ndarray]:
            - filtered_neighbor_agents_past: shape (K_valid, T, 11)
            - filtered_agents_cur_frame_indices: shape (K_valid,)
            - filtered_neighbors_id: shape (K_valid,)
    """
if neighbor_agents_past.ndim != 3 or neighbor_agents_past.shape[-1] != 11:
    raise ValueError(
        f"`neighbor_agents_past`는 (K, T, 11) shape 이어야 합니다. got {neighbor_agents_past.shape}"
    )
if agents_cur_frame_indices.ndim != 1 or neighbors_id.ndim != 1:
    raise ValueError(
        "`agents_cur_frame_indices`와 `neighbors_id`는 1차원 배열이어야 합니다.")
if neighbor_agents_past.shape[0] != agents_cur_frame_indices.shape[0] or \
   neighbor_agents_past.shape[0] != neighbors_id.shape[0]:
    raise ValueError(
        "세 입력의 agent 축 길이가 서로 달라서 정렬/필터링이 불가능합니다. "
        f"K={neighbor_agents_past.shape[0]}, "
        f"indices={agents_cur_frame_indices.shape[0]}, ids={neighbors_id.shape[0]}"
    )

K: int = int(neighbor_agents_past.shape[0])
if K == 0:
    return neighbor_agents_past, agents_cur_frame_indices, neighbors_id

# current_state_8: (K, 8)
current_state_8: np.ndarray = neighbor_agents_past[:, -1, :8]

# current_present_mask: (K,)
# - 현재 프레임에서 8개 값 중 하나라도 0이 아니면 "존재"
current_present_mask: np.ndarray = (np.abs(current_state_8)
                                    > eps).any(axis=1)

return (
    neighbor_agents_past[current_present_mask],
    agents_cur_frame_indices[current_present_mask],
    neighbors_id[current_present_mask],
)

def _compute_valid_sorted_indices(
    all_frame_np_agents_local: np.ndarray,  # shape: (num_frames, current_agents_num, 9)
    filter_radius: Optional[float],
) -> Tuple[np.ndarray, np.ndarray]:
    """Apply filter_radius, then sort the agent indices by distance to the ego.

    In addition, agents that do not exist in the current frame (current state
    all 0) are excluded from the candidates entirely.

    Args:
        all_frame_np_agents_local:
            - shape: (num_frames, current_agents_num, 9)
            - the last channel (index 8) is the track_id
            - the first 8 channels are [x, y, cos, sin, vx, vy, width, length]
        filter_radius:
            - If not None, only agents with ego distance <= filter_radius are candidates.

    Returns:
        sorted_cur_agent_indices:
            - shape: (M,)  # M = number of valid candidates
            - current-frame indices sorted from nearest to farthest from the ego.
        dist_from_cur_agent_to_ego:
            - shape: (current_agents_num,)
            - 2D distance of every agent to the ego (including non-candidates).
    """
if all_frame_np_agents_local.ndim != 3 or all_frame_np_agents_local.shape[
        -1] < 9:
    raise ValueError(
        f"`all_frame_np_agents_local`는 (T, N, 9) shape 이어야 합니다. got {all_frame_np_agents_local.shape}"
    )

# 현재 프레임(마지막 프레임 기준)에서 ego까지의 거리: (current_agents_num,)
dist_from_cur_agent_to_ego: np.ndarray = np.linalg.norm(
    all_frame_np_agents_local[-1, :, :2], axis=-1)

current_agents_num: int = int(all_frame_np_agents_local.shape[1])

# (핵심) 현재 프레임 존재 여부 판단: (current_agents_num,)
# - [x, y, cos, sin, vx, vy, width, length] 이 전부 0이면 "현재에 없음"
eps: float = 1e-8
current_state_8: np.ndarray = all_frame_np_agents_local[-1, :, :8]  # (N, 8)
present_mask: np.ndarray = (np.abs(current_state_8)
                            > eps).any(axis=1)  # (N,)

# filter_radius 내의 에이전트만 후보로 사용
if filter_radius is not None:
    within_radius = dist_from_cur_agent_to_ego <= float(
        filter_radius)  # (N,)
    valid_mask = present_mask & within_radius
else:
    valid_mask = present_mask

valid_indices: np.ndarray = np.nonzero(valid_mask)[0].astype(int)  # (M,)

if valid_indices.size == 0:
    # 유효한 에이전트가 하나도 없는 경우
    return np.zeros((0,), dtype=int), dist_from_cur_agent_to_ego

# 유효한 에이전트들만 뽑아서 거리 기준 정렬
dist_valid: np.ndarray = dist_from_cur_agent_to_ego[valid_indices]  # (M,)
order_local: np.ndarray = np.argsort(dist_valid)  # (M,)
sorted_cur_agent_indices: np.ndarray = valid_indices[order_local]  # (M,)

return sorted_cur_agent_indices, dist_from_cur_agent_to_ego
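
The combination of presence mask, radius filter and distance sort can be checked on toy data. This is a simplified sketch operating on the current frame only; `valid_sorted_indices` is a hypothetical name, and the input is assumed to use the [x, y, cos, sin, vx, vy, width, length] layout.

```python
from typing import Optional, Tuple

import numpy as np


def valid_sorted_indices(
    current_state_8: np.ndarray,  # (N, 8): [x, y, cos, sin, vx, vy, width, length]
    filter_radius: Optional[float] = None,
    eps: float = 1e-8,
) -> Tuple[np.ndarray, np.ndarray]:
    """Return indices of present, in-radius agents sorted by distance, plus all distances."""
    dist = np.linalg.norm(current_state_8[:, :2], axis=-1)  # (N,)
    valid = (np.abs(current_state_8) > eps).any(axis=1)      # present in the current frame
    if filter_radius is not None:
        valid &= dist <= filter_radius
    idx = np.nonzero(valid)[0]
    return idx[np.argsort(dist[idx])], dist


state = np.zeros((4, 8))
state[0, :3] = [30.0, 0.0, 1.0]
state[1, :3] = [3.0, 4.0, 1.0]
# row 2 stays all-zero: a padding row
state[3, :3] = [100.0, 0.0, 1.0]

order, dist = valid_sorted_indices(state, filter_radius=50.0)
print(order, dist)
```

Without the presence mask, the padding row would sit at distance 0 and be ranked as the nearest neighbor, which is why it is removed before sorting.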

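def _select_indices_with_type_cap(
    sorted_cur_agent_indices: np.ndarray,  # (M,)
    current_agent_types_list: List[TrackedObjectType],
    max_agent_num: int,
    max_pedestrians: int,
    max_bicycles: int,
    dist_from_cur_agent_to_ego: np.ndarray,  # (current_agents_num,)
) -> np.ndarray:
    """Select agents while applying upper limits on pedestrians and bicycles.

    Procedure:
        1) Split sorted_cur_agent_indices into three groups by type:
           pedestrian / bicycle / vehicle. Agents of any other type are not selected.
        2) Fill the slots in the order pedestrian → bicycle → vehicle, capped by
           max_pedestrians / max_bicycles / the remaining slots, respectively.
        3) Re-sort the final selection by distance to the ego and use at most
           max_agent_num entries.

    Args:
        sorted_cur_agent_indices:
            - shape: (M,), candidate indices sorted by distance to the ego.
        current_agent_types_list:
            - type of every current-frame agent, indexed by agent index.
        max_agent_num:
            - upper limit on the total number of selected agents.
        max_pedestrians / max_bicycles:
            - upper limits on the number of selected pedestrians / bicycles.
        dist_from_cur_agent_to_ego:
            - shape: (current_agents_num,), 2D distance of every agent to the ego.

    Returns:
        agents_cur_frame_indices:
            - shape: (K,), K ≤ max_agent_num
    """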
if max_agent_num <= 0:
    return np.zeros((0,), dtype=int)

# 타입별로 거리 오름차순 리스트 분리
ped_sorted_indices = [
    idx for idx in sorted_cur_agent_indices
    if current_agent_types_list[idx] == TrackedObjectType.PEDESTRIAN
]
bike_sorted_indices = [
    idx for idx in sorted_cur_agent_indices
    if current_agent_types_list[idx] == TrackedObjectType.BICYCLE
]
vehicle_sorted_indices = [
    idx for idx in sorted_cur_agent_indices
    if current_agent_types_list[idx] == TrackedObjectType.VEHICLE
]

# 보행자/자전거 상한 적용 (max_agent_num을 넘지 않도록)
ped_cap = min(max_pedestrians, max_agent_num)
sel_peds = ped_sorted_indices[:ped_cap]

remain = max_agent_num - len(sel_peds)
bike_cap = min(max_bicycles, remain)
sel_bikes = bike_sorted_indices[:bike_cap]

remain -= len(sel_bikes)
sel_vehs = vehicle_sorted_indices[:max(0, remain)]

# First-pass selection result
selected_indices = sel_peds + sel_bikes + sel_vehs

if not selected_indices:
    return np.zeros((0,), dtype=int)

# 최종 후보를 ego-거리 기준으로 다시 정렬 후 max_agent_num 까지만 사용
agents_cur_frame_indices = np.array(
    sorted(
        selected_indices,
        key=lambda idx: dist_from_cur_agent_to_ego[idx],
    )[:max_agent_num],
    dtype=int,
)
return agents_cur_frame_indices
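
The slot-filling order has a visible effect on which agents survive. A minimal sketch of the same procedure on plain Python lists; `AgentType` is a hypothetical stand-in for `TrackedObjectType`, and `select_with_caps` is an illustrative name, not part of the code above.

```python
from enum import Enum
from typing import List, Sequence


class AgentType(Enum):  # hypothetical stand-in for TrackedObjectType
    VEHICLE = 0
    PEDESTRIAN = 1
    BICYCLE = 2


def select_with_caps(sorted_idx: Sequence[int], types: Sequence[AgentType],
                     dist: Sequence[float], max_agents: int, max_peds: int,
                     max_bikes: int) -> List[int]:
    """Fill slots pedestrian -> bicycle -> vehicle, then re-sort by distance."""
    if max_agents <= 0:
        return []

    def of_type(agent_type: AgentType) -> List[int]:
        return [i for i in sorted_idx if types[i] is agent_type]

    peds = of_type(AgentType.PEDESTRIAN)[:min(max_peds, max_agents)]
    remain = max_agents - len(peds)
    bikes = of_type(AgentType.BICYCLE)[:min(max_bikes, remain)]
    remain -= len(bikes)
    vehicles = of_type(AgentType.VEHICLE)[:max(0, remain)]
    return sorted(peds + bikes + vehicles, key=lambda i: dist[i])


P, B, V = AgentType.PEDESTRIAN, AgentType.BICYCLE, AgentType.VEHICLE
types = [P, P, P, V, B, V]
dist = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
selected = select_with_caps(range(6), types, dist, max_agents=4, max_peds=2, max_bikes=1)
print(selected)  # [0, 1, 3, 4]
```

The third pedestrian (index 2) is dropped by the pedestrian cap even though it is closer than the selected vehicle and bicycle. Because pedestrians and bicycles are filled first, a large cap combined with many nearby pedestrians can leave few or no slots for vehicles.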

def _build_neighbor_vectors(
    all_frame_np_agents_local: np.ndarray,  # (num_frames, current_agents_num, 9)
    current_agent_types_list: List[TrackedObjectType],  # length = current_agents_num
    agents_states_dim: int,
    agents_cur_frame_indices: np.ndarray,  # (chosen_agent_num,)
) -> Tuple[np.ndarray, np.ndarray]:
    """Build the neighbor trajectory tensor and the track_id vector for the selected agent indices.

    Input tensor layout
    -------------------
    - all_frame_np_agents_local: (T, N, 9)
        · T: num_frames (number of past + current time steps)
        · N: number of agents that survived in the current frame
        · channels: x, y, cos(heading), sin(heading), vx, vy, width, length, id

    - current_agent_types_list: list of length N.
        · For each index i, one of TrackedObjectType.VEHICLE / PEDESTRIAN / BICYCLE.

    - agents_cur_frame_indices: (K,)
        · Row indices, in the agent array of the current frame (last step T-1),
          of the agents selected as neighbors.
        · This order becomes the agent-axis order of the output tensor.

    Algorithm overview
    ------------------
    1) Copy the 8 dynamic state channels (fully vectorized)
        - Pick the selected indices from the candidate tensor as (T, K, 8),
          then swap axes to get (K, T, 8).
        - Write this into neighbor_agents_past[:, :, :8].

    2) Extract the track_id (vectorized)
        - Take the id channel (8) of the selected indices at the current
          frame T-1 in one go to form neighbors_id (K,).

    3) Build the type one-hot (partly vectorized + broadcast)
        - Convert current_agent_types_list to a numpy array and index it with
          agents_cur_frame_indices to get a length-K neighbors_types array.
        - Build three boolean masks from
          neighbors_types == VEHICLE / PEDESTRIAN / BICYCLE and fill
          type_one_hot[mask, col] = 1.0 in one step per type.
        - Finally expand (K, 3) → (K, 1, 3) and broadcast along the time
          axis (T) into neighbor_agents_past[:, :, 8:].

    Args:
        all_frame_np_agents_local:
            ego-frame past + current sequence of the neighbor candidates.
            - shape: (num_frames, current_agents_num, 9)
        current_agent_types_list:
            - length: current_agents_num
            - TrackedObjectType value for each agent index.
        agents_states_dim:
            - number of dynamic state channels (= 8).
        agents_cur_frame_indices:
            - shape: (chosen_agent_num,)
            - row indices of the agents selected in the current frame.

    Returns:
        neighbor_agents_past:
            - shape: (chosen_agent_num, num_frames, 11)
            - [x, y, cos, sin, vx, vy, width, length,
               onehot_vehicle, onehot_ped, onehot_bike]
            - the track_id is not stored in this tensor; it is returned separately.
        neighbors_id:
            - shape: (chosen_agent_num,)
            - track_id of each neighbor agent (taken from the current frame).
    """
num_frames: int = all_frame_np_agents_local.shape[0]
agents_cur_frame_indices = np.asarray(agents_cur_frame_indices, dtype=int)
chosen_agent_num: int = int(agents_cur_frame_indices.shape[0])

if chosen_agent_num == 0:
    neighbor_agents_past = np.zeros((0, num_frames, agents_states_dim + 3),
                                    dtype=np.float32)
    neighbors_id = np.zeros((0,), dtype=np.float32)
    return neighbor_agents_past, neighbors_id

eight_ = agents_states_dim

# 1) 동적 상태 8차원 배치 추출 및 축 재배치
# (T, N, 9) → (T, chosen_agent_num, 8) → (chosen_agent_num, T, 8)
dynamic_states: np.ndarray = all_frame_np_agents_local[:,
                                                       agents_cur_frame_indices, :
                                                       eight_].transpose(
                                                           1, 0, 2)

# 출력 텐서 초기화
neighbor_agents_past = np.zeros(
    (chosen_agent_num, num_frames, agents_states_dim + 3),
    dtype=np.float32,
)
neighbor_agents_past[:, :, :eight_] = dynamic_states

# 2) track_id 벡터 배치 추출 (현재 프레임 = 마지막 프레임 기준)
# shape: (chosen_agent_num,)
neighbors_id = all_frame_np_agents_local[-1, agents_cur_frame_indices,
                                         eight_].astype(np.float32)

# 3) 타입 one-hot (chosen_agent_num, 3) 생성 → (chosen_agent_num, T, 3) 브로드캐스트
type_one_hot = np.zeros((chosen_agent_num, 3), dtype=np.float32)

# current_agent_types_list 를 배열로 바꾸고, 선택된 인덱스만 추출
# neighbors_types: shape (chosen_agent_num,), dtype=object (TrackedObjectType 인스턴스들)
neighbors_types = np.asarray(current_agent_types_list,
                             dtype=object)[agents_cur_frame_indices]

veh_mask = neighbors_types == TrackedObjectType.VEHICLE
ped_mask = neighbors_types == TrackedObjectType.PEDESTRIAN
bike_mask = neighbors_types == TrackedObjectType.BICYCLE

# Set the matching column to 1 for each type (one vectorized assignment per type)
type_one_hot[veh_mask, 0] = 1.0
type_one_hot[ped_mask, 1] = 1.0
type_one_hot[bike_mask, 2] = 1.0

# Expand (chosen_agent_num, 3) → (chosen_agent_num, 1, 3) so it broadcasts along the time axis (T)
# → (chosen_agent_num, T, 3)
neighbor_agents_past[:, :, eight_:] = type_one_hot[:, None, :]

return neighbor_agents_past, neighbors_id
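
The one-hot broadcast in step 3 can be checked in isolation. The following is a minimal sketch, assuming integer type codes (`VEHICLE`, `PEDESTRIAN`, `BICYCLE`) as hypothetical stand-ins for the `TrackedObjectType` members and a hypothetical helper name `append_type_one_hot`; it is not the function above, only the same indexing pattern.

```python
import numpy as np

# Hypothetical type codes standing in for TrackedObjectType members.
VEHICLE, PEDESTRIAN, BICYCLE = 0, 1, 2


def append_type_one_hot(dynamic_states: np.ndarray, types: np.ndarray) -> np.ndarray:
    """dynamic_states: (K, T, 8); types: (K,) -> (K, T, 11)."""
    k, t, d = dynamic_states.shape
    out = np.zeros((k, t, d + 3), dtype=np.float32)
    out[:, :, :d] = dynamic_states

    one_hot = np.zeros((k, 3), dtype=np.float32)
    one_hot[types == VEHICLE, 0] = 1.0
    one_hot[types == PEDESTRIAN, 1] = 1.0
    one_hot[types == BICYCLE, 2] = 1.0

    # (K, 3) -> (K, 1, 3); NumPy broadcasts the singleton axis over all T frames.
    out[:, :, d:] = one_hot[:, None, :]
    return out


states = np.ones((2, 4, 8), dtype=np.float32)
feat = append_type_one_hot(states, np.array([VEHICLE, BICYCLE]))
print(feat.shape)                 # (2, 4, 11)
print(feat[1, 0, 8:].tolist())    # [0.0, 0.0, 1.0]
```

Because the type is constant per agent, the same three values appear in every frame, including frames where the dynamic state is zero-padded.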

def _select_neighbor_agents_and_build_past(
    all_frame_np_agents_local: np.ndarray,  # (num_frames, current_agents_num, 9)
    current_agent_types_list: List[TrackedObjectType],  # length = current_agents_num
    agents_states_dim: int,
    max_agent_num: int,
    max_pedestrians: Optional[int],
    max_bicycles: Optional[int],
    filter_radius: Optional[float] = None,
) -> Tuple[np.ndarray, np.ndarray, np.ndarray]:
"""Select the agents nearest to ego from the ego-frame agent tensor and build the neighbor trajectory tensor.

Notes:
    - `max_agent_num` is only an upper bound: chosen_agent_num ≤ max_agent_num.
    - If either `max_pedestrians` or `max_bicycles` is None, all per-type caps are
      disabled and agents are picked purely in order of distance.
    - If `filter_radius` is given, only agents within that distance of ego are
      candidates (everything farther away is ignored entirely).

Returns:
    neighbor_agents_past: (chosen_agent_num, num_frames, 11)
    agents_cur_frame_indices: (chosen_agent_num,), indices into the current frame
    neighbors_id:         (chosen_agent_num,)
"""
num_frames: int = all_frame_np_agents_local.shape[0]

# 0) Apply the radius filter and compute distance-sorted indices
"""
    sorted_cur_agent_indices: shape: (M,)  # M = number of valid agents
        - Current-frame indices sorted from nearest to farthest from ego.
    dist_from_cur_agent_to_ego: shape: (current_agents_num,)
        - 2D distance from ego to each agent.
"""
sorted_cur_agent_indices, dist_from_cur_agent_to_ego = \
    _compute_valid_sorted_indices(
        all_frame_np_agents_local=all_frame_np_agents_local, # (num_frames, current_agents_num, 9)
        filter_radius=filter_radius,
    )

# No valid candidates, or max_agent_num is 0 or negative
if sorted_cur_agent_indices.size == 0 or max_agent_num <= 0:
    neighbor_agents_past = np.zeros(
        (0, num_frames, agents_states_dim + 3),
        dtype=np.float32,
    )
    agents_cur_frame_indices = np.zeros((0,), dtype=int)
    neighbors_id = np.zeros((0,), dtype=np.float32)
    return neighbor_agents_past, agents_cur_frame_indices, neighbors_id

# 1) Per-type caps disabled: select by distance only
if (max_pedestrians is None) or (max_bicycles is None):
    # agents_cur_frame_indices: shape: (chosen_agent_num,), chosen_agent_num ≤ max_agent_num
    agents_cur_frame_indices = sorted_cur_agent_indices[:max_agent_num]
else:
    # 2) Per-type caps enabled
    # agents_cur_frame_indices: shape: (chosen_agent_num,), chosen_agent_num ≤ max_agent_num
    agents_cur_frame_indices = _select_indices_with_type_cap(
        sorted_cur_agent_indices=sorted_cur_agent_indices,
        current_agent_types_list=current_agent_types_list,  # length = current_agents_num
        max_agent_num=max_agent_num,
        max_pedestrians=max_pedestrians,
        max_bicycles=max_bicycles,
        dist_from_cur_agent_to_ego=dist_from_cur_agent_to_ego,
    )

# Number of agents actually selected (chosen_agent_num)
chosen_agent_num = int(agents_cur_frame_indices.shape[0])
if chosen_agent_num == 0:
    neighbor_agents_past = np.zeros((0, num_frames, agents_states_dim + 3),
                                    dtype=np.float32)
    neighbors_id = np.zeros((0,), dtype=np.float32)
    return neighbor_agents_past, np.array([], dtype=int), neighbors_id

# 3) Build the final tensors from the selected indices
"""
    neighbor_agents_past: shape: (chosen_agent_num, num_frames, 11)
        - [x, y, cos, sin, vx, vy, width, length,
           onehot_vehicle, onehot_ped, onehot_bike]
    neighbors_id: shape: (chosen_agent_num,)
        - track_id of each neighbor agent (taken at the current frame).
"""
neighbor_agents_past, neighbors_id = _build_neighbor_vectors(
    all_frame_np_agents_local=all_frame_np_agents_local,  # (num_frames, current_agents_num, 9)
    current_agent_types_list=current_agent_types_list,  # length = current_agents_num
    agents_states_dim=agents_states_dim,
    agents_cur_frame_indices=agents_cur_frame_indices,  # (chosen_agent_num,)
)

# Normalize to a plain integer index array, shape (K,)
agents_cur_frame_indices = np.asarray(agents_cur_frame_indices, dtype=int)
return neighbor_agents_past, agents_cur_frame_indices, neighbors_id
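
The two helpers this function calls (`_compute_valid_sorted_indices` and `_select_indices_with_type_cap`) are not shown in these notes. The sketch below is only an assumption about what such a selection could look like: `nearest_indices` and `cap_by_type` are hypothetical names, types are plain strings, and the distance is taken from current-frame (x, y) only.

```python
from typing import Dict, Optional, Sequence

import numpy as np


def nearest_indices(xy: np.ndarray, max_num: int,
                    filter_radius: Optional[float] = None) -> np.ndarray:
    """xy: (N, 2) ego-frame positions at the current frame -> indices, nearest first."""
    dist = np.linalg.norm(xy, axis=-1)
    valid = np.arange(xy.shape[0])
    if filter_radius is not None:
        valid = valid[dist <= filter_radius]          # drop everything outside the radius
    order = valid[np.argsort(dist[valid], kind="stable")]
    return order[:max(max_num, 0)]


def cap_by_type(sorted_idx: np.ndarray, types: Sequence[str], max_num: int,
                caps: Dict[str, int]) -> np.ndarray:
    """Walk nearest-first and skip an agent once its type has reached its cap."""
    chosen, used = [], {}
    for i in sorted_idx:
        if len(chosen) >= max_num:
            break
        t = types[int(i)]
        if t in caps and used.get(t, 0) >= caps[t]:
            continue
        chosen.append(int(i))
        used[t] = used.get(t, 0) + 1
    return np.asarray(chosen, dtype=int)


xy = np.array([[10.0, 0.0], [1.0, 1.0], [0.0, 50.0], [3.0, 4.0]])
types = ["ped", "veh", "ped", "ped"]
order = nearest_indices(xy, max_num=4)
print(order.tolist())                                        # [1, 3, 0, 2]
print(nearest_indices(xy, 3, filter_radius=6.0).tolist())    # [1, 3]
print(cap_by_type(order, types, 3, {"ped": 1}).tolist())     # [1, 3]
```

With a cap in place the result can be shorter than `max_num` even when more agents are in range, which matches the "upper bound only" note in the docstring above.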

from typing import Optional  # likely already imported above

def _build_static_objects(
    present_static_feature_6: np.ndarray,  # (cur_static_num, 6)
    static_types_list: List[TrackedObjectType],  # length = cur_static_num
    max_static_num: int,
    filter_radius: Optional[float] = None,
) -> np.ndarray:
"""Sort static objects by distance to ego and append a type one-hot to each row.

- Selects at most `max_static_num` objects, nearest to ego first.
- If `filter_radius` is given, only objects with distance to ego <= filter_radius are candidates.
- If fewer than `max_static_num` objects qualify, only that many rows are returned (no zero-padding).

Returns:
    static_objects: (K, 10), K = min(number of valid static objects, max_static_num)
"""
cur_static_num = int(present_static_feature_6.shape[0])

if cur_static_num == 0 or max_static_num <= 0:
    return np.zeros((0, present_static_feature_6.shape[-1] + 4),
                    dtype=np.float32)

# Distance to ego (the features are already in the ego frame, so this is the norm of (x, y))
static_distance_to_ego = np.linalg.norm(
    present_static_feature_6[:, :2],
    axis=-1,
)  # (cur_static_num,)

# Apply filter_radius
if filter_radius is not None:
    valid_mask = static_distance_to_ego <= float(filter_radius)
    valid_indices = np.nonzero(valid_mask)[0]
else:
    valid_indices = np.arange(cur_static_num, dtype=int)

if valid_indices.size == 0:
    return np.zeros((0, present_static_feature_6.shape[-1] + 4),
                    dtype=np.float32)

# Sort the valid objects by ascending distance
dist_valid = static_distance_to_ego[valid_indices]
order_local = np.argsort(dist_valid)
sorted_indices = valid_indices[order_local]

# Number of objects actually used (K)
K = min(len(sorted_indices), max_static_num)

static_objects = np.zeros(
    (K, present_static_feature_6.shape[-1] + 4),
    dtype=np.float32,
)
six_ = present_static_feature_6.shape[-1]

for i, j in enumerate(sorted_indices[:K]):
    static_objects[i, :six_] = present_static_feature_6[j, :six_]
    if static_types_list[j] == TrackedObjectType.CZONE_SIGN:
        static_objects[i, six_:] = [1, 0, 0, 0]
    elif static_types_list[j] == TrackedObjectType.BARRIER:
        static_objects[i, six_:] = [0, 1, 0, 0]
    elif static_types_list[j] == TrackedObjectType.TRAFFIC_CONE:
        static_objects[i, six_:] = [0, 0, 1, 0]
    else:
        static_objects[i, six_:] = [0, 0, 0, 1]

return static_objects
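
The per-row `if/elif` chain above maps three known types to their own column and everything else to a fourth "generic" column. As an illustration only, the same mapping can be written as a lookup into an identity matrix. The integer codes below are hypothetical stand-ins for `TrackedObjectType` members, and `static_one_hot` is not part of the original code.

```python
import numpy as np

# Hypothetical integer codes; the source compares TrackedObjectType members instead.
CZONE_SIGN, BARRIER, TRAFFIC_CONE, GENERIC = 0, 1, 2, 3


def static_one_hot(type_codes: np.ndarray) -> np.ndarray:
    """type_codes: (K,) ints -> (K, 4) one-hot; unknown codes land in the GENERIC column."""
    known = np.isin(type_codes, [CZONE_SIGN, BARRIER, TRAFFIC_CONE])
    codes = np.where(known, type_codes, GENERIC)
    # Row i of the 4x4 identity matrix is the one-hot vector for code i.
    return np.eye(4, dtype=np.float32)[codes]


print(static_one_hot(np.array([BARRIER, 7, TRAFFIC_CONE])).tolist())
# [[0.0, 1.0, 0.0, 0.0], [0.0, 0.0, 0.0, 1.0], [0.0, 0.0, 1.0, 0.0]]
```

For the handful of static objects per frame the explicit loop is perfectly adequate; the lookup form mainly makes the "everything else is generic" rule visible in one line.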

def build_static_feature(
    present_static_feat_5: np.ndarray,  # (cur_static_num, 5)
    static_types_list: List[TrackedObjectType],  # length = cur_static_num
    max_static_num: int,
    ego_cur_pose_np: np.ndarray,  # (3,)
    filter_radius: Optional[float] = None,
) -> np.ndarray:
"""Transform the current frame's static objects into the ego frame and keep at most max_static_num of them, nearest first.

Processing order
----------------
1) Transform `present_static_feat_5` (world frame) into the ego frame
   with `_build_present_static_feature_6`:
   - input:  [x, y, heading, width, length]
   - output: [x, y, cos(heading), sin(heading), width, length]
   → shape: (cur_static_num, 6)

2) Call `_build_static_objects`, which selects static objects based on
   - the 2D distance to ego,
   - `filter_radius`,
   - `max_static_num`
   and appends a type one-hot.
   - final output: (K, 10)
     · [x, y, cos, sin, width, length,
        onehot_CZONE, onehot_BARRIER, onehot_CONE, onehot_GENERIC]
     · K = min(number of valid static objects, max_static_num)

Args:
    present_static_feat_5:
        - shape: (cur_static_num, 5)
        - [x, y, heading, width, length] (world frame)
    static_types_list:
        - length: cur_static_num
        - Type of each static object (TrackedObjectType).
    max_static_num:
        - Upper bound on the number of static objects to select.
        - The actual K satisfies K ≤ max_static_num.
    ego_cur_pose_np:
        - shape: (3,), [x_ego, y_ego, yaw_ego]
    filter_radius:
        - If not None, only static objects with distance to ego <= filter_radius are candidates.
        - If None, there is no distance limit.

Returns:
    np.ndarray:
        - static_objects
        - shape: (K, 10)
        - [x, y, cos, sin, width, length,
           onehot_CZONE, onehot_BARRIER, onehot_CONE, onehot_GENERIC]
"""
# (cur_static_num, 6)
present_static_feature_6 = _build_present_static_feature_6(
    present_static_feat_5=present_static_feat_5,
    ego_cur_pose_np=ego_cur_pose_np,
)

static_objects = _build_static_objects(
    present_static_feature_6=present_static_feature_6,
    static_types_list=static_types_list,
    max_static_num=max_static_num,
    filter_radius=filter_radius,
)
return static_objects
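
`_build_present_static_feature_6` itself does not appear in these notes. Under the assumption that it is a plain SE(2) change of frame, step 1 of the docstring can be sketched as follows; `static_world_to_ego` is a hypothetical name, and the real helper may differ in details such as angle wrapping.

```python
import numpy as np


def static_world_to_ego(feat5: np.ndarray, ego_pose: np.ndarray) -> np.ndarray:
    """feat5: (N, 5) [x, y, heading, width, length] in the world frame; ego_pose: (3,).

    Returns (N, 6) [x, y, cos(heading), sin(heading), width, length] in the ego frame.
    """
    x0, y0, yaw0 = ego_pose
    c, s = np.cos(yaw0), np.sin(yaw0)
    dx, dy = feat5[:, 0] - x0, feat5[:, 1] - y0
    # Rotate the offset by -yaw0, i.e. apply the inverse of the ego rotation.
    x_l = c * dx + s * dy
    y_l = -s * dx + c * dy
    h_l = feat5[:, 2] - yaw0
    return np.stack(
        [x_l, y_l, np.cos(h_l), np.sin(h_l), feat5[:, 3], feat5[:, 4]], axis=1)


ego = np.array([10.0, 5.0, np.pi / 2])               # ego faces +y in the world frame
obj = np.array([[10.0, 8.0, np.pi / 2, 1.0, 2.0]])   # 3 m ahead of ego, same heading
print(np.round(static_world_to_ego(obj, ego), 6).tolist())
# [[3.0, 0.0, 1.0, 0.0, 1.0, 2.0]]
```

An object straight ahead of ego ends up on the positive x axis of the ego frame, which is why `_build_static_objects` can use the norm of the first two columns as the distance to ego.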

def build_neighbor_past_feature(
    past_cur_agents_world_8_list: List[np.ndarray],
    past_cur_agents_types_list: List[List[TrackedObjectType]],
    max_agent_num: int,
    ego_cur_pose_np: np.ndarray,  # (3,)
    max_pedestrians: Optional[int],
    max_bicycles: Optional[int],
    token_to_id: Dict[str, int],
    filter_radius: Optional[float] = None,
) -> Tuple[np.ndarray, np.ndarray, np.ndarray, List[str]]:
"""Transform neighbor agents' past + current trajectories into the ego frame, keep only nearby agents, and pack them into a (chosen_agent_num, T, 11) tensor.

Additional guarantee (important)
--------------------------------
- After selection, neighbors are filtered once more so that the final output never
  contains a neighbor whose current frame is all zeros.
  (This guards against rare data anomalies and stray zero-padding rows.)

Args:
    past_cur_agents_world_8_list:
        - length: num_frames
        - each element shape: (frame_agents_num, 8)
        - [track_id, vx, vy, heading, width, length, x, y] (world frame)
    past_cur_agents_types_list:
        - length: num_frames
        - Only the type information of the current frame (last element) is used.
    max_agent_num:
        - Upper bound on the number of neighbor agents to select.
    ego_cur_pose_np:
        - shape: (3,), [x_ego, y_ego, yaw_ego]
    max_pedestrians / max_bicycles:
        - None disables the per-type caps; an integer enables them.
    token_to_id:
        - mapping track_token → int ID
    filter_radius:
        - If not None, only agents with distance to ego <= filter_radius are candidates.

Returns:
    Tuple[np.ndarray, np.ndarray, np.ndarray, List[str]]:
        - neighbor_agents_past:
            · shape: (chosen_agent_num, num_frames, 11)
            · [x, y, cos, sin, vx, vy, width, length, onehot_vehicle, onehot_ped, onehot_bike]
        - agents_cur_frame_indices:
            · shape: (chosen_agent_num,)
        - neighbors_id:
            · shape: (chosen_agent_num,)
        - neighbor_track_token:
            · length: chosen_agent_num
"""
agents_states_dim = 8  # x, y, cos h, sin h, vx, vy, width, length

# Type information at the current frame (length = current_agents_num)
current_agent_types_list = past_cur_agents_types_list[-1]

# (num_frames, current_agents_num, 9)
all_frame_np_agents_local = build_agents_past_ego_frame_array(
    past_cur_agents_world_8_list=past_cur_agents_world_8_list,
    ego_cur_pose_np=ego_cur_pose_np,
    agents_states_dim=agents_states_dim,
)

# Select neighbors and build the (chosen_agent_num, T, 11) tensor
neighbor_agents_past, agents_cur_frame_indices, neighbors_id = \
    _select_neighbor_agents_and_build_past(
        all_frame_np_agents_local=all_frame_np_agents_local,
        current_agent_types_list=current_agent_types_list,
        agents_states_dim=agents_states_dim,
        max_agent_num=max_agent_num,
        max_pedestrians=max_pedestrians,
        max_bicycles=max_bicycles,
        filter_radius=filter_radius,
    )

# Key step: drop any neighbor whose current frame is all zeros
neighbor_agents_past, agents_cur_frame_indices, neighbors_id = \
    _filter_out_neighbors_not_present_at_current(
        neighbor_agents_past=neighbor_agents_past,              # (K, T, 11)
        agents_cur_frame_indices=agents_cur_frame_indices,      # (K,)
        neighbors_id=neighbors_id,                              # (K,)
    )

# Convert neighbors_id -> track_token
id_to_token: Dict[int, str] = {int(v): k for k, v in token_to_id.items()}

neighbor_track_token: List[str] = []
for track_id in neighbors_id:
    track_id_int: int = int(track_id)
    if track_id_int == -1:
        raise ValueError("Neighbor agent has invalid track_id -1.")
    if track_id_int not in id_to_token:
        raise KeyError(
            f"neighbors_id={track_id_int} is not present in token_to_id. "
            "Check the ID precision (especially the float conversion) and how token_to_id is updated.")
    neighbor_track_token.append(id_to_token[track_id_int])

return neighbor_agents_past, agents_cur_frame_indices, neighbors_id, neighbor_track_token
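
The `KeyError` message above points at float conversion as a possible cause. A short experiment shows why that can happen: `float32` has a 24-bit significand, so integers above 2**24 are no longer all representable. The tokens and IDs below are made up for the demonstration; whether real IDs ever get that large depends on how `token_to_id` is populated, which these notes do not show.

```python
import numpy as np

# Hypothetical tokens and IDs chosen to straddle the float32 exact-integer limit (2**24).
token_to_id = {"tok_a": 16_777_216, "tok_b": 16_777_217}
id_to_token = {v: k for k, v in token_to_id.items()}

ids_f32 = np.array(list(token_to_id.values()), dtype=np.float32)
ids_i64 = np.array(list(token_to_id.values()), dtype=np.int64)

# After the float32 round trip both IDs collapse onto the same integer.
print([int(v) for v in ids_f32])                  # [16777216, 16777216]
# An integer dtype keeps them distinct, so the reverse lookup stays correct.
print([id_to_token[int(v)] for v in ids_i64])     # ['tok_a', 'tok_b']
```

If IDs are small sequential integers this never triggers, but carrying `neighbors_id` in an integer dtype would remove the risk entirely.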

def agent_future_all_process(
    ego_cur_pose_np: np.ndarray,  # shape: (3,) = [x_ego, y_ego, yaw_ego]
    cur_fut_agents_world_8_list: List[np.ndarray],  # length: 1+Tf_all, each element shape: (frame_agents_num_t, 8)
    neighbor_token_id: np.ndarray,  # shape: (chosen_agent_num,)
    neighbor_agents_past: Optional[np.ndarray] = None,  # (chosen_agent_num, Tp, 11) or None
) -> np.ndarray:
"""Build the 11-channel trajectory tensor over the full current + future span for the selected neighbor agents.

Flow:
    1) Keep only the requested agents, matched by track_id, as a per-frame list.
    2) Transform each frame into the ego frame.
    3) Fill a fixed-size (agent, time) tensor, leaving unobserved time steps at 0.
    4) Take the type one-hot from neighbor_agents_past and copy it along the whole time axis.

Args:
    ego_cur_pose_np (np.ndarray):
        - shape: (3,)
        - [x_ego, y_ego, yaw_ego]
    cur_fut_agents_world_8_list (List[np.ndarray]):
        - length: num_frames_all = 1 + Tf_all
        - each element shape: (frame_agents_num_t, 8)
          [track_id, vx, vy, heading, width, length, x, y] (world frame)
    neighbor_token_id (np.ndarray):
        - shape: (chosen_agent_num,)
        - track_id array of the selected neighbor agents.
    neighbor_agents_past (Optional[np.ndarray]):
        - shape: (chosen_agent_num, Tp, 11)
          Past neighbor trajectory tensor. The last 3 channels hold the type one-hot.
          If None, the type one-hot of the future tensor is left at 0.

Returns:
    np.ndarray
        - cur_fut_chosen_agents_full_11:
            · shape: (chosen_agent_num, num_frames_all, 11)
            · [x, y, cos(yaw), sin(yaw), v_x, v_y, width, length, one_hot(3)]
"""
# 1) Keep only the requested agents, matched by track_id
# cur_fut_chosen_agents_world_8_list:
#   length: num_frames_all, each element shape: (frame_save_agents_num_t, 8)
cur_fut_chosen_agents_world_8_list: List[np.ndarray] = _filter_agents_array_w_id(
    cur_fut_agents_world_8_list,
    neighbor_token_id,
)

# 2) Transform each frame into the ego frame
cur_fut_chosen_agents_local_8_list: List[np.ndarray] = []
for frame_chosen_agents_world_8 in cur_fut_chosen_agents_world_8_list:
    # frame_chosen_agents_world_8: (frame_save_agents_num_t, 8)
    # → converted to [track_id, vx, vy, heading, width, length, x, y] in the ego frame
    frame_local_8: np.ndarray = convert_absolute_quantities_to_relative(
        frame_chosen_agents_world_8,
        ego_cur_pose_np,
        'agent',
    )
    cur_fut_chosen_agents_local_8_list.append(frame_local_8)

# 3) Prepare the type one-hot (only when neighbor_agents_past is available)
neighbor_types_one_hot: Optional[np.ndarray] = None
if neighbor_agents_past is not None and neighbor_agents_past.size > 0:
    # neighbor_agents_past: (chosen_agent_num, Tp, 11)
    # The type one-hot occupies channels 8:11 of the last axis
    # neighbor_types_one_hot: (chosen_agent_num, 3)
    neighbor_types_one_hot = neighbor_agents_past[:, -1, 8:11].astype(
        np.float32, copy=False)

# 4) Pad along the time axis and build the full_11 tensor
"""
cur_fut_chosen_agents_full_11: (chosen_agent_num, num_frames_all, 11)
"""
cur_fut_chosen_agents_full_11 = _pad_agent_states_with_zeros_w_id(
    cur_fut_chosen_agents_local_8_list=cur_fut_chosen_agents_local_8_list,
    neighbor_token_id=neighbor_token_id,
    neighbor_types_one_hot=neighbor_types_one_hot,
)

return cur_fut_chosen_agents_full_11
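
Step 3 of the flow (a fixed-size tensor with zeros where an agent is not observed) is delegated to `_pad_agent_states_with_zeros_w_id`, which is not listed here. The sketch below shows the general idea only, under the assumption that column 0 of each per-frame array is the track_id; `pad_by_track_id` is a hypothetical name and it skips the channel reordering and one-hot handling of the real helper.

```python
from typing import List

import numpy as np


def pad_by_track_id(frames: List[np.ndarray], track_ids: np.ndarray) -> np.ndarray:
    """frames[t]: (n_t, D) with column 0 = track_id -> (K, T, D - 1), zeros where unobserved."""
    k, t = len(track_ids), len(frames)
    d = frames[0].shape[1] - 1 if frames else 0
    row_of = {int(tid): i for i, tid in enumerate(track_ids)}  # track_id -> output row
    out = np.zeros((k, t, d), dtype=np.float32)
    for ti, frame in enumerate(frames):
        for row in frame:
            i = row_of.get(int(row[0]))
            if i is not None:                 # agents that were not selected are ignored
                out[i, ti] = row[1:]
    return out


frames = [
    np.array([[7, 1.0, 2.0], [9, 3.0, 4.0]]),   # frame 0: agents 7 and 9 observed
    np.array([[9, 5.0, 6.0]]),                  # frame 1: only agent 9 observed
]
padded = pad_by_track_id(frames, np.array([9, 7]))
print(padded[0].tolist())   # agent 9: [[3.0, 4.0], [5.0, 6.0]]
print(padded[1].tolist())   # agent 7: [[1.0, 2.0], [0.0, 0.0]]
```

The output row order follows `track_ids`, not the order in which agents appear in each frame, which is what keeps the future tensor aligned with `neighbor_agents_past`.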

=================
"""
Module: Coordinate Transformation Functions and Numpy-Tensor Transformation

Description: This module contains functions for transforming coordinates into the
ego-centric frame and for Numpy-Tensor transformation.

Categories:
    1. Ego, agent, static coordinate transformation
    2. Map coordinate transformation
    3. Numpy-Tensor transformation
"""
import math
from collections import defaultdict
from types import SimpleNamespace
from typing import Dict, List, Optional, Sequence, Set, Tuple, Union

import numpy as np
import shapely.geometry as geom
import torch
from shapely import affinity
from shapely.geometry import Point

from nuplan.common.actor_state.ego_state import EgoState
from nuplan.common.actor_state.state_representation import Point2D, StateSE2
from nuplan.common.actor_state.tracked_objects import TrackedObject, TrackedObjects
from nuplan.common.actor_state.tracked_objects_types import TrackedObjectType
from nuplan.common.maps.abstract_map import AbstractMap, MapObject, SemanticMapLayer
from nuplan.common.maps.abstract_map_objects import RoadBlockGraphEdgeMapObject
from nuplan.common.maps.nuplan_map.utils import get_roadblock_ids_from_trajectory
from nuplan.database.nuplan_db.nuplan_db_utils import get_lidarpc_sensor_data
from nuplan.database.nuplan_db.nuplan_scenario_queries import get_end_sensor_time_from_db
from nuplan.planning.scenario_builder.abstract_scenario import AbstractScenario
from nuplan.planning.scenario_builder.nuplan_db.nuplan_scenario import NuPlanScenario
from nuplan.planning.simulation.observation.observation_type import DetectionsTracks
from nuplan.planning.training.preprocessing.utils.agents_preprocessing import EgoInternalIndex, AgentInternalIndex

from diffusion_planner.data_process.roadblock_utils import route_roadblock_correction

utils.py (add at an appropriate location)

from typing import List, Optional, Sequence
import warnings
from types import SimpleNamespace
from nuplan.planning.scenario_builder.nuplan_db.nuplan_scenario import NuPlanScenario
from nuplan.common.actor_state.state_representation import StateSE2
from nuplan.common.actor_state.tracked_objects_types import TrackedObjectType
from nuplan.common.maps.nuplan_map.utils import get_roadblock_ids_from_trajectory

from typing import Dict
import numpy as np
import numpy.typing as npt

def ego_local_traj3_to_global(
    local_traj_xyh: npt.NDArray[np.floating],  # shape: (T, 3) = [x_e, y_e, yaw_e]
    cur_ego_global_xyyaw: npt.NDArray[np.floating],  # shape: (3,) = [x_g, y_g, yaw_g]
    *,
    invalid_eps: float = 0.0,
) -> npt.NDArray[np.float64]:
"""Transform an ego-frame (x, y, heading) sequence into the global (world) frame, dropping invalid (0., 0., 0.) rows and returning only the valid ones.

Args:
    local_traj_xyh (np.ndarray): shape (T, 3). Each row is [x_e, y_e, yaw_e].
    cur_ego_global_xyyaw (np.ndarray): shape (3,). [x_g, y_g, yaw_g].
    invalid_eps (float, optional): tolerance for the invalid-row test.
        - 0.0: only rows exactly equal to (0., 0., 0.) are invalid
        - >0.0: rows with |x_e|, |y_e|, |yaw_e| all within eps are invalid

Returns:
    np.ndarray: shape (T_valid, 3). Each row is [x_g, y_g, yaw_g].
                Returns a (0, 3) array if no row is valid.
"""
if local_traj_xyh.ndim != 2 or local_traj_xyh.shape[1] != 3:
    raise ValueError(
        f"`local_traj_xyh` shape must be (T,3), got {local_traj_xyh.shape}")
if cur_ego_global_xyyaw.shape != (3,):
    raise ValueError(
        f"`cur_ego_global_xyyaw` shape must be (3,), got {cur_ego_global_xyyaw.shape}"
    )

# ── 1) Filter invalid rows: drop rows where (x, y, yaw) are all 0 (or within eps) ──
if invalid_eps <= 0.0:
    invalid_mask = (local_traj_xyh[:, 0] == 0.0) & (
        local_traj_xyh[:, 1] == 0.0) & (local_traj_xyh[:, 2] == 0.0)
else:
    invalid_mask = (
        np.isclose(local_traj_xyh[:, 0], 0.0, atol=invalid_eps) &
        np.isclose(local_traj_xyh[:, 1], 0.0, atol=invalid_eps) &
        np.isclose(local_traj_xyh[:, 2], 0.0, atol=invalid_eps))
valid_mask = ~invalid_mask
if not np.any(valid_mask):
    return np.empty((0, 3), dtype=np.float64)

local_valid = local_traj_xyh[valid_mask]  # (T_valid, 3)

# ── 2) ego→global transform ──────────────────────────────────────────────────
x_e = local_valid[:, 0]
y_e = local_valid[:, 1]
yaw_e = local_valid[:, 2]

# Global reference (ego's current pose)
x_g0, y_g0, yaw_g0 = map(float, cur_ego_global_xyyaw.tolist())
c, s = np.cos(yaw_g0), np.sin(yaw_g0)
x_g = x_e * c - y_e * s + x_g0
y_g = x_e * s + y_e * c + y_g0
yaw_g = yaw_e + yaw_g0
# If needed: yaw_g = (yaw_g + np.pi) % (2 * np.pi) - np.pi  # normalize to [-pi, pi)

return np.stack([x_g, y_g, yaw_g], axis=1).astype(np.float64)
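
A quick way to sanity-check the rotation signs in step 2 is a round trip: transform local → global, then apply the inverse and compare. The sketch below re-implements only the transform (without the invalid-row filter), and the names `wrap`, `local_to_global` and `global_to_local` are introduced here purely for illustration. Unlike the function above, it also wraps the yaw into [-pi, pi).

```python
import numpy as np


def wrap(yaw: np.ndarray) -> np.ndarray:
    """Wrap angles into [-pi, pi)."""
    return (yaw + np.pi) % (2 * np.pi) - np.pi


def local_to_global(local_xyh: np.ndarray, ego_xyyaw: np.ndarray) -> np.ndarray:
    """Rotate by +yaw0, then translate by the ego position."""
    x0, y0, yaw0 = ego_xyyaw
    c, s = np.cos(yaw0), np.sin(yaw0)
    x = local_xyh[:, 0] * c - local_xyh[:, 1] * s + x0
    y = local_xyh[:, 0] * s + local_xyh[:, 1] * c + y0
    return np.stack([x, y, wrap(local_xyh[:, 2] + yaw0)], axis=1)


def global_to_local(global_xyh: np.ndarray, ego_xyyaw: np.ndarray) -> np.ndarray:
    """Inverse transform: translate by -ego position, then rotate by -yaw0."""
    x0, y0, yaw0 = ego_xyyaw
    c, s = np.cos(yaw0), np.sin(yaw0)
    dx, dy = global_xyh[:, 0] - x0, global_xyh[:, 1] - y0
    return np.stack(
        [dx * c + dy * s, -dx * s + dy * c, wrap(global_xyh[:, 2] - yaw0)], axis=1)


ego = np.array([100.0, 50.0, np.pi / 2])            # ego faces +y in the world frame
local = np.array([[1.0, 0.0, 0.0], [0.0, 2.0, 0.1]])
world = local_to_global(local, ego)
print(np.round(world[:, :2], 6).tolist())           # [[100.0, 51.0], [98.0, 50.0]]
print(bool(np.allclose(global_to_local(world, ego), local)))   # True
```

One consequence of the invalid-row rule in the original function: a genuine pose that is exactly (0, 0, 0) in the ego frame, such as ego's own current pose, is indistinguishable from padding and is dropped as well.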

def get_npc_route_roadblock_ids(
    scenario: NuPlanScenario,
    past_cur_tracked_objects: List[TrackedObjects],
    neighbor_track_token: Optional[List[str]],  # length = chosen_agent_num
    horizon: float = 20.0,
) -> Dict[str, List[str]]:
"""Collect trajectories in one pass with get_future_tracked_objects, then extract
connectivity-based roadblock ID sequences with get_roadblock_ids_from_trajectory.
"""

# Fetch the future tracked objects from iteration=0 over `horizon` seconds in a single call
# (alternative, currently disabled: use the scenario's total length in seconds as the horizon)
# horizon = max(30.0,  _scenario_total_horizon_s(scenario))
num_samples = int(round(horizon / 0.1))  # 0.1 s step; round() avoids float truncation (e.g. 0.3 / 0.1)
# 1) Collect a list of StateSE2 per agent
if neighbor_track_token is None:
    allow_all_token = True
    neighbor_track_token_set = set()
else:
    allow_all_token = False
    neighbor_track_token_set = {str(t) for t in neighbor_track_token}
    if not neighbor_track_token_set:
        return {}

future_observations: List[TrackedObjects] = []
for dets in scenario.get_future_tracked_objects(0, horizon, num_samples):
    future_observations.append(dets.tracked_objects)
past_future_observations: List[TrackedObjects] = []
past_future_observations.extend(past_cur_tracked_objects)
past_future_observations.extend(future_observations)

token_to_state_list: Dict[str, List[SimpleNamespace]] = defaultdict(list)
first_pose: Dict[str, SimpleNamespace] = {}
for tracked_objects in past_future_observations:
    # tracked_objects: TrackedObjects
    for obj in tracked_objects:
        # obj: TrackedObject
        if obj.tracked_object_type != TrackedObjectType.VEHICLE:
            continue
        token = str(obj.track_token)
        if (not allow_all_token) and (token not in neighbor_track_token_set):
            continue
        # The heading is not actually used downstream, but filling it in is harmless
        # (either 0.0 or obj.center.heading would work here)
        rear_axle_state = StateSE2(obj.center.x, obj.center.y,
                                   obj.center.heading)
        pseudo_ego = SimpleNamespace(rear_axle=rear_axle_state)
        if first_pose.get(token, None) is None:
            first_pose[token] = pseudo_ego

        token_to_state_list[token].append(pseudo_ego)

# 2) Extract connectivity-based roadblock IDs
car_token_to_rb_ids_list: Dict[str, List[str]] = {}
for token, states_list in token_to_state_list.items():
    if not states_list:
        raise RuntimeError(
            f"Internal error: token_to_state_list[{token}] is empty.")
    # Duck typing: only states_list[*].rear_axle.point is accessed
    rb_ids_list: List[str] = get_roadblock_ids_from_trajectory(
        scenario.map_api, states_list)
    if len(rb_ids_list) == 0:
        car_token_to_rb_ids_list[token] = []
        continue
    corrected_ids = route_roadblock_correction(
        first_pose[token],
        scenario.map_api,
        rb_ids_list,
    )

    car_token_to_rb_ids_list[token] = corrected_ids
return car_token_to_rb_ids_list

def _prefer_rr_on_conflict(
    rb_ids: Sequence[str],
    rbc_ids: Sequence[str],
    route_rr_ids: Optional[Sequence[str]] = None,
    verbose: bool = False,
) -> List[str]:
"""Choose between RB and RBC by a priority rule when both are detected at once.

Rules:
  1) If either one intersects route_rr_ids, prefer that intersection
     - RBC ∩ route  → returned first
     - RB  ∩ route  → second choice
  2) Otherwise prefer RBC (assumed to be an intersection/connector)
  3) If RBC is empty, fall back to RB
  4) If RB is empty too, return []

Args:
    rb_ids: list of RoadBlock ids
    rbc_ids: list of RoadBlock-Connector ids
    route_rr_ids: roadblock ids of the scenario's global route
    verbose: whether to emit warning messages

Returns:
    The id list chosen by the priority rule (may be empty)
"""
rb_ids = list(rb_ids) if rb_ids else []
rbc_ids = list(rbc_ids) if rbc_ids else []

if route_rr_ids:
    route_set = set(route_rr_ids)
    rbc_on_route = [rid for rid in rbc_ids if rid in route_set]
    if rbc_on_route:
        if verbose:
            warnings.warn(f"[RR-Resolve] picked RBC∩Route: {rbc_on_route}")
        return rbc_on_route

    rb_on_route = [rid for rid in rb_ids if rid in route_set]
    if rb_on_route:
        if verbose:
            warnings.warn(f"[RR-Resolve] picked RB∩Route: {rb_on_route}")
        return rb_on_route

if rbc_ids:
    if verbose:
        warnings.warn(f"[RR-Resolve] no overlap with route → preferring RBC: {rbc_ids}")
    return rbc_ids

if rb_ids:
    if verbose:
        warnings.warn(f"[RR-Resolve] no RBC → using RB: {rb_ids}")
    return rb_ids

if verbose:
    warnings.warn("[RR-Resolve] nothing found → []")
return []
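
The four rules are easiest to verify with a few concrete inputs. The snippet below is a compact, warning-free restatement of the same priority order, written only for illustration; `resolve` and the id strings are made-up names, not part of the original module.

```python
from typing import List, Optional, Sequence


def resolve(rb_ids: Sequence[str], rbc_ids: Sequence[str],
            route_ids: Optional[Sequence[str]] = None) -> List[str]:
    """Same priority order as the rules above: RBC∩route, RB∩route, RBC, RB, []."""
    route = set(route_ids or [])
    for candidates in (rbc_ids, rb_ids):          # RBC is always tried before RB
        on_route = [rid for rid in candidates if rid in route]
        if on_route:
            return on_route
    return list(rbc_ids) or list(rb_ids)


print(resolve(["rb1"], ["c1", "c2"], ["c2", "rb1"]))   # ['c2']   rule 1, RBC on route
print(resolve(["rb1"], ["c1"], ["rb1"]))               # ['rb1']  rule 1, RB on route
print(resolve(["rb1"], ["c1"], ["zzz"]))               # ['c1']   rule 2
print(resolve(["rb1"], [], None))                      # ['rb1']  rule 3
print(resolve([], []))                                 # []       rule 4
```

Note that under rule 1 only the ids that lie on the route are returned, not the full candidate list.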

def _map_object_to_geometry(obj: MapObject) -> Optional[geom.base.BaseGeometry]:
"""Small helper that extracts the geometry (point/line/polygon) of a MapObject.

It converts a map object into a geometry that Shapely understands, so that
callers such as `get_directional_proximal_map_objects` can check whether
the object overlaps a region they have built.

Behavior
--------
1) First check whether `obj.polygon` exists.
   - E.g. roadblocks, intersections and crosswalks are usually provided as polygons.
   - If present, return it unchanged.
     · example return geometry: Polygon

2) If there is no polygon, check for a lane-like "centerline polyline".
   - When `obj.baseline_path.discrete_path` exists:
     · it holds an ordered sequence of (x, y, heading) points.
       - length: N
       - conceptually an (N, 2) array once converted to coordinates
     · build a Shapely geometry from these points and return it.
       - N >= 2: a LineString,
       - N == 1: a Point.

3) If neither applies, the shape cannot be determined from public attributes,
   so return `None`.

Args:
    obj (MapObject):
        - Any kind of map object taken from a NuPlan map.
        - E.g. lane, lane connector, roadblock, intersection.

Returns:
    Optional[geom.base.BaseGeometry]:
        - A Shapely geometry such as Polygon / LineString / Point.
        - `None` when no geometry can be built.
"""
# 1) Types that carry a polygon (e.g. Lane, RoadBlock, Connector)
polygon = getattr(obj, "polygon", None)
if polygon is not None:
    return polygon

# 2) Lane-like objects: baseline_path → LineString
baseline_path = getattr(obj, "baseline_path", None)
if baseline_path is not None and hasattr(baseline_path, "discrete_path"):
    # pts: length = N, each element = (x, y)  → conceptual shape: (N, 2)
    pts = [(n.x, n.y) for n in baseline_path.discrete_path]
    if len(pts) >= 2:
        return geom.LineString(pts)
    elif len(pts) == 1:
        return geom.Point(pts[0])

# 3) Anything else (geometry not obtainable from public attributes)
return None

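def get_directional_proximal_map_objects(
    map_api: AbstractMap,
    point: Point2D,
    heading: float,
    radius: float,
    layers: List[SemanticMapLayer],
) -> Dict[SemanticMapLayer, List[MapObject]]:
"""Query the map objects that touch a square rotated to align with the ego heading.

This function is similar to NuPlanMap's :meth:`get_proximal_map_objects`,
but the patch shape differs.

- Original: an **axis-aligned square** centered on point
    · [x - radius, x + radius] × [y - radius, y + radius]
- This function: a **square rotated to the ego heading**, centered on point
    · side length: 2 * radius
    · one pair of sides parallel to the ego heading, the other perpendicular to it

Inclusion criterion
-------------------
For every object in each layer, check whether the object's geometry
`intersects` the rotated square.

Shapely's `intersects` is True whenever the two geometries share
**at least one point**, so all of the following cases are included.

* The object lies entirely inside the square
* The object slightly overlaps an edge or corner of the square
* The object is almost entirely outside, but a vertex or edge touches the square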

Internal flow
-------------
1) Check supported layers
    - Fetch the layers actually supported via `map_api.get_available_map_objects()`.
    - If any requested layer in `layers` is unsupported, fail immediately with an assert.

2) Build the square patch before rotation (axis-aligned)
    - x range: [point.x - radius, point.x + radius]
    - y range: [point.y - radius, point.y + radius]
    - Build the Polygon from this range with Shapely's `geom.box(...)`.
      · `patch`: Polygon, an axis-aligned square
      · `geom.box` yields a rectangle in general; here it is always a square
        - side length = 2 * radius

3) Rotate the square to the ego heading
    - Convert heading (radians) to degrees: `angle_deg = heading * 180 / π`
    - Call `affinity.rotate(patch, angle_deg, origin=(point.x, point.y))`
        · Passing the ego position as origin makes the square
          rotate about the ego position.
    - Result:
        · `rotated_patch`: Polygon
        · one pair of sides parallel to the ego heading,
          the other pair exactly perpendicular to it

4) Test geometry ∩ rotated_patch per layer
    - From here on the code depends on the concrete NuPlanMap implementation,
      so `map_api` must be a `NuPlanMap`.
    - For each layer:
        a) Fetch the layer's vector data with
           `layer_df = map_api._get_vector_map_layer(layer)`.
           · `layer_df["geometry"]`: geometry of each row (Polygon etc.), shape ≈ (num_objects,)
        b) Compute whether each geometry overlaps the rotated square with
           `mask = layer_df["geometry"].intersects(rotated_patch)`.
           · `mask`: pandas Series(bool), shape: (num_objects,)
               - True  → shares at least one point with the square
               - False → no overlap at all
        c) Extract the ids of the overlapping rows with
           `map_object_ids = layer_df.loc[mask]["fid"]`.
        d) Obtain the actual `MapObject` instances with
           `map_api.get_map_object(fid, layer)`.
           The resulting objects are collected in a list and stored in `object_map[layer]`.

Data structures / shapes
------------------------
Inputs
  - map_api (AbstractMap):
      · Must actually be a NuPlanMap instance.
      · Otherwise a TypeError is raised.

  - point (Point2D):
      · ego position (x, y), in m

  - heading (float):
      · ego heading (radians)

  - radius (float):
      · half the side length of the rotated square [m]
      · full size of the square = (2 * radius) × (2 * radius)

  - layers (List[SemanticMapLayer]):
      · e.g. [SemanticMapLayer.LANE, SemanticMapLayer.ROADBLOCK]

Intermediate variables
  - patch: geom.Polygon
      · axis-aligned square
      · coordinate range: x ∈ [x-radius, x+radius], y ∈ [y-radius, y+radius]

  - rotated_patch: geom.Polygon
      · square rotated according to the ego heading
      · same size as patch, with its vertices rotated about (point.x, point.y)

  - layer_df: VectorLayer (in practice a GeoDataFrame)
      · vector data of each layer
      · `layer_df["geometry"]`: geometry of each object, length ≈ num_objects

  - mask: pandas.Series[bool]
      · shape: (num_objects,)
      · True entries mark objects that intersect rotated_patch

Output
  - object_map: Dict[SemanticMapLayer, List[MapObject]]
      · key: each layer passed in
      · value: list of MapObjects in that layer that overlap the rotated square at all
      · length of each list = number of objects in that layer that satisfy the condition

Args:
    map_api (AbstractMap):
        Must be a NuPlanMap instance (internal vector layers are accessed).
    point (Point2D):
        ego position (x, y) used as the center of the square.
    heading (float):
        ego heading (radians).
    radius (float):
        half the side length of the square [m].
    layers (List[SemanticMapLayer]):
        layers to query.

Returns:
    Dict[SemanticMapLayer, List[MapObject]]:
        Per-layer dictionary of the MapObjects that overlap the rotated square at all.

Raises:
    TypeError:
        - when map_api is not a NuPlanMap.
    AssertionError:
        - when a requested layer is not supported by the current map.
"""

# 1) Check supported layers
supported_layers: List[
    SemanticMapLayer] = map_api.get_available_map_objects()
unsupported_layers: List[SemanticMapLayer] = [
    layer for layer in layers if layer not in supported_layers
]
assert len(unsupported_layers) == 0, (
    f"Object representation for layer(s): {unsupported_layers} is unavailable"
)

# 2) Build the axis-aligned square before rotation
x_min, x_max = point.x - radius, point.x + radius
y_min, y_max = point.y - radius, point.y + radius
patch: geom.Polygon = geom.box(x_min, y_min, x_max, y_max)

# 3) Rotate the square to the ego heading (Shapely expects degrees by default)
angle_deg: float = float(heading) * 180.0 / np.pi
rotated_patch: geom.Polygon = affinity.rotate(
    patch,
    angle_deg,
    origin=(point.x, point.y),
)

object_map: Dict[SemanticMapLayer, List[MapObject]] = defaultdict(list)

# The loop below relies on NuPlanMap internals, so reject other map types up front.
# (Duck-typed check; the docstring promises a TypeError in this case.)
if not hasattr(map_api, "_get_vector_map_layer"):
    raise TypeError(
        "map_api must be a NuPlanMap instance (it must provide _get_vector_map_layer)")

# 4) From each layer, keep only the objects that intersect rotated_patch
for layer in layers:
    # VectorLayer behaves much like a GeoDataFrame.
    layer_df = map_api._get_vector_map_layer(layer)

    # Vectorized overlap test between the geometry column and the rotated square
    # mask: (num_objects,) bool
    mask = layer_df["geometry"].intersects(rotated_patch)

    # Fetch the fid of every row where mask is True.
    map_object_ids = layer_df.loc[mask]["fid"]

    # Build the actual MapObject instances from the fids
    object_map[layer] = [
        map_api.get_map_object(map_object_id, layer)
        for map_object_id in map_object_ids
    ]

return object_map
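
To see how the rotated patch differs from the axis-aligned one, it helps to test a few points by hand. The sketch below does this in plain NumPy by projecting each point onto the heading direction and its perpendicular. It checks point containment only, whereas the function above uses Shapely's `intersects` on full geometries; `in_rotated_square` is a hypothetical helper introduced for this illustration.

```python
import numpy as np


def in_rotated_square(pts: np.ndarray, center: np.ndarray,
                      heading: float, radius: float) -> np.ndarray:
    """pts: (N, 2) world points -> (N,) bool, True inside the heading-aligned square."""
    d = pts - center
    c, s = np.cos(heading), np.sin(heading)
    along = d[:, 0] * c + d[:, 1] * s      # component along the heading
    across = -d[:, 0] * s + d[:, 1] * c    # component perpendicular to the heading
    return (np.abs(along) <= radius) & (np.abs(across) <= radius)


pts = np.array([[1.3, 0.0], [0.9, 0.9], [1.0, 1.0]])
center = np.zeros(2)
print(in_rotated_square(pts, center, 0.0, 1.0).tolist())        # [False, True, True]
print(in_rotated_square(pts, center, np.pi / 4, 1.0).tolist())  # [True, False, False]
```

With a 45° heading the square's corners point along the world axes, so a point at (1.3, 0) falls inside while the former corner region around (1, 1) falls outside. The rotated patch therefore selects a different set of map objects than `get_proximal_map_objects` for the same `radius`.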

def get_circular_proximal_map_objects(
    map_api: AbstractMap,
    point: Point2D,
    radius: float,
    layers: List[SemanticMapLayer],
) -> Dict[SemanticMapLayer, List[MapObject]]:
"""Collect, per layer, the map objects that overlap a circle of the given radius at all.

This function is similar to NuPlanMap's :meth:`get_proximal_map_objects`,
but it searches with a **circular region** instead of an **axis-aligned square**.

- Original: a square region centered on point
    · x ∈ [point.x - radius, point.x + radius]
    · y ∈ [point.y - radius, point.y + radius]
- This function: a **circle (radius `radius`)** centered on point

Inclusion criterion
-------------------
For every object in each layer, check whether the object's geometry
`intersects` this circle (Shapely's `intersects`).

- `intersects(...)` is True when:
    · the object lies entirely inside the circle
    · the object slightly overlaps the circle boundary
    · most of the object is outside, but a corner or edge touches the circle
  All of these cases are **included**.

Internal steps
--------------
1) Check the input map type
    - Confirm that `map_api` is actually a NuPlanMap instance.
      (required, because the internal vector layers are accessed)

2) Validate the supported layers
    - `map_api.get_available_map_objects()` returns the layers the current map supports.
    - If any requested entry of `layers` is unsupported, fail immediately via `assert`.

3) Build the circle (disc) geometry
    - Center: `center = geom.Point(point.x, point.y)`
    - Circle: `patch = center.buffer(radius)`
      · `patch` is a Shapely Polygon that approximates the circle with a polygon.
      · Conceptually it is "a circle of radius `radius`".

4) Per-layer geometry ∩ circle test
    - For each layer:
        a) `layer_df = map_api._get_vector_map_layer(layer)`
           · Fetch NuPlan's internal vector layer (a GeoDataFrame-like structure)
           · The `layer_df["geometry"]` column holds each row's geometry (Polygon, etc.)
             - shape: (num_objects,)
        b) `mask = layer_df["geometry"].intersects(patch)`
           · `mask`: bool Series of shape (num_objects,)
           · True  → the geometry shares at least one point with the circle
           · False → no overlap at all
        c) `map_object_ids = layer_df.loc[mask]["fid"]`
           · Extract only the ids (fid) of the objects that overlap the circle
        d) For each id, call `map_api.get_map_object(fid, layer)` to build the
           actual `MapObject` instance and append it to a list.
    - The resulting list is stored in `object_map[layer]`.

Data structures / shapes
------------------------
Inputs
  - map_api (AbstractMap):
      · Actual type: NuPlanMap (otherwise TypeError)
  - point (Point2D):
      · Ego position (x, y), in meters
  - radius (float):
      · Radius of the circle [m]
  - layers (List[SemanticMapLayer]):
      · e.g. [SemanticMapLayer.LANE, SemanticMapLayer.ROADBLOCK]

Intermediate variables
  - center: shapely.geometry.Point
      · (point.x, point.y)
  - patch: shapely.geometry.Polygon
      · Circular region built with `center.buffer(radius)`
  - layer_df: VectorLayer (GeoDataFrame-like)
      · `layer_df["geometry"]`: length = num_objects
  - mask: pandas.Series(bool)
      · shape: (num_objects,)
      · Only the True indices are objects that overlap the circle

Output
  - object_map: Dict[SemanticMapLayer, List[MapObject]]
      · key: each layer passed in
      · value: map objects of that layer that overlap the circle at all

Args:
    map_api (AbstractMap):
        Must be a NuPlanMap instance. (internal vector layers are accessed)
    point (Point2D):
        Center of the circle (x, y).
    radius (float):
        Radius of the circle [m].
    layers (List[SemanticMapLayer]):
        Layers to query.

Returns:
    Dict[SemanticMapLayer, List[MapObject]]:
        Dictionary that collects, per layer, the MapObjects sharing at least one point with the circle.

Raises:
    TypeError:
        - If map_api is not a NuPlanMap.
    AssertionError:
        - If any requested layer is not supported.
"""
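# 0) Type check promised by the docstring: the private vector layers used below exist only on NuPlanMap
if not isinstance(map_api, NuPlanMap):
    raise TypeError(
        f"map_api must be a NuPlanMap, got {type(map_api).__name__}")

# 1) Check that the requested layers are actually supported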
supported_layers: List[
    SemanticMapLayer] = map_api.get_available_map_objects()
unsupported_layers: List[SemanticMapLayer] = [
    layer for layer in layers if layer not in supported_layers
]
assert len(unsupported_layers) == 0, (
    f"Object representation for layer(s): {unsupported_layers} is unavailable"
)

# 2) Build the circular patch: center = point, radius = radius
center: geom.Point = geom.Point(point.x, point.y)
patch: geom.Polygon = center.buffer(radius)

object_map: Dict[SemanticMapLayer, List[MapObject]] = defaultdict(list)

# 3) In each layer, keep only the geometries that intersect the circle
for layer in layers:
    layer_df = map_api._get_vector_map_layer(layer)

    # Select only the rows whose geometry shares at least one point with the circle (patch)
    # mask: (num_objects,) bool
    mask = layer_df["geometry"].intersects(patch)
    map_object_ids = layer_df.loc[mask]["fid"]

    object_map[layer] = [
        map_api.get_map_object(map_object_id, layer)
        for map_object_id in map_object_ids
    ]

return object_map
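
To see how the circular query differs from the square one, here is a small standalone sketch with shapely only. The toy segment stands in for a lane geometry and is an assumption for illustration; no nuPlan map is involved.

```python
from shapely import geometry as geom

radius = 10.0
center = geom.Point(0.0, 0.0)

circle = center.buffer(radius)  # polygon approximating the disc
square = geom.box(-radius, -radius, radius, radius)  # axis-aligned square

# A short lane-like segment near the corner of the square.
# Its closest point to the center is (9, 9), about 12.7 m away.
corner_segment = geom.LineString([(9.0, 9.0), (9.5, 9.5)])

in_square = square.intersects(corner_segment)
in_circle = circle.intersects(corner_segment)
print(in_square, in_circle)
```

Objects that sit only in the corners of the square are picked up by the square query but dropped by the circular one, which is the practical difference between the two functions.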

def build_agent_route_lane_order(
    npc_route_on_chosen_lane_idx_list: List[List[int]],  # length: chosen_agent_num, each element has variable length
    chosen_lane_num: int,
    dtype: np.dtype = np.int32,
) -> np.ndarray:  # shape: (chosen_agent_num, chosen_lane_num)
"""Convert each agent's "list of on-route lane indices" into an integer rank matrix.

Concept
-------
- If npc_route_on_chosen_lane_idx_list[i] is [5, 2, 7],
  then from the viewpoint of agent i:
    · lane 5 → the 0th nearest route lane
    · lane 2 → the 1st
    · lane 7 → the 2nd
- The [i, j] element of the returned matrix therefore means
  "the position of lane j on the route, as seen by agent i".
  · 0, 1, 2, ... : that position
  · -1           : lane is not on this agent's route

Steps
-----
1) Fill the result matrix with -1. shape = (chosen_agent_num, chosen_lane_num).
2) For each agent i:
   - Convert npc_route_on_chosen_lane_idx_list[i] to an array.
   - Raise an exception if any index is negative or out of range.
   - For duplicated lane indices, only the first occurrence counts.
     (implemented with np.unique + first_indices + argsort)
   - Order the valid lane indices by first appearance and
     assign the ranks 0, 1, 2, ... in one step.

Args:
    npc_route_on_chosen_lane_idx_list:
        - length: chosen_agent_num
        - each element is that agent's "list of on-route lane indices".
    chosen_lane_num:
        - total number of lanes (size of the matrix's second dimension).
    dtype:
        - integer dtype of the returned matrix. Default np.int32.

Returns:
    np.ndarray:
        - agent_route_lane_order
        - shape = (chosen_agent_num, chosen_lane_num)
        - each [i, j] element is the position (0, 1, 2, ...) of lane j for agent i, or -1.
chosen_agent_num: int = len(npc_route_on_chosen_lane_idx_list)

# Initialize the result matrix: (chosen_agent_num, chosen_lane_num)
agent_route_lane_order: np.ndarray = np.full(
    (chosen_agent_num, chosen_lane_num),
    -1,
    dtype=dtype,
)

# Return immediately when there are no agents or no lanes
if chosen_agent_num == 0 or chosen_lane_num == 0:
    return agent_route_lane_order

for agent_i, route_on_chosen_lane_idx in enumerate(
        npc_route_on_chosen_lane_idx_list):
    # This agent has no lanes on its route
    if not route_on_chosen_lane_idx:
        continue

    # lane_idx_list: (L,)
    lane_idx_list: np.ndarray = np.asarray(route_on_chosen_lane_idx,
                                           dtype=int)

    # Range check (vectorized)
    if np.any(lane_idx_list < 0):
        bad_idx: int = int(lane_idx_list[lane_idx_list < 0][0])
        raise ValueError(f"Negative lane index found: {bad_idx}")
    if np.any(lane_idx_list >= chosen_lane_num):
        bad_idx = int(lane_idx_list[lane_idx_list >= chosen_lane_num][0])
        raise ValueError(
            f"lane_idx {bad_idx} is out of range for chosen_lane_num={chosen_lane_num}."
        )

    # Deduplicate (keep first-occurrence order)
    # unique_vals: (M,), first_indices: (M,)
    unique_vals, first_indices = np.unique(lane_idx_list, return_index=True)
    # Re-sort into the original order of appearance
    order: np.ndarray = np.argsort(first_indices)  # shape: (M,)
    unique_lane_idx: np.ndarray = unique_vals[order]  # shape: (M,)

    # Rank vector: [0, 1, 2, ...], shape: (M,)
    ranks: np.ndarray = np.arange(unique_lane_idx.shape[0], dtype=dtype)

    # Assign all ranks at once
    agent_route_lane_order[agent_i, unique_lane_idx] = ranks

return agent_route_lane_order
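
The first-occurrence deduplication used above is the least obvious step, so here is a worked example for a single agent. It is a standalone sketch with made-up lane indices; only numpy is needed.

```python
import numpy as np

route = np.asarray([5, 2, 5, 7, 2])  # on-route lane indices, nearest first, with repeats
num_lanes = 8

# np.unique returns the values sorted; return_index gives the position
# where each value first appears in `route`.
unique_vals, first_idx = np.unique(route, return_index=True)

# Sorting by first position restores the original order of appearance.
ordered = unique_vals[np.argsort(first_idx)]

row = np.full(num_lanes, -1, dtype=np.int32)
row[ordered] = np.arange(ordered.shape[0], dtype=np.int32)

print(ordered.tolist())  # [5, 2, 7]
print(row.tolist())      # [-1, -1, 1, -1, -1, 0, -1, 2]
```

Lane 5 gets rank 0, lane 2 rank 1, and lane 7 rank 2. Every other lane stays at -1.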

def _lane_min_dist_order(
    lanes_xy: np.ndarray,  # shape: (chosen_lane_num, lane_len, 2)
    neighbor_current_xy: np.ndarray,  # shape: (2,)
) -> np.ndarray:  # shape: (chosen_lane_num,)
"""Compute the minimum distance from the agent position to each lane and return the lane indices sorted nearest first.

The distance to a lane is the minimum over its polyline points, not over the
segments between them.

Args:
    lanes_xy:
        Lane polyline coordinates.
        shape = (chosen_lane_num, lane_len, 2).
    neighbor_current_xy:
        Current agent position [x, y].
        shape = (2,).

Returns:
    np.ndarray:
        Lane indices sorted by increasing distance from the agent.
        shape = (chosen_lane_num,).
"""
# diff: (chosen_lane_num, lane_len, 2)
diff: np.ndarray = lanes_xy - neighbor_current_xy[None, None, :]
# dists: (chosen_lane_num, lane_len)
dists: np.ndarray = np.linalg.norm(diff, axis=-1)
# min_dists: (chosen_lane_num,)
min_dists: np.ndarray = np.min(dists, axis=1)
# Return the nearest-first sort indices
return np.argsort(min_dists)
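
As a quick check of the broadcasting in `_lane_min_dist_order`, the sketch below runs the same three operations on three toy lanes. The coordinates are invented for illustration.

```python
import numpy as np

# Three toy lanes with two points each: shape (3, 2, 2).
lanes_xy = np.array([
    [[10.0, 0.0], [20.0, 0.0]],  # lane 0: nearest point is 10 m away
    [[0.0, 1.0], [0.0, 5.0]],    # lane 1: nearest point is 1 m away
    [[-3.0, 4.0], [-6.0, 8.0]],  # lane 2: nearest point is 5 m away
])
agent_xy = np.array([0.0, 0.0])

diff = lanes_xy - agent_xy[None, None, :]   # (3, 2, 2)
dists = np.linalg.norm(diff, axis=-1)       # (3, 2)
min_dists = dists.min(axis=1)               # (3,)
order = np.argsort(min_dists)               # nearest lane first

print(min_dists.tolist())  # [10.0, 1.0, 5.0]
print(order.tolist())      # [1, 2, 0]
```

Because only the sampled points are compared, a sparsely sampled polyline can look farther away than it really is. Denser sampling or a point-to-segment distance would tighten this, at extra cost.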

def _select_lanes_by_order(
    chosen_lane_dist_order: np.ndarray,  # shape: (chosen_lane_num,)
    chosen_lanes_route_mask_arr: np.ndarray,  # shape: (chosen_lane_num,)
) -> List[int]:
"""Use the distance ordering and a True/False mask to pick only the on-route lane indices, nearest first.

How it works
------------
- `chosen_lane_dist_order` is an index array sorted from the nearest lane to the farthest.
- Each index is looked up in the same-length `chosen_lanes_route_mask_arr`, and
  the indices whose mask entry is True are collected into the returned list.
- There is no cap on the count: every True lane is used.

Args:
    chosen_lane_dist_order:
        Lane indices sorted by minimum distance to the agent.
        shape = (chosen_lane_num,).
    chosen_lanes_route_mask_arr:
        Mask telling whether each lane is on the route (True).
        shape = (chosen_lane_num,).

Returns:
    List[int]:
        On-route lane indices.
        · Sorted nearest first.
"""
route_on_chosen_lane_idx: List[int] = []
for lane_idx in chosen_lane_dist_order:
    if chosen_lanes_route_mask_arr[lane_idx]:
        route_on_chosen_lane_idx.append(int(lane_idx))

return route_on_chosen_lane_idx
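
The loop in `_select_lanes_by_order` can also be written as a single indexing expression. The sketch below is an equivalent vectorized form, not the author's code; the name `select_lanes_by_order_vectorized` is hypothetical.

```python
from typing import List

import numpy as np


def select_lanes_by_order_vectorized(order: np.ndarray,
                                     mask: np.ndarray) -> List[int]:
    """Keep the entries of `order` whose lane is flagged True in `mask`."""
    # mask[order] reorders the mask into nearest-first order;
    # using it to index `order` keeps the on-route lanes in that same order.
    return order[mask[order]].tolist()


order = np.array([1, 2, 0, 3])                # nearest-first lane indices
mask = np.array([True, False, True, False])   # lanes 0 and 2 are on the route

picked = select_lanes_by_order_vectorized(order, mask)
print(picked)  # [2, 0]
```

Lane 2 comes before lane 0 because it is nearer to the agent, even though its index is larger.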

def _select_token_and_ordered_npc_route_indices(
    car_token_to_chosen_lanes_route_mask: Dict[str, List[bool]],  # length: chosen_car_num / each value: length = chosen_lane_num
    neighbor_track_token: List[Optional[str]],  # length: chosen_agent_num
    neighbor_agents_current: np.ndarray,  # shape: (chosen_agent_num, 11)
    vector_map_lanes: np.ndarray,  # shape: (chosen_lane_num, lane_len, D)
) -> np.ndarray:  # shape: (chosen_agent_num, chosen_lane_num)
"""Use the per-token route-lane masks to build the per-agent "on-route lane order" matrix.

Concept
-------
- The input gives, for each vehicle token, a list of
  `[True if the lane is on that vehicle's route, else False]`.
- Each agent slot has a track_token, so the True/False mask
  belonging to that token can be looked up.
- For each agent:
    1) Compute the minimum distance between the agent position and each lane
       polyline (lanes_xy), and sort the lane indices nearest first. (`_lane_min_dist_order`)
    2) In that order, keep only the True lanes to form the on-route lane index list.
       (`_select_lanes_by_order`)
    3) Gather these lists over all agents and build the rank matrix with
       `build_agent_route_lane_order`.

Caveats
-------
- If the token is missing from `car_token_to_chosen_lanes_route_mask` (this
  includes `None` slots), the agent is treated as having no on-route lanes and
  gets an empty list (every lane = -1).

Args:
    car_token_to_chosen_lanes_route_mask:
        - vehicle token → True/False list of length chosen_lane_num.
    neighbor_track_token:
        - length chosen_agent_num.
        - track_token of the agent in each slot.
    neighbor_agents_current:
        - state of the neighbor agents in the current frame.
        - shape = (chosen_agent_num, 11).
        - only the position x, y is used here ([:, 0], [:, 1]).
    vector_map_lanes:
        - lane vectors (coordinates + other features).
        - shape = (chosen_lane_num, lane_len, D).
        - only the coordinate part [:, :, :2] is used here.

Returns:
    np.ndarray:
        - agent_route_lane_order:
            per-agent route lane order matrix.
            shape = (chosen_agent_num, chosen_lane_num), dtype = int64.
"""
chosen_agent_num: int = int(neighbor_agents_current.shape[0])
chosen_lane_num: int = int(vector_map_lanes.shape[0])

# lanes_xy: (chosen_lane_num, lane_len, 2)
lanes_xy: np.ndarray = vector_map_lanes[:, :, :2]

# npc_route_on_chosen_lane_idx_list:
#   length = chosen_agent_num, each element: list of selected lane indices
npc_route_on_chosen_lane_idx_list: List[List[int]] = []

for agent_idx in range(chosen_agent_num):
    token: str = neighbor_track_token[agent_idx]
    chosen_lanes_route_mask: List[
        bool] = car_token_to_chosen_lanes_route_mask.get(
            token,
            [False] * chosen_lane_num,
        )

    # chosen_lanes_route_mask_arr: (chosen_lane_num,)
    chosen_lanes_route_mask_arr: np.ndarray = np.asarray(
        chosen_lanes_route_mask,
        dtype=bool,
    )

    # If no lane lies on this agent's route, use an empty list
    if chosen_lanes_route_mask_arr.sum() == 0:
        npc_route_on_chosen_lane_idx_list.append([])
        continue

    # neighbor_current_xy: (2,)
    neighbor_current_xy: np.ndarray = neighbor_agents_current[agent_idx, :2]

    # chosen_lane_dist_order: (chosen_lane_num,)
    chosen_lane_dist_order: np.ndarray = _lane_min_dist_order(
        lanes_xy=lanes_xy,
        neighbor_current_xy=neighbor_current_xy,
    )

    # Select every on-route (True) lane, nearest first
    route_on_chosen_lane_idx: List[int] = _select_lanes_by_order(
        chosen_lane_dist_order=chosen_lane_dist_order,  # (chosen_lane_num,)
        chosen_lanes_route_mask_arr=
        chosen_lanes_route_mask_arr,  # (chosen_lane_num,)
    )
    npc_route_on_chosen_lane_idx_list.append(route_on_chosen_lane_idx)

# Consistency check between the agent count and the list length
assert len(npc_route_on_chosen_lane_idx_list) == chosen_agent_num, (
    f"len(npc_route_on_chosen_lane_idx_list)={len(npc_route_on_chosen_lane_idx_list)} "
    f"!= chosen_agent_num({chosen_agent_num})")

# (chosen_agent_num, chosen_lane_num)
agent_route_lane_order: np.ndarray = build_agent_route_lane_order(
    npc_route_on_chosen_lane_idx_list=npc_route_on_chosen_lane_idx_list,
    chosen_lane_num=chosen_lane_num,
)
return agent_route_lane_order.astype(np.int64)

def get_neighbor_track_tokens(
    present_tracked_objects: TrackedObjects,
    agents_cur_frame_indices: Optional[Union[Sequence[int], np.ndarray]],
    agents_num: int,
    object_types: Optional[Sequence[TrackedObjectType]] = None,
) -> List[Optional[str]]:
"""Build the list of track_tokens for the neighbor agents selected in the current frame.

Overview
--------
This function combines the following two pieces of information into a
**"track_token list that follows the neighbor agent slot order"**.

1) `present_tracked_objects`
    - All objects detected in the current frame.
    - It is assumed that the types of interest (vehicle/pedestrian/bicycle, etc.)
      were filtered from it to build an internal "agent array".
    - The order is the same as the one used in `_extract_agent_array`
      (that is, sorted by the same type-filter order).

2) `agents_cur_frame_indices`
    - The "row indices in the current frame" of the neighbor agents
      selected by `agent_past_process`.
    - An integer sequence of length K (≤ agents_num); this order is
      the neighbor agent slot order.

The function
- lists the agents of the types of interest in the current frame in order,
- uses `agents_cur_frame_indices[k]` to fetch the `track_token` of that row and
  writes it to `neighbor_track_token[k]`,
- always returns a list with `agents_num` slots, filling a slot with `None`
  when the index is out of range or cannot be mapped.

Args:
    present_tracked_objects (TrackedObjects):
        Detections of the current frame.
        May contain objects of several types; internally
        `get_tracked_objects_of_types(object_types)` keeps only
        the types of interest.
    agents_cur_frame_indices (Optional[Union[Sequence[int], np.ndarray]]):
        - shape: (K,)
        - Row indices that the neighbor agents occupy in the current-frame agent array.
        - Pass the `agents_cur_frame_indices` of `agent_past_process` as is.
    agents_num (int):
        - Number of neighbor slots to output.
        - This is the length of the returned list.
    object_types (Optional[Sequence[TrackedObjectType]]):
        - Object types to filter by.
        - Defaults to `(VEHICLE, PEDESTRIAN, BICYCLE)`; it must match the type
          order used in `_extract_agent_array` for the index mapping
          to stay correct.

Returns:
    List[Optional[str]]:
        - length: `agents_num`
        - Each element is the `track_token` (string) for that neighbor slot,
          or `None` when it cannot be mapped.
        - If `agents_cur_frame_indices` is `None`, a list
          `[None, None, ...]` of length `agents_num` is returned.

Raises:
    ValueError:
        - If `agents_num` is negative.

Notes:
    - If `agents_cur_frame_indices` is longer than `agents_num`,
      only the first `agents_num` entries are used.
    - With `M` the number of agents left after filtering the current frame by type,
      a slot stays `None` when its index is outside `0 <= idx < M`.
"""
if agents_num < 0:
    raise ValueError(f"`agents_num` must not be negative. got {agents_num}")

# Keep the same type-filter order as `_extract_agent_array`
if object_types is None:
    object_types = (
        TrackedObjectType.VEHICLE,
        TrackedObjectType.PEDESTRIAN,
        TrackedObjectType.BICYCLE,
    )

# List only the types of interest in the current frame, in that same order
# current_agents: List[TrackedObject], length = M
current_agents: List[
    TrackedObject] = present_tracked_objects.get_tracked_objects_of_types(
        object_types)  # type: ignore[assignment]
# tokens_in_present_order: (M,), track_token strings of the agents of interest
tokens_in_present_order: List[str] = [
    str(agent.track_token) for agent in current_agents
]

# Prepare the return buffer: length = agents_num
neighbor_track_token: List[Optional[str]] = [None] * int(agents_num)

# Normalize agents_cur_frame_indices (to an int list)
if agents_cur_frame_indices is None:
    return neighbor_track_token
# Cast numpy arrays, lists, tuples, etc. to a list of int
# idx_list: List[int], length = K
idx_list: List[int] = list(
    map(int,
        np.asarray(agents_cur_frame_indices).reshape(-1).tolist()))

# Map only the first agents_num indices
max_fill = min(len(idx_list), agents_num)
for slot_idx in range(max_fill):
    src_idx = idx_list[slot_idx]
    # An out-of-range index leaves the slot as None
    if 0 <= src_idx < len(tokens_in_present_order):
        neighbor_track_token[slot_idx] = tokens_in_present_order[src_idx]

return neighbor_track_token
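
The slot-filling rule can be exercised without any nuPlan objects. The sketch below replaces `TrackedObjects` with a plain list of token strings; the helper name `map_slots_to_tokens` and the toy tokens are assumptions for illustration.

```python
from typing import List, Optional, Sequence


def map_slots_to_tokens(tokens: Sequence[str], indices: Sequence[int],
                        num_slots: int) -> List[Optional[str]]:
    """Fill `num_slots` slots with tokens[indices[k]], or None if unmappable."""
    slots: List[Optional[str]] = [None] * num_slots
    # Only the first num_slots indices are used; extra ones are ignored.
    for slot, src in enumerate(list(indices)[:num_slots]):
        if 0 <= src < len(tokens):
            slots[slot] = tokens[src]
    return slots


tokens_in_frame = ["tok_a", "tok_b", "tok_c"]

# Index 7 is out of range, and the fourth slot has no index at all.
result = map_slots_to_tokens(tokens_in_frame, [2, 0, 7], num_slots=4)
print(result)  # ['tok_c', 'tok_a', None, None]
```

Returning a fixed-length list keeps the slot positions aligned with the neighbor agent array, so a later step can index both with the same `k`.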

Compute the total scenario horizon [s]: difference between the start and end timestamps

def _scenario_total_horizon_s(scn: AbstractScenario) -> float:
"""Compute the horizon from the scenario start time [s] to the end time of the log file [s].

- Even if the scenario has only one token and its duration is 0, the end time
  from the DB is used, so the result is still correct.

Args:
    scn: nuPlan Scenario object

Returns:
    float: horizon [s]
"""
# 1) Start time [s]: via the public API
start_s = float(scn.get_time_point(0).time_s)

# 2) End time: the actual last sensor timestamp in the DB [µs]
# NuPlanScenario keeps _log_file (not public, but generally accessible)
log_file_path: str = getattr(scn, "_log_file")
end_us: int = get_end_sensor_time_from_db(log_file_path,
                                          get_lidarpc_sensor_data())
end_s = float(end_us) * 1e-6
return max(0.0, end_s - start_s)

def get_npc_route_roadblock_ids2( scenario: NuPlanScenario, neighbor_track_token: List[Optional[str]], ) -> Dict[str, Optional[List[str]]]:

from collections import defaultdict
from typing import Dict, List, Optional, Set

def select_nearest_connectors_by_mean_distance(
    connector_candidates: List[RoadBlockGraphEdgeMapObject],
    sampled_trajectory_points: List["Point2D"],
    *,
    tolerance: float = 1e-6,
) -> List[RoadBlockGraphEdgeMapObject]:
    """Select the nearest Connector candidate(s) by mean perpendicular distance.

    Args:
        connector_candidates (List[RoadBlockGraphEdgeMapObject]):
            Candidate Connector objects. Length K (variable).
        sampled_trajectory_points (List[Point2D]):
            Trajectory points sampled inside the Connector region. Length T (variable).
        tolerance (float):
            Floating-point tolerance. A candidate whose distance differs from
            the minimum by `≤ tolerance` counts as a tie.

    Returns:
        List[RoadBlockGraphEdgeMapObject]:
            The Connector object(s) with the minimum mean perpendicular distance.
    """
    mean_distance_by_connector: Dict[RoadBlockGraphEdgeMapObject,
                                     float] = {}
    for connector in connector_candidates:
        mean_dist: float = _mean_perpendicular_distance(
            connector, sampled_trajectory_points)
        mean_distance_by_connector[connector] = mean_dist

    minimum_distance: float = min(mean_distance_by_connector.values())
    return [
        conn for conn, dist in mean_distance_by_connector.items()
        if abs(dist - minimum_distance) <= tolerance
    ]

def _mean_perpendicular_distance(
    roadblock_connector: RoadBlockGraphEdgeMapObject,
    trajectory_points: List["Point2D"],
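) -> float:
    """Compute the mean distance between the trajectory points and the Connector polygon.

    The per-point value is shapely's `Point.distance(polygon)`, i.e. the shortest
    distance to the polygon, which is 0 for points inside it.

    Args:
        roadblock_connector (RoadBlockGraphEdgeMapObject):
            Connector to measure against.
        trajectory_points (List[Point2D]):
            Trajectory points. Length T.

    Returns:
        float: mean distance.
    """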
    polygon = roadblock_connector.polygon
    return float(
        np.mean([
            Point(pt.x, pt.y).distance(polygon) for pt in trajectory_points
        ]))

def _decide_roadblock_ids_at_connector(
    connector_candidate_objects: Set['RoadBlockGraphEdgeMapObject'],
    sampled_points_inside_connector: List['Point2D'],
    roadblock_sequence: List[str],
    previous_roadblocks_set: Set['RoadBlockGraphEdgeMapObject'],
    current_roadblocks: Set['RoadBlockGraphEdgeMapObject'],
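) -> None:
    """When a Connector segment ends, pick candidates by connectivity and distance and commit them to the sequence.

    Candidates that satisfy both connectivity conditions (linked to the previous
    and to the next RoadBlock) are chosen first. Otherwise, among the candidates
    that satisfy at least one of the two conditions, the one(s) with the minimum
    mean perpendicular distance are chosen. If there are none, the
    minimum-distance candidate(s) among all candidates are chosen.

    Args:
        connector_candidate_objects (Set[RoadBlockGraphEdgeMapObject]):
            Connector candidates accumulated during the segment.
        sampled_points_inside_connector (List[Point2D]):
            Trajectory points collected in this Connector segment. Length T.
        roadblock_sequence (List[str]):
            RoadBlock/Connector id sequence that accumulates the result (mutated in place).
        previous_roadblocks_set (Set[RoadBlockGraphEdgeMapObject]):
            RoadBlock set of the frame right before the segment.
        current_roadblocks (Set[RoadBlockGraphEdgeMapObject]):
            RoadBlock set of the current frame (used to check the link to the next segment).
    """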
    graph_linkable_connectors: List[RoadBlockGraphEdgeMapObject] = []
    graph_linkable_connectors_candidates: List[
        RoadBlockGraphEdgeMapObject] = []

    incoming_and_outcoming_condition = len(
        previous_roadblocks_set) > 0 and len(current_roadblocks) > 0
    incoming_or_outgoing_condition = len(
        previous_roadblocks_set) > 0 or len(current_roadblocks) > 0

    # (1) Prefer Connectors linked to both the previous and the next RoadBlock
    if incoming_and_outcoming_condition:
        for conn in connector_candidate_objects:
            incoming_ids = {rb.id for rb in conn.incoming_edges}
            previous_ids = {rb.id for rb in previous_roadblocks_set}
            outgoing_ids = {rb.id for rb in conn.outgoing_edges}
            current_ids = {rb.id for rb in current_roadblocks}

            if bool(previous_ids & incoming_ids) and bool(current_ids &
                                                          outgoing_ids):
                graph_linkable_connectors.append(conn)

        if graph_linkable_connectors:
            roadblock_sequence.extend(
                [conn.id for conn in graph_linkable_connectors])
            return

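    # (2) Accept candidates linked to either the previous or the next RoadBlock, then take the minimum distance
    # (the "and" condition implies the "or" condition, so testing the latter is enough)
    if incoming_or_outgoing_condition: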
        for conn in connector_candidate_objects:
            incoming_ids = {rb.id for rb in conn.incoming_edges}
            previous_ids = {rb.id for rb in previous_roadblocks_set}
            outgoing_ids = {rb.id for rb in conn.outgoing_edges}
            current_ids = {rb.id for rb in current_roadblocks}

            if bool(previous_ids & incoming_ids) or bool(current_ids &
                                                         outgoing_ids):
                graph_linkable_connectors_candidates.append(conn)

        if graph_linkable_connectors_candidates:
            closest_connectors = select_nearest_connectors_by_mean_distance(
                graph_linkable_connectors_candidates,
                sampled_points_inside_connector,
                tolerance=1e-6,
            )
            roadblock_sequence.extend(
                conn.id for conn in closest_connectors)
            return

    # (3) No connectivity match: minimum distance among all candidates
    closest_connectors = select_nearest_connectors_by_mean_distance(
        connector_candidate_objects,
        sampled_points_inside_connector,
        tolerance=1e-6,
    )
    roadblock_sequence.extend(conn.id for conn in closest_connectors)

# ───────────────────────── Helper functions (readability) ─────────────────────────

def _collect_candidate_tokens(
        neighbor_track_token: List[Optional[str]]) -> Set[str]:
    """Build the set of candidate tokens, excluding None.

    Args:
        neighbor_track_token (List[Optional[str]]):
            Length = max_agent_num. Token or None.

    Returns:
        Set[str]: set of candidate tokens.
    """
    return {t for t in neighbor_track_token if t is not None}

def _collect_future_vehicle_trajectories(
    scenario: NuPlanScenario,
    candidate_tokens: Set[str],
    total_horizon_s: float = 20.,
) -> Dict[str, List[TrackedObject]]:
    """Collect the vehicle trajectories over the future window, grouped by token.

    Args:
        scenario (NuPlanScenario): scenario.
        candidate_tokens (Set[str]): set of candidate tokens.
        total_horizon_s (float): future time horizon to collect [s].

    Returns:
        Dict[str, List[TrackedObject]]:
            key = token, value = that vehicle's detections in time order (variable length).
    """
    car_token_to_object_list: Dict[str,
                                   List[TrackedObject]] = defaultdict(list)
    num_samples = int(total_horizon_s * 10)  # 0.1 s spacing
    for det_batch in scenario.get_future_tracked_objects(
            0, total_horizon_s, num_samples):
        for det in det_batch.tracked_objects:
            if det.tracked_object_type == TrackedObjectType.VEHICLE and (
                    det.track_token in candidate_tokens):
                car_token_to_object_list[det.track_token].append(det)
    return car_token_to_object_list

def _finalize_connector_segment_if_open(
    inside_connector_flag: bool,
    connector_candidate_objects: Set['RoadBlockGraphEdgeMapObject'],
    sampled_points_inside_connector: List['Point2D'],
    roadblock_sequence: List[str],
    previous_roadblocks_set: Set['RoadBlockGraphEdgeMapObject'],
    current_roadblocks: Set['RoadBlockGraphEdgeMapObject'],
) -> bool:
    """If a Connector segment is open, decide its candidates and reset the buffers.

    Returns:
        bool: inside_connector_flag after cleanup (False).
    """
    if inside_connector_flag:
        _decide_roadblock_ids_at_connector(
            connector_candidate_objects,
            sampled_points_inside_connector,
            roadblock_sequence,
            previous_roadblocks_set,
            current_roadblocks,
        )
        inside_connector_flag = False
        connector_candidate_objects.clear()
        sampled_points_inside_connector.clear()
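    return inside_connector_flag

"""Build, per token, the route (RoadBlock/RoadBlock-Connector sequence) of the NPC vehicles.

For the agent tokens listed in `neighbor_track_token` (vehicles only), this
function walks through the scenario's **future trajectory (about 20 s)** in
driving order and extracts which RoadBlocks and RoadBlock-Connectors it crosses.
Inside a Connector segment it selects the **nearest** Connector (ties allowed)
by mean perpendicular distance, and commits the choice only when the segment ends.
The resulting sequence is corrected with `route_roadblock_correction` and
returned per token.

Args:
    scenario (NuPlanScenario):
        NuPlan scenario object.
    neighbor_track_token (List[Optional[str]]):
        Length = `max_agent_num`. Each slot holds the NPC's `track_token` (`None` if empty).

Returns:
    car_token_to_rr_ids
    Dict[str, Optional[List[str]]]:
        key = token (str).
        value = the **corrected** RoadBlock id sequence (List[str]) of that NPC,
        or `None` (when nothing could be extracted).
        Number of entries: the vehicle tokens among the `max_agent_num` slots.

Notes:
    - Trajectory collection is limited to `TrackedObjectType.VEHICLE`.
    - If a Connector segment yields several candidates, all candidates with the **minimum** mean perpendicular distance are selected.
    - When a RoadBlock and a RoadBlock-Connector are observed in the same frame,
      no exception is raised; the conflict is resolved with
      `_prefer_rr_on_conflict`, falling back to the Connector.
    - The processing is split into internal helper functions (for readability).
"""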

# ───────────────────────── Main logic (same behavior) ─────────────────────────

candidate_tokens = _collect_candidate_tokens(neighbor_track_token)

car_token_to_object_list: Dict[
    str, List[TrackedObject]] = _collect_future_vehicle_trajectories(
        scenario,
        candidate_tokens,  # keep the existing filter (based on neighbor tokens)
        total_horizon_s=20.,
    )
car_token_to_rr_ids: Dict[str, Optional[List[str]]] = {}

for car_token, car_list in car_token_to_object_list.items():
    if not car_list:
        car_token_to_rr_ids[car_token] = None
        continue

    roadblock_sequence: List[str] = []
    previous_roadblocks_set: Set['RoadBlockGraphEdgeMapObject'] = set()
    inside_connector_flag = False
    connector_candidate_objects: Set['RoadBlockGraphEdgeMapObject'] = set()
    sampled_points_inside_connector: List['Point2D'] = []

    for time_idx, car_ in enumerate(car_list):
        npc_point = car_.center.point
        current_roadblocks = set(
            scenario.map_api.get_all_map_objects(
                npc_point, SemanticMapLayer.ROADBLOCK))
        current_connectors = set(
            scenario.map_api.get_all_map_objects(
                npc_point, SemanticMapLayer.ROADBLOCK_CONNECTOR))

        # Simultaneous detection case
        # Both were found: overwrite current_* so that the tie-breaker keeps only one side.
        if current_roadblocks and current_connectors:
            # Global route of the scenario (the empty form [''] is treated as None)
            try:
                route_rr_ids = scenario.get_route_roadblock_ids()
                if isinstance(route_rr_ids, list) and route_rr_ids == ['']:
                    route_rr_ids = None
            except Exception:
                route_rr_ids = None

            rb_ids_list = [rb.id for rb in current_roadblocks]
            rbc_ids_list = [rc.id for rc in current_connectors]

            chosen_ids = _prefer_rr_on_conflict(
                rb_ids=rb_ids_list,
                rbc_ids=rbc_ids_list,
                route_rr_ids=route_rr_ids,
                verbose=False,
            )

            if chosen_ids:
                # Keep the side that intersects the chosen ids and clear the other.
                chosen_rbc = {
                    rc for rc in current_connectors if rc.id in chosen_ids
                }
                chosen_rb = {
                    rb for rb in current_roadblocks if rb.id in chosen_ids
                }

                if chosen_rb and not chosen_rbc:
                    current_roadblocks = chosen_rb
                    current_connectors = set()
                else:
                    # The Connector (RBC) wins in every other case, including
                    # when both sides or neither side match the chosen ids.
                    if chosen_rbc:
                        current_connectors = chosen_rbc
                    current_roadblocks = set()
            else:
                # The tie-breaker returned nothing: prefer the Connector (RBC)
                current_roadblocks = set()

        # ── (A) Connector region ──
        if current_connectors:
            if not inside_connector_flag:
                connector_candidate_objects.clear()
                sampled_points_inside_connector.clear()
                inside_connector_flag = True
            connector_candidate_objects.update(current_connectors)
            sampled_points_inside_connector.append(npc_point)

            # On the last frame, decide immediately
            if time_idx == len(car_list) - 1:
                inside_connector_flag = _finalize_connector_segment_if_open(
                    inside_connector_flag,
                    connector_candidate_objects,
                    sampled_points_inside_connector,
                    roadblock_sequence,
                    previous_roadblocks_set,
                    current_roadblocks,
                )
            continue

        # ── (B) RoadBlock region ──
        if current_roadblocks:
            # If the previous frames were a Connector segment, finalize it first
            inside_connector_flag = _finalize_connector_segment_if_open(
                inside_connector_flag,
                connector_candidate_objects,
                sampled_points_inside_connector,
                roadblock_sequence,
                previous_roadblocks_set,
                current_roadblocks,
            )
            # Append RoadBlock ids, skipping an id equal to the last one appended
            for roadblock in current_roadblocks:
                if not roadblock_sequence or roadblock_sequence[
                        -1] != roadblock.id:
                    roadblock_sequence.append(roadblock.id)
            previous_roadblocks_set = current_roadblocks

    # Correct the result
    if roadblock_sequence:
        start = car_list[0]
        npc_state = SimpleNamespace(rear_axle=StateSE2(
            start.center.x, start.center.y, start.center.heading))
        corrected_ids = route_roadblock_correction(
            npc_state,
            scenario.map_api,
            roadblock_sequence,
            remove_route_loops_flag=False,
        )
        # Store the corrected sequence, as the docstring promises
        # (the raw roadblock_sequence was stored here before)
        car_token_to_rr_ids[car_token] = corrected_ids
    else:
        car_token_to_rr_ids[car_token] = None

return car_token_to_rr_ids
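
The distance-based tie-break is easier to follow on a toy case. The sketch below uses two overlapping rectangles as stand-ins for Connector polygons and plain `(x, y)` tuples as trajectory points. All names and coordinates are assumptions for illustration, and no nuPlan types are used.

```python
import numpy as np
from shapely.geometry import Point, Polygon


def mean_distance(polygon: Polygon, points) -> float:
    """Mean of shapely's point-to-polygon distance (0 for points inside)."""
    return float(np.mean([Point(x, y).distance(polygon) for x, y in points]))


# Two overlapping candidate regions; the trajectory stays inside the left one.
candidates = {
    "left": Polygon([(0, 0), (4, 0), (4, 2), (0, 2)]),
    "right": Polygon([(3, 0), (8, 0), (8, 2), (3, 2)]),
}
trajectory = [(1.0, 1.0), (2.0, 1.0), (3.5, 1.0)]

scores = {name: mean_distance(poly, trajectory) for name, poly in candidates.items()}
best = min(scores.values())
tolerance = 1e-6
winners = [name for name, d in scores.items() if abs(d - best) <= tolerance]

print(scores)   # left: 0.0, right: 1.0
print(winners)  # ['left']
```

Only the last point lies inside the right rectangle, so its mean distance is (2 + 1 + 0) / 3 = 1, while the left rectangle contains every point and scores 0. A candidate that covers the whole sampled segment therefore always beats one that covers only part of it.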

=====================

1. Ego, agent, static coordinate transformation

=====================

def _local_to_local_transforms(
    global_states1: np.ndarray,  # (N, 3) = [x1, y1, heading1] ...
    global_states2: np.ndarray,  # (3,) = [x_ref, y_ref, heading_ref]
) -> np.ndarray:
"""Re-express a set of poses given in one frame in another frame, all at once.

The function works as follows.

- `global_states2`:
  · Pose [x, y, heading] of the new reference frame (the local frame).
- `global_states1`:
  · A set of poses expressed in the old reference (assumed to be the world frame).
  · Each row is one pose [x, y, heading].

Procedure:
    1. Build a 3x3 transform matrix (pose → homogeneous transform) from `global_states2`.
    2. Invert it to obtain the "world → new local frame" transform.
    3. Build a 3x3 transform matrix for each pose of `global_states1` as well.
    4. Left-multiply (3) by the matrix from (2) so that every pose is expressed in the new local frame.

As a result,
- the batch of transform matrices of the input poses is returned as
  "poses seen from the new reference frame".

Args:
    global_states1 (np.ndarray):
        - shape: (N, 3)
        - each row: [x, y, heading] (e.g. poses in the world frame).
    global_states2 (np.ndarray):
        - shape: (3,)
        - reference pose [x_ref, y_ref, heading_ref].

Returns:
    np.ndarray:
        - shape: (N, 3, 3)
        - each element is the 3x3 transform matrix of the corresponding
          `global_states1` pose seen from the `global_states2` local frame.
"""
# local_xform: (3, 3), homogeneous transform of the reference pose (global_states2)
local_xform = _state_se2_array_to_transform_matrix(global_states2)
# local_xform_inv: (3, 3), inverse of the reference pose (world → local)
local_xform_inv = np.linalg.inv(local_xform)

# transforms: (N, 3, 3), transform matrix of each pose in global_states1
transforms = _state_se2_array_to_transform_matrix_batch(global_states1)

# (N, 3, 3), the transform matrices re-expressed in the new local frame
transforms = np.matmul(local_xform_inv, transforms)
return transforms

def _state_se2_array_to_transform_matrix(
    input_data: np.ndarray,  # (3,) = [x, y, heading]
) -> np.ndarray:  # (3, 3)
    """Convert a single SE(2) state [x, y, heading] into a 3x3 homogeneous transform.

    Matrix layout:
        [[ cos(h), -sin(h), x ],
         [ sin(h),  cos(h), y ],
         [   0   ,    0   , 1 ]]

    Multiplying this matrix by a point [x', y', 1]^T applies the rotation and
    the translation in one step.

    Args:
        input_data (np.ndarray): shape (3,); [x, y, heading], heading in radians.

    Returns:
        np.ndarray: shape (3, 3); the 2D SE(2) homogeneous transform of the pose.
    """
    x: float = float(input_data[0])
    y: float = float(input_data[1])
    h: float = float(input_data[2])

    cosine = np.cos(h)
    sine = np.sin(h)

    # (3, 3)
    return np.array([[cosine, -sine, x], [sine, cosine, y], [0.0, 0.0, 1.0]])

def _state_se2_array_to_transform_matrix_batch(
    input_data: np.ndarray,  # (N, 3) = [[x1, y1, h1], ..., [xN, yN, hN]]
) -> np.ndarray:  # (N, 3, 3)
    """Convert several [x, y, heading] poses into a stack of 3x3 transforms at once.

    Takes a 2D pose array whose rows are [x, y, heading] and returns the SE(2)
    homogeneous transform of each pose, stacked as (N, 3, 3).

    Idea:
        1. Expand each pose to [x, y, cos(h), sin(h), 1].
        2. Multiply by the fixed `reshaping_array` (5x9) to get a length-9 row
           [c, -s, x, s, c, y, 0, 0, 1].
        3. Reshape that row to (3, 3) to obtain one transform.
        4. Doing this for all N rows yields an (N, 3, 3) array.

    Args:
        input_data (np.ndarray): shape (N, 3); each row is [x, y, heading] in radians.

    Returns:
        np.ndarray: shape (N, 3, 3); element i is the homogeneous transform of row i.
    """
    # input_data: (N, 3) = [x, y, heading]
    # processed_input: (N, 5) = [x, y, cos(h), sin(h), 1]
    processed_input = np.column_stack((
        input_data[:, 0],
        input_data[:, 1],
        np.cos(input_data[:, 2]),
        np.sin(input_data[:, 2]),
        np.ones_like(input_data[:, 0]),
    ))

    # reshaping_array: (5, 9); row j says where feature j lands in the flattened 3x3
    reshaping_array = np.array([
        [0, 0, 1, 0, 0, 0, 0, 0, 0],
        [0, 0, 0, 0, 0, 1, 0, 0, 0],
        [1, 0, 0, 0, 1, 0, 0, 0, 0],
        [0, -1, 0, 1, 0, 0, 0, 0, 0],
        [0, 0, 0, 0, 0, 0, 0, 0, 1],
    ])
    # processed_input @ reshaping_array: (N, 9)
    # Each row becomes [c, -s, x, s, c, y, 0, 0, 1];
    # reshaping to (N, 3, 3) gives the stack of transforms.
    return (processed_input @ reshaping_array).reshape(-1, 3, 3)
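
The `reshaping_array` trick can be checked against a plain per-pose construction. The sketch below is a standalone illustration, not part of the module above; `se2_single` and `se2_batch` are hypothetical names introduced only for this comparison.

```python
import numpy as np


def se2_single(pose):
    """Build one 3x3 SE(2) matrix from [x, y, heading] the direct way."""
    x, y, h = pose
    c, s = np.cos(h), np.sin(h)
    return np.array([[c, -s, x], [s, c, y], [0.0, 0.0, 1.0]])


def se2_batch(poses):
    """Build (N, 3, 3) SE(2) matrices with a single matrix product."""
    # (N, 5) = [x, y, cos(h), sin(h), 1]
    feats = np.column_stack((
        poses[:, 0],
        poses[:, 1],
        np.cos(poses[:, 2]),
        np.sin(poses[:, 2]),
        np.ones(len(poses)),
    ))
    # (5, 9): scatters each feature into its slot(s) of the flattened 3x3 matrix
    selector = np.array([
        [0, 0, 1, 0, 0, 0, 0, 0, 0],   # x   -> entry (0, 2)
        [0, 0, 0, 0, 0, 1, 0, 0, 0],   # y   -> entry (1, 2)
        [1, 0, 0, 0, 1, 0, 0, 0, 0],   # cos -> entries (0, 0) and (1, 1)
        [0, -1, 0, 1, 0, 0, 0, 0, 0],  # sin -> -sin at (0, 1), +sin at (1, 0)
        [0, 0, 0, 0, 0, 0, 0, 0, 1],   # 1   -> entry (2, 2)
    ])
    return (feats @ selector).reshape(-1, 3, 3)


poses = np.array([[1.0, 2.0, 0.0], [-3.0, 0.5, np.pi / 3]])
batched = se2_batch(poses)
looped = np.stack([se2_single(p) for p in poses])
print(np.allclose(batched, looped))
```

The batched form avoids a Python loop over poses, which matters when N is the number of agents times the number of time steps.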

def _transform_matrix_to_state_se2_array_batch(
    input_data: np.ndarray,  # (N, 3, 3)
) -> np.ndarray:  # (N, 3)
    """Convert a stack of 3x3 transforms back into an array of [x, y, heading] poses.

    For each SE(2) homogeneous transform (a 3x3 matrix holding rotation and
    translation) the following is extracted:

    - x: x component of the third column (translation x)
    - y: y component of the third column (translation y)
    - heading: angle recovered with atan2 from the first column of the rotation

    Steps:
        1. Collect the first column of every matrix: [cos(heading), sin(heading), _].
        2. Compute heading (radians) as `atan2(sin, cos)`.
        3. Take the third column (translation [x, y, 1]) and overwrite its last
           element with the heading to obtain [x, y, heading].

    Args:
        input_data (np.ndarray): shape (N, 3, 3); each [i, :, :] is one SE(2)
            homogeneous transform.

    Returns:
        np.ndarray: shape (N, 3); each row is a pose [x, y, heading].
    """
    # first_columns: (N, 3), first column of each transform: [cos, sin, 0]
    first_columns = input_data[:, :, 0].reshape(-1, 3)
    # angles: (N,), heading from atan2(sin, cos)
    angles = np.arctan2(first_columns[:, 1], first_columns[:, 0])

    # result: (N, 3), initially the third column [x, y, 1].
    # Copy first: slicing returns a view, so writing into it would otherwise
    # overwrite the bottom-right entry of the caller's matrices.
    result = input_data[:, :, 2].copy()
    # Overwrite the last component with the heading to get [x, y, heading]
    result[:, 2] = angles

    return result

def _global_state_se2_array_to_local(
    global_states: np.ndarray,  # (N, 3) = [x_world, y_world, heading_world]
    local_state: np.ndarray,  # (3,) = [x_ref, y_ref, heading_ref]
) -> np.ndarray:  # (N, 3) = [x_local, y_local, heading_local]
    """Convert several [x, y, heading] poses into the local frame of `local_state`.

    - `global_states`: a set of poses in the world frame.
    - `local_state`: the pose that becomes the new reference frame
      (e.g. the current ego pose).
    Each pose is returned as [x_local, y_local, heading_local], i.e. as seen
    from `local_state`.

    Steps:
        1. Build the 3x3 transform of `local_state` (local frame -> world).
        2. Convert each [x, y, heading] in `global_states` into a 3x3 transform.
        3. Multiply the inverse of (1) (world -> local) with (2) to re-express
           every pose in the local frame.
        4. Convert the resulting 3x3 matrices back into [x, y, heading] rows.

    Args:
        global_states (np.ndarray): shape (N, 3); each row is
            [x_world, y_world, heading_world].
        local_state (np.ndarray): shape (3,); [x_ref, y_ref, heading_ref], the
            reference pose of the local frame (e.g. the ego pose).

    Returns:
        np.ndarray: shape (N, 3); each row is [x_local, y_local, heading_local]
        in the frame whose origin and orientation are given by `local_state`.
    """
    # local_xform: (3, 3), homogeneous transform of the reference pose (local -> world)
    local_xform = _state_se2_array_to_transform_matrix(local_state)
    # local_xform_inv: (3, 3), its inverse (world -> local)
    local_xform_inv = np.linalg.inv(local_xform)

    # transforms: (N, 3, 3), each global state as a world-frame transform
    transforms = _state_se2_array_to_transform_matrix_batch(global_states)

    # transforms: (N, 3, 3), the world-frame poses re-expressed in the local frame
    transforms = np.matmul(local_xform_inv, transforms)

    # output: (N, 3) = [x_local, y_local, heading_local]
    output = _transform_matrix_to_state_se2_array_batch(transforms)

    return output
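
A small worked example makes the world-to-local conversion concrete. This is a self-contained sketch under stated assumptions: `pose_to_matrix` and `world_to_local` are hypothetical helpers that mirror the steps above, and the ego and agent poses are made-up numbers.

```python
import numpy as np


def pose_to_matrix(pose):
    """[x, y, heading] -> 3x3 homogeneous transform (pose frame -> world)."""
    x, y, h = pose
    c, s = np.cos(h), np.sin(h)
    return np.array([[c, -s, x], [s, c, y], [0.0, 0.0, 1.0]])


def world_to_local(world_poses, ref_pose):
    """Express (N, 3) world poses in the frame of `ref_pose`."""
    ref_from_world = np.linalg.inv(pose_to_matrix(ref_pose))
    mats = np.stack([pose_to_matrix(p) for p in world_poses])  # (N, 3, 3)
    local = ref_from_world @ mats  # (3, 3) broadcasts over the batch
    heading = np.arctan2(local[:, 1, 0], local[:, 0, 0])
    return np.column_stack((local[:, 0, 2], local[:, 1, 2], heading))


# Ego sits at (10, 5) and faces +y in the world frame.
ego = np.array([10.0, 5.0, np.pi / 2])
agents = np.array([
    [10.0, 7.0, np.pi / 2],  # 2 m ahead of the ego, same heading
    [9.0, 5.0, np.pi],       # 1 m to the ego's left, facing world -x
])
local_poses = world_to_local(agents, ego)
print(np.round(local_poses, 3))
```

In the ego frame the first agent ends up on the +x axis (straight ahead) with zero relative heading, and the second on the +y axis (to the left) with a relative heading of pi/2.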

def _global_velocity_to_local(
    velocity: np.ndarray,  # (N, 2) = [vx_world, vy_world]
    anchor_heading: float,  # scalar heading (rad), or any broadcastable value
) -> np.ndarray:  # (N, 2) = [vx_local, vy_local]
    """Rotate world-frame velocity vectors into the ego (reference heading) frame.

    Takes 2D velocities [vx, vy] in the world frame and rotates them by the
    heading of the reference vehicle (ego), giving [vx_local, vy_local] in the
    ego frame.

    Transform, with reference heading θ:
        - vx_local = vx * cos(θ) + vy * sin(θ)
        - vy_local = vy * cos(θ) - vx * sin(θ)

    Intuitively, a velocity measured in the world frame is "turned" into the
    frame whose x axis is the direction the ego is facing.

    Args:
        velocity (np.ndarray): shape (N, 2); each row is [vx_world, vy_world].
        anchor_heading (float): reference heading (rad), usually the ego
            heading. It is a scalar, and NumPy broadcasting applies it to all rows.

    Returns:
        np.ndarray: shape (N, 2); each row is [vx_local, vy_local], the velocity
        rotated into the reference-heading (ego) frame.
    """
    # velocity_x: (N,), velocity along the ego x axis
    velocity_x = velocity[:, 0] * np.cos(
        anchor_heading) + velocity[:, 1] * np.sin(anchor_heading)
    # velocity_y: (N,), velocity along the ego y axis
    velocity_y = velocity[:, 1] * np.cos(
        anchor_heading) - velocity[:, 0] * np.sin(anchor_heading)

    # Stack to (N, 2)
    return np.stack([velocity_x, velocity_y], axis=-1)
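
The velocity rotation can be sanity-checked with a hand-computable case. This is a minimal standalone sketch; `rotate_velocity_to_local` is an illustrative name that restates the two formulas above, and the numbers are made up.

```python
import numpy as np


def rotate_velocity_to_local(velocity_world, heading):
    """Rotate (N, 2) world-frame velocities by -heading into the local frame."""
    c, s = np.cos(heading), np.sin(heading)
    vx = velocity_world[:, 0] * c + velocity_world[:, 1] * s
    vy = velocity_world[:, 1] * c - velocity_world[:, 0] * s
    return np.stack([vx, vy], axis=-1)


ego_heading = np.pi / 2  # ego faces +y in the world frame
v_world = np.array([
    [0.0, 3.0],   # moving along world +y, i.e. along the ego's forward axis
    [-2.0, 0.0],  # moving along world -x, i.e. towards the ego's left
])
v_local = rotate_velocity_to_local(v_world, ego_heading)
print(np.round(v_local, 6))
```

Only the direction changes: a pure rotation preserves the speed of every row, and no translation is applied because velocities are free vectors.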

def _build_ego_pose_from_state(
    ego_cur_pose_np: np.ndarray,  # (3,)
) -> np.ndarray:  # (3,)
    """Build the [x, y, heading] vector used for ego-relative coordinate transforms.

    Extracts x, y and heading (rad) from `ego_cur_pose_np` and returns them as
    a 1D floating-point vector.

    Args:
        ego_cur_pose_np (np.ndarray): shape (3,); holds [x_ego, y_ego, heading_ego].

    Returns:
        np.ndarray: shape (3,); [x_ego, y_ego, heading_ego] as float64.
    """
    # ego_pose: (3,) = [x_ego, y_ego, heading_ego]
    ego_pose = np.array(
        [
            float(ego_cur_pose_np[EgoInternalIndex.x()]),
            float(ego_cur_pose_np[EgoInternalIndex.y()]),
            float(ego_cur_pose_np[EgoInternalIndex.heading()]),
        ],
        dtype=np.float64,
    )
    return ego_pose

def _convert_ego_history_to_relative(
    agent_state: np.ndarray,  # (time_num, state_dim_ego=10)
    ego_pose: np.ndarray,  # (3,)
) -> np.ndarray:  # (time_num, state_dim_ego+1=11)
    """Convert the ego trajectory from the world frame to the ego-relative frame.

    Takes the ego's past-to-current state sequence and:
    - position/orientation: converts to the ego frame
    - heading: stores it as two values, cos and sin
    - velocity: rotates the world-frame velocity into the ego frame
    - trailing values (vehicle size, one-hot type): copies them unchanged

    The result has one more column than the input: (time_num, state_dim + 1).

    Args:
        agent_state (np.ndarray): shape (time_num, state_dim_ego=10); ego
            trajectory in the world frame.
        ego_pose (np.ndarray): shape (3,); [x_ego, y_ego, heading_ego], the
            current ego state in the world frame.

    Returns:
        np.ndarray: shape (time_num, state_dim_ego+1=11); the ego trajectory in
        the ego-relative frame.
    """
    # agent_state: (time_num, state_dim_ego=10)
    time_num, state_dim = agent_state.shape

    # new_agent_state: (time_num, state_dim_ego+1=11)
    new_agent_state = np.zeros((agent_state.shape[0], state_dim + 1),
                               dtype=np.float64)

    # Copy the trailing entries (size, type, ...); they shift right by one column
    new_agent_state[:, 6:] = agent_state[:, 5:]

    # agent_global_poses: (time_num, 3) = [x, y, heading], world frame
    agent_global_poses = agent_state[:, [
        EgoInternalIndex.x(),
        EgoInternalIndex.y(),
        EgoInternalIndex.heading()
    ]]

    # transforms: (time_num, 3, 3), poses re-expressed in the ego frame
    # Both agent_global_poses and ego_pose are given in the world frame.
    transforms = _local_to_local_transforms(agent_global_poses, ego_pose)

    # transformed_poses: (time_num, 3) = [x, y, heading] in the ego frame
    transformed_poses = _transform_matrix_to_state_se2_array_batch(transforms)

    # Update position and orientation (heading -> cos, sin)
    new_agent_state[:, EgoInternalIndex.x()] = transformed_poses[:, 0]
    new_agent_state[:, EgoInternalIndex.y()] = transformed_poses[:, 1]
    new_agent_state[:, 2] = np.cos(transformed_poses[:, 2])
    new_agent_state[:, 3] = np.sin(transformed_poses[:, 2])

    # --- velocity (world -> anchor ego frame) ---
    # agent_global_velocities: (time_num, 2) = [vx_world, vy_world]
    agent_global_velocities = agent_state[:, [
        EgoInternalIndex.vx(), EgoInternalIndex.vy()
    ]]

    # transformed_velocities: (time_num, 2) = [vx_ego, vy_ego]
    transformed_velocities = _global_velocity_to_local(agent_global_velocities,
                                                       ego_pose[-1])

    new_agent_state[:, 4] = transformed_velocities[:, 0]
    new_agent_state[:, 5] = transformed_velocities[:, 1]

    return new_agent_state

def _convert_agent_states_to_relative(
    agent_state: np.ndarray,  # (N, state_dim_agent)
    ego_pose: np.ndarray,  # (3,)
) -> np.ndarray:  # (N, state_dim_agent)
    """Convert surrounding agents (vehicles/pedestrians/cyclists) to the ego-relative frame.

    Of the agent states recorded in the world frame, only
    - position (x, y)
    - orientation (heading)
    - velocity (vx, vy)
    are converted to the ego frame.

    All other values (vehicle size, other attributes) are kept, and the input
    array is modified in place and then returned.

    Args:
        agent_state (np.ndarray): shape (N, state_dim_agent), laid out as
            [track_id, vx, vy, heading, width, length, x, y]. The schema
            follows AgentInternalIndex.
        ego_pose (np.ndarray): shape (3,); [x_ego, y_ego, heading_ego], the
            current ego state in the world frame.

    Returns:
        np.ndarray: shape (N, state_dim_agent), laid out as
        [track_id, vx, vy, heading, width, length, x, y], with position,
        orientation and velocity expressed relative to the ego.
    """
    # agent_global_poses: (N, 3) = [x, y, heading]
    agent_global_poses = agent_state[:, [
        AgentInternalIndex.x(),
        AgentInternalIndex.y(),
        AgentInternalIndex.heading()
    ]]

    # agent_global_velocities: (N, 2) = [vx_world, vy_world]
    agent_global_velocities = agent_state[:, [
        AgentInternalIndex.vx(), AgentInternalIndex.vy()
    ]]

    # transformed_poses: (N, 3) = [x_ego, y_ego, heading_ego]
    transformed_poses = _global_state_se2_array_to_local(
        agent_global_poses, ego_pose)

    # transformed_velocities: (N, 2) = [vx_ego, vy_ego]
    transformed_velocities = _global_velocity_to_local(agent_global_velocities,
                                                       ego_pose[-1])

    # Update position/orientation/velocity (in place)
    agent_state[:, AgentInternalIndex.x()] = transformed_poses[:, 0]
    agent_state[:, AgentInternalIndex.y()] = transformed_poses[:, 1]
    agent_state[:, AgentInternalIndex.heading()] = transformed_poses[:, 2]
    agent_state[:, AgentInternalIndex.vx()] = transformed_velocities[:, 0]
    agent_state[:, AgentInternalIndex.vy()] = transformed_velocities[:, 1]

    return agent_state

def _convert_static_states_to_relative(
    agent_state: np.ndarray,  # (N, state_dim_static)
    ego_pose: np.ndarray,  # (3,)
) -> np.ndarray:  # (N, state_dim_static)
    """Convert the position/orientation of static objects (signs, barriers, ...) to the ego-relative frame.

    Only the first three values of each static-object state
    - x
    - y
    - heading
    are converted to the ego frame; the remaining values (size, ...) are left
    untouched.

    The input array is modified in place and then returned.

    Args:
        agent_state (np.ndarray): shape (N, state_dim_static); static-object
            states whose first three columns are [x, y, heading].
        ego_pose (np.ndarray): shape (3,); [x_ego, y_ego, heading_ego], the
            current ego state in the world frame.

    Returns:
        np.ndarray: shape (N, state_dim_static); static-object states with
        position and orientation expressed relative to the ego.
    """
    # agent_global_poses: (N, 3) = [x, y, heading]
    agent_global_poses = agent_state[:, [0, 1, 2]]

    # transformed_poses: (N, 3) = [x_ego, y_ego, heading_ego]
    transformed_poses = _global_state_se2_array_to_local(
        agent_global_poses, ego_pose)

    # Update position/orientation (in place)
    agent_state[:, 0] = transformed_poses[:, 0]
    agent_state[:, 1] = transformed_poses[:, 1]
    agent_state[:, 2] = transformed_poses[:, 2]

    return agent_state

def convert_absolute_quantities_to_relative(
    agent_state: np.ndarray,  # (N, state_dim)
    ego_cur_pose_np: np.ndarray,  # (3,)
    agent_type: str = 'ego',
) -> np.ndarray:
    """Convert world-frame states into the ego-relative frame.

    Three cases are handled.

    1) agent_type == 'ego'
        - Input: the ego's past/current trajectory in the world frame, (num_frames, 10)
            - x, y, heading, vx, vy, width, length, (car, pedestrian, cyclist)
        - Output: the trajectory re-expressed relative to the ego, (num_frames, 11)
          (position/orientation/velocity are ego-relative; body size and the
          one-hot type are kept as they are)
        - Result shape: (N, original_dim + 1)
          · heading is split into cos and sin, which adds one column

    2) agent_type == 'agent'
        - Input: states of surrounding agents (vehicles/pedestrians/cyclists, ...)
          in the world frame
            - [track_id, vx, vy, heading, width, length, x, y]
        - Output: agent states in the ego-relative frame
          · only position/orientation/velocity change; the rest is kept
        - `agent_state` is modified in place and returned

    3) agent_type == 'static'
        - Input: states of static objects (signs, barriers, ...) in the world frame
          · [x, y, heading] + size etc.
        - Output: static-object states in the ego-relative frame
          · only position/orientation change

    Any other `agent_type` returns `agent_state` unchanged.

    Args:
        agent_state (np.ndarray): shape (N, state_dim); the states to convert.
            · ego mode: ego trajectory
            · agent mode: surrounding dynamic objects (agents)
            · static mode: static objects
        ego_cur_pose_np (np.ndarray): shape (3,); [x_ego, y_ego, heading_ego],
            the current ego state in the world frame.
        agent_type (str, optional):
            - 'ego'   : ego trajectory mode
            - 'agent' : dynamic agent mode
            - 'static': static object mode

    Returns:
        np.ndarray: the converted states.
            - 'ego' mode: shape (N, state_dim + 1)
                - x, y, cos, sin, vx, vy, width, length, (car, pedestrian, cyclist)
            - 'agent' mode: shape (N, state_dim) (modified in place)
                - [track_id, vx, vy, heading, width, length, x, y]
            - 'static' mode: shape (N, state_dim) (modified in place)
    """
    # ego_pose: (3,) = [x_ego, y_ego, heading_ego]
    ego_pose = _build_ego_pose_from_state(ego_cur_pose_np)

    if agent_type == 'ego':
        # (time_num, state_dim_ego+1=11)
        agent_state = _convert_ego_history_to_relative(agent_state, ego_pose)

    elif agent_type == 'agent':
        # (N, state_dim_agent)
        agent_state = _convert_agent_states_to_relative(agent_state, ego_pose)

    elif agent_type == 'static':
        # (N, state_dim_static)
        agent_state = _convert_static_states_to_relative(agent_state, ego_pose)

    return agent_state

=====================

2. Map coordinate transformation

=====================

def coordinates_to_local_frame(coords, anchor_state, precision=None):
    """
    Transform a set of [x, y] coordinates without heading to the given frame.
    :param coords: Coordinates to be transformed, in the form [x, y].
    :param anchor_state: The coordinate frame to transform to, in the form [x, y, heading].
    :param precision: The precision with which to allocate the intermediate array.
        If None, then it will be inferred from the input precisions.
    :return: Transformed coordinates.
    """
    if len(coords.shape) != 2 or coords.shape[1] != 2:
        raise ValueError(f"Unexpected coords shape: {coords.shape}")

    if precision is None:
        if coords.dtype != anchor_state.dtype:
            raise ValueError(
                "Mixed datatypes provided to coordinates_to_local_frame without precision specifier."
            )
        precision = coords.dtype

    # With 0-length inputs there are no coordinates to transform.
    if coords.shape[0] == 0:
        return coords

    # Extract the transform (anchor -> world) and invert it (world -> anchor),
    # using the requested intermediate precision.
    transform = _state_se2_array_to_transform_matrix(anchor_state)
    transform = np.linalg.inv(transform).astype(precision)
    coords = coords.astype(precision)

    # Transform the incoming coordinates to homogeneous coordinates
    #  So translation can be done with a simple matrix multiply.
    #
    # [x1, y1]  => [x1, y1, 1]
    # [x2, y2]     [x2, y2, 1]
    # ...          ...
    # [xn, yn]     [xn, yn, 1]
    coords = np.pad(coords,
                    pad_width=((0, 0), (0, 1)),
                    mode='constant',
                    constant_values=1.0)

    # Perform the transformation, transposing so the shapes match
    coords = np.matmul(transform, coords.T)

    # Transform back from homogeneous coordinates to standard coordinates.
    #   Get rid of the scaling dimension and transpose so output shape matches input shape.
    result = coords.T
    result = result[:, :2]

    return result

def vector_set_coordinates_to_local_frame(
    coords,
    avails,
    anchor_state,
    output_precision=np.float32,
):
    """
    Transform the vector set map element coordinates from global frame to ego vehicle frame,
    as specified by anchor_state.
    :param coords: Coordinates to transform, shape (num_map_elements, num_points_per_element, 2).
    :param avails: Availabilities mask identifying real vs zero-padded data in coords,
        shape (num_map_elements, num_points_per_element).
    :param anchor_state: The coordinate frame to transform to, in the form [x, y, heading].
    :param output_precision: The precision with which to allocate output array.
    :return: Transformed coordinates.
    :raise ValueError: If coordinates dimensions are not valid or don't match availabilities.
    """
    # Validate shapes, as promised by the docstring
    if coords.ndim != 3 or coords.shape[2] != 2:
        raise ValueError(f"Unexpected coords shape: {coords.shape}. Expected shape: (*, *, 2)")
    if coords.shape[:2] != avails.shape:
        raise ValueError(
            f"Mismatching shape between coords and availabilities: {coords.shape[:2]}, {avails.shape}"
        )

    # Flatten coords from (num_map_elements, num_points_per_element, 2) to
    #   (num_map_elements * num_points_per_element, 2) for easier processing.
    num_map_elements, num_points_per_element, _ = coords.shape
    coords = coords.reshape(num_map_elements * num_points_per_element, 2)

    # Apply transformation using adequate precision
    coords = coordinates_to_local_frame(coords,
                                        anchor_state,
                                        precision=np.float64)

    # Reshape to original dimensionality
    coords = coords.reshape(num_map_elements, num_points_per_element, 2)

    # Output with specified precision
    coords = coords.astype(output_precision)

    # ignore zero-padded data
    coords[~avails] = 0.0

    return coords
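
The flatten, transform, reshape, mask sequence for map points can be illustrated on a tiny input. The sketch below is standalone and makes its own assumptions: `points_to_local` is a hypothetical stand-in for the point transform, and the two "map elements" are invented values.

```python
import numpy as np


def points_to_local(points, anchor):
    """Transform (M, 2) world points into the frame of anchor = [x, y, heading]."""
    x, y, h = anchor
    c, s = np.cos(h), np.sin(h)
    world_from_anchor = np.array([[c, -s, x], [s, c, y], [0.0, 0.0, 1.0]])
    anchor_from_world = np.linalg.inv(world_from_anchor)
    # [x, y] -> [x, y, 1] so that translation becomes part of the matrix product
    homogeneous = np.pad(points, ((0, 0), (0, 1)), mode="constant", constant_values=1.0)
    return (anchor_from_world @ homogeneous.T).T[:, :2]


# Two map elements with two points each; the last point is zero padding.
coords = np.array([
    [[1.0, 0.0], [2.0, 0.0]],
    [[0.0, 3.0], [0.0, 0.0]],
])  # (num_elements=2, num_points=2, 2)
avails = np.array([[True, True], [True, False]])
anchor = np.array([1.0, 0.0, np.pi / 2])  # ego at (1, 0), facing world +y

flat = coords.reshape(-1, 2)
local = points_to_local(flat, anchor).reshape(coords.shape).astype(np.float32)
local[~avails] = 0.0  # padded points must stay zero, not become a transformed origin
print(np.round(local, 3))
```

The final masking step matters: without it, a zero-padded point would be mapped to the world origin as seen from the ego, which looks like real geometry to the model.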

=====================

3. Numpy-Tensor transformation

=====================

from typing import Any, Dict, Mapping, Optional, Sequence, Union

import numpy as np
import torch

def convert_data_dict_to_device_tensors(
    data: Mapping[str, Any],
    device: Union[torch.device, str],
    squeeze: bool,
) -> Dict[str, torch.Tensor]:
    """Convert a Python/NumPy dictionary into a dictionary of torch.Tensor model inputs.

    Summary
    -------
    - Value is already a torch.Tensor: not re-created, only .to(device, dtype)
      is called (bool stays bool, everything else becomes float32)
    - NumPy array: wrapped with from_numpy, then cast to float32 or kept as bool
    - Python scalar/list: created with torch.as_tensor(...)
    - None: skipped, so the key is absent from the output
    - squeeze=False: a leading batch dimension of size 1 is added (unsqueeze(0))
    - The tensor under key "agent_route_lane_order" is always cast to torch.int64

    Args:
        data:
            Key-value dictionary. Values may be torch.Tensor, np.ndarray,
            Python numbers, lists, and so on.
        device:
            Device to place the tensors on (e.g. "cuda:0", torch.device("cpu")).
        squeeze:
            If False, a batch dimension of size 1 is prepended to every tensor.

    Returns:
        Dict[str, torch.Tensor]:
            Dictionary with one torch.Tensor per (non-None) key.
    """
    out: Dict[str, torch.Tensor] = {}

    for k, v in data.items():
        # 1) Already a Tensor: do not re-create it, just call .to(...)
        if isinstance(v, torch.Tensor):
            target_dtype = torch.bool if v.dtype == torch.bool else torch.float32
            t = v.to(device=device, dtype=target_dtype, non_blocking=True)

        # 2) NumPy array: use from_numpy to minimize copies
        elif isinstance(v, np.ndarray):
            if v.dtype == np.bool_:
                # Keep the bool dtype and only move to the device
                t = torch.from_numpy(v).to(device=device, non_blocking=True)
                if t.dtype != torch.bool:
                    t = t.to(dtype=torch.bool)
            else:
                # Numeric arrays become float32
                t = torch.from_numpy(v).to(
                    device=device,
                    dtype=torch.float32,
                    non_blocking=True,
                )

        # 3) Python bool scalar
        elif isinstance(v, (bool, np.bool_)):
            t = torch.tensor(v, dtype=torch.bool, device=device)
        # 4) None values are skipped
        elif v is None:
            # ego_agent_next_11_dim # planner_future_11_dim
            continue
        # 5) Everything else (lists, scalars, ...): build in one go with as_tensor
        else:
            t = torch.as_tensor(v, dtype=torch.float32, device=device)

        if not squeeze:
            t = t.unsqueeze(0)

        if k == "agent_route_lane_order":
            t = t.to(torch.int64)

        out[k] = t

    return out

================

from typing import Generator, List, Tuple

import numpy as np
import numpy.typing as npt

from nuplan.common.actor_state.ego_state import EgoState
from nuplan.common.actor_state.state_representation import TimePoint
from nuplan.common.actor_state.vehicle_parameters import get_pacifica_parameters
from nuplan.common.geometry.convert import numpy_array_to_absolute_velocity
from nuplan.planning.scenario_builder.nuplan_db.nuplan_scenario import NuPlanScenario
from nuplan.planning.training.preprocessing.features.trajectory_utils import convert_absolute_to_relative_poses
from nuplan.planning.training.preprocessing.utils.agents_preprocessing import EgoInternalIndex

from diffusion_planner.data_process.utils import convert_absolute_quantities_to_relative

def get_ego_past_array_from_scenario(
    scenario: NuPlanScenario,
    num_past_poses: int,
    past_time_horizon: float,
) -> Tuple[np.ndarray, np.ndarray]:
    current_ego_state: EgoState = scenario.initial_ego_state

    past_ego_states: Generator[EgoState, None, None] = scenario.get_ego_past_trajectory(
        iteration=0,
        num_samples=num_past_poses,
        time_horizon=past_time_horizon)
    # Past states followed by the current state
    sampled_past_ego_states: List[EgoState] = list(past_ego_states) + [
        current_ego_state
    ]
    # past_cur_ego_array: (num_past_poses + 1, 10), e.g. (21, 10)
    #  x, y, theta, vx, vy, width, length, (car, pedestrian, cyclist)
    past_cur_ego_array = sampled_past_ego_states_to_array(
        sampled_past_ego_states)

    past_time_stamps: List[TimePoint] = list(
        scenario.get_past_timestamps(
            iteration=0,
            num_samples=num_past_poses,
            time_horizon=past_time_horizon)) + [scenario.start_time]

    def sampled_past_timestamps_to_array(
            past_time_stamps: List[TimePoint]) -> npt.NDArray[np.int64]:
        # Timestamps in microseconds
        flat: List[int] = [t.time_us for t in past_time_stamps]
        return np.array(flat, dtype=np.int64)  # shape: (21,)

    # past_time_stamps_array: (21,)
    past_time_stamps_array = sampled_past_timestamps_to_array(past_time_stamps)

    return past_cur_ego_array, past_time_stamps_array

def sampled_past_ego_states_to_array(
        past_ego_states: List[EgoState]) -> npt.NDArray[np.float64]:  # (21, 10)
    # This function already existed in the original code.
    past_cur_num = len(past_ego_states)
    past_cur_ego_array = np.zeros((past_cur_num, 10), dtype=np.float64)
    for time_i in range(0, past_cur_num, 1):
        past_cur_ego_array[
            time_i, EgoInternalIndex.x()] = past_ego_states[time_i].center.x
        past_cur_ego_array[
            time_i, EgoInternalIndex.y()] = past_ego_states[time_i].center.y
        heading_ = past_ego_states[time_i].center.heading
        past_cur_ego_array[time_i, EgoInternalIndex.heading()] = heading_
        # --- ego frame -> world frame velocity: rotation only ---
        v_local = past_ego_states[
            time_i].dynamic_car_state.center_velocity_2d  # body-frame velocity
        he = float(heading_)
        c, s = np.cos(he), np.sin(he)
        vx_w = c * float(v_local.x) - s * float(v_local.y)
        vy_w = s * float(v_local.x) + c * float(v_local.y)
        past_cur_ego_array[time_i, EgoInternalIndex.vx()] = vx_w
        past_cur_ego_array[time_i, EgoInternalIndex.vy()] = vy_w
        # past_cur_ego_array[time_i, EgoInternalIndex.ax(
        # )] = past_ego_states[time_i].dynamic_car_state.rear_axle_acceleration_2d.x
        # past_cur_ego_array[time_i, EgoInternalIndex.ay(
        # )] = past_ego_states[time_i].dynamic_car_state.rear_axle_acceleration_2d.y

        # The ax/ay slots are reused to hold width/length instead of acceleration.
        past_cur_ego_array[
            time_i,
            EgoInternalIndex.ax()] = past_ego_states[time_i].car_footprint.width
        past_cur_ego_array[time_i, EgoInternalIndex.ay(
        )] = past_ego_states[time_i].car_footprint.length
        past_cur_ego_array[time_i, 7:10] = [
            1, 0, 0
        ]  # one-hot encoding for agent type (car, pedestrian, cyclist)

    return past_cur_ego_array

def sampled_future_ego_states_to_array(
        future_ego_states: List[EgoState]) -> npt.NDArray[np.float64]:
    """Convert a list of future ego states into a 10-column array in the world frame.

    Packs the future ego states (EgoState) taken from the scenario into one
    numeric array. It holds the raw world-frame values before another
    function converts them to the ego frame.

    Each time step stores:

    - position : x, y              (world frame)
    - orientation : heading        (radians, world frame)
    - velocity : vx, vy            (world frame, obtained by rotating the ego body-frame velocity)
    - body size : width, length
    - one-hot type : [1, 0, 0]     (always assumed to be a car: car=1, ped=0, bike=0)

    Args:
        future_ego_states (List[EgoState]): length future_len; each element is
            the ego state (EgoState) at one time step.

    Returns:
        np.ndarray:
            - fut_ego_world_10
            - shape: (future_len, 10)
            - column order:
                [x, y, heading, vx, vy, width, length, car, pedestrian, cyclist]
            - dtype: float64
    """
    future_len: int = len(future_ego_states)
    # fut_ego_world_10: (future_len, 10)
    fut_ego_world_10 = np.zeros((future_len, 10), dtype=np.float64)

    for time_i in range(future_len):
        # Position (x, y)
        fut_ego_world_10[
            time_i, EgoInternalIndex.x()] = future_ego_states[time_i].center.x
        fut_ego_world_10[
            time_i, EgoInternalIndex.y()] = future_ego_states[time_i].center.y

        # Heading (world frame)
        fut_ego_world_10[time_i, EgoInternalIndex.heading(
        )] = future_ego_states[time_i].center.heading

        # --- ego frame -> world frame velocity: rotation only ---
        # v_local: ego body-frame velocity (vx_body, vy_body)
        v_local = future_ego_states[
            time_i].dynamic_car_state.center_velocity_2d  # body-frame velocity
        he = float(future_ego_states[time_i].center.heading)
        c, s = np.cos(he), np.sin(he)
        vx_w = c * float(v_local.x) - s * float(v_local.y)
        vy_w = s * float(v_local.x) + c * float(v_local.y)
        fut_ego_world_10[time_i, EgoInternalIndex.vx()] = vx_w
        fut_ego_world_10[time_i, EgoInternalIndex.vy()] = vy_w

        # Body size (width, length), stored in the ax/ay slots
        fut_ego_world_10[time_i, EgoInternalIndex.ax(
        )] = future_ego_states[time_i].car_footprint.width
        fut_ego_world_10[time_i, EgoInternalIndex.ay(
        )] = future_ego_states[time_i].car_footprint.length

        # One-hot type (car, pedestrian, cyclist) = (1, 0, 0)
        fut_ego_world_10[time_i, 7:10] = [1, 0, 0]

    return fut_ego_world_10

def get_ego_future_array_from_scenario(
    scenario: NuPlanScenario,
    current_ego_state: EgoState,
    num_future_poses: int,
    future_time_horizon: float,
) -> Tuple[npt.NDArray[np.float32], npt.NDArray[np.float32]]:
    """Fetch the ego's future trajectory from the scenario, convert it to the ego frame,
    and return it in two forms: (T, 3) and (T, 11).

    Overall flow
    ------------
    1) From the nuPlan scenario, fetch the future ego states that match
       `num_future_poses` and `future_time_horizon`.
       - future_ego_states: List[EgoState], length T

    2) Use `sampled_future_ego_states_to_array` to turn each time step into a
       10-column world-frame row.
       - fut_ego_world_10: shape (T, 10)
         · [x, y, heading, vx, vy, width, length, one-hot(3)]

    3) Turn the current ego pose (current_ego_state.rear_axle) into a vector
       [x_ego, y_ego, yaw_ego].
       - ego_cur_pose_np: shape (3,)

    4) Call `convert_absolute_quantities_to_relative(..., 'ego')` to convert the
       10-column world-frame array to the ego frame, splitting heading into
       cos and sin, which adds one column.
       - fut_ego_local_11: shape (T, 11), dtype float32
         · [x, y, cos(yaw), sin(yaw), vx, vy, width, length, one-hot(3)]

    5) Recover heading from the cos/sin columns with arctan2 and also return
       a simple (T, 3) = [x, y, heading] trajectory.
       - fut_ego_local_xyh: shape (T, 3)

    Args:
        scenario (NuPlanScenario):
            nuPlan scenario object.
        current_ego_state (EgoState):
            Current ego state (usually initial_ego_state or the simulator's
            current state).
        num_future_poses (int):
            Number of future time steps to sample (the value T).
        future_time_horizon (float):
            How far into the future to cover, in seconds.

    Returns:
        Tuple[np.ndarray, np.ndarray]:
            - fut_ego_local_xyh:
                · shape: (T, 3)
                · each row: [x_ego, y_ego, heading_ego] (ego frame)
                · dtype: float32
            - fut_ego_local_11:
                · shape: (T, 11)
                · each row:
                    [x, y, cos(yaw), sin(yaw), vx, vy,
                     width, length, onehot_car, onehot_ped, onehot_bike]
                · dtype: float32
    """
    # future_ego_states: List[EgoState], length T
    future_ego_states = scenario.get_ego_future_trajectory(
        iteration=0,
        num_samples=num_future_poses,
        time_horizon=future_time_horizon)

    # fut_ego_world_10: (T, 10)
    fut_ego_world_10 = sampled_future_ego_states_to_array(
        list(future_ego_states))
    # ego_cur_pose_np: (3,) = [x_ego, y_ego, yaw_ego] (world frame)
    # NOTE: the future poses above use `center`, while this anchor uses `rear_axle`.
    ego_cur_pose_np = np.array(
        [
            current_ego_state.rear_axle.x,
            current_ego_state.rear_axle.y,
            current_ego_state.rear_axle.heading,
        ],
        dtype=np.float64,
    )

    # fut_ego_local_11: (T, 11), relative conversion in 'ego' mode, then float32
    fut_ego_local_11 = convert_absolute_quantities_to_relative(
        fut_ego_world_10, ego_cur_pose_np, 'ego').astype(np.float32)
    # fut_ego_local_xy: (T, 2), just x and y
    fut_ego_local_xy = fut_ego_local_11[:, :2]
    fut_ego_local_cos_yaw = fut_ego_local_11[:, 2]
    fut_ego_local_sin_yaw = fut_ego_local_11[:, 3]

    # fut_ego_local_heading: (T,), heading recovered from (sin, cos)
    fut_ego_local_heading = np.arctan2(fut_ego_local_sin_yaw,
                                       fut_ego_local_cos_yaw)

    # fut_ego_local_xyh: (T, 3) = [x, y, heading]
    fut_ego_local_xyh = np.concatenate(
        [fut_ego_local_xy, fut_ego_local_heading[:, None]], axis=-1)
    return fut_ego_local_xyh, fut_ego_local_11

def calculate_additional_ego_states(ego_agent_past, time_stamp):
    # ego_agent_past: (N, 7) where N is the number of past states.
    #   7: x, y, heading, vx, vy, width, length
    # time_stamp: timestamps in microseconds (hence the 1e-6 factor below).
    # Transforms heading to (cos h, sin h) and calculates the steering_angle
    # and yaw_rate for the current state.

    current_state = ego_agent_past[-1]
    prev_state = ego_agent_past[-2]

    dt = (time_stamp[-1] - time_stamp[-2]) * 1e-6

    cur_velocity = current_state[3]
    angle_diff = current_state[2] - prev_state[2]
    # Wrap the heading difference into [-pi, pi)
    angle_diff = (angle_diff + np.pi) % (2 * np.pi) - np.pi
    yaw_rate = angle_diff / dt

    if abs(cur_velocity) < 0.2:
        steering_angle = 0.0
        yaw_rate = 0.0  # if the car is almost stopped, the yaw rate is unreliable
    else:
        steering_angle = np.arctan(
            yaw_rate * get_pacifica_parameters().wheel_base / abs(cur_velocity))
        steering_angle = np.clip(steering_angle, -2 / 3 * np.pi, 2 / 3 * np.pi)
        yaw_rate = np.clip(yaw_rate, -0.95, 0.95)

    # current: (ego_agent_past.shape[1] + 3,)
    #   heading is split into cos/sin (+1), and steering_angle and yaw_rate
    #   are appended (+2). For a 7-column input this gives
    #   [x, y, cos h, sin h, vx, vy, width, length, steering_angle, yaw_rate].
    current = np.zeros((ego_agent_past.shape[1] + 3), dtype=np.float32)
    current[:2] = current_state[:2]
    current[2] = np.cos(current_state[2])
    current[3] = np.sin(current_state[2])
    current[4:8] = current_state[3:7]
    current[8] = steering_angle
    current[9] = yaw_rate

    return current
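
The yaw-rate and steering computation above follows the kinematic bicycle relation, steering = arctan(yaw_rate × wheel_base / speed). The sketch below isolates that step so the angle wrapping is easy to see. It is an illustration only: `WHEEL_BASE_M` is an assumed placeholder rather than a value read from `get_pacifica_parameters()`, and `yaw_rate_and_steering` is a hypothetical helper name.

```python
import numpy as np

WHEEL_BASE_M = 3.089  # assumed wheel base in meters, for illustration only


def wrap_angle(angle):
    """Wrap an angle difference into [-pi, pi)."""
    return (angle + np.pi) % (2 * np.pi) - np.pi


def yaw_rate_and_steering(heading_prev, heading_cur, speed, dt,
                          wheel_base=WHEEL_BASE_M, min_speed=0.2):
    """Finite-difference yaw rate plus bicycle-model steering angle."""
    if abs(speed) < min_speed:
        # Nearly stopped: the heading difference is dominated by noise.
        return 0.0, 0.0
    yaw_rate = wrap_angle(heading_cur - heading_prev) / dt
    steering = np.arctan(yaw_rate * wheel_base / abs(speed))
    steering = np.clip(steering, -2 / 3 * np.pi, 2 / 3 * np.pi)
    yaw_rate = np.clip(yaw_rate, -0.95, 0.95)
    return float(yaw_rate), float(steering)


# Heading crosses the +pi / -pi seam: the raw difference is about -6.26 rad,
# but the vehicle actually turned by only about 0.023 rad.
yaw_rate, steering = yaw_rate_and_steering(3.13, -3.13, speed=5.0, dt=0.1)
stopped = yaw_rate_and_steering(0.0, 0.3, speed=0.05, dt=0.1)
print(yaw_rate, steering, stopped)
```

Without the wrap, the seam crossing would produce a yaw rate of roughly -62 rad/s, which the later clip would silently turn into a saturated but wrong value.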

time embedding

Tue, 12 Aug 2025 04:47:23 GMT

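Post 1

This note explains, in one pass, what **"multi-frequency continuous-time embedding + $dt$ parameterization"** does, why it helps, how it is defined mathematically, and how to choose its hyperparameters. (Technical terms are followed by a plain-language gloss.)

1) Purpose: describe "time" to the model as a well-designed vector of numbers

For a model to use time information well, it helps to give it continuous values in seconds together with temporal patterns at several scales (short rhythms and long rhythms), rather than only a step index (0, 1, 2, ...). To that end, **a single time $t$ (in seconds)** is converted into a vector.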

2) 핵심 수식 — 시간 $t$를 벡터로 바꾸는 방법

각 시각 $t$에 대해 아래 **임베딩(벡터화)**를 만듭니다.

$$ \phi(t) = \Big[,t,\ \sin(2\pi f_1 t),\ \cos(2\pi f_1 t),\ \ldots,\ \sin(2\pi f_K t),\ \cos(2\pi f_K t),\Big] \in \mathbb{R}^{,1+2K}. $$

$t$: 초 단위의 실제 시간값(연속 시간성 유지).
$\sin, \cos$: 주기 성분(리듬)을 표현.
$f_k$: 여러 주파수(여러 길이의 리듬), 보통 로그 간격으로 잡습니다:

$$ f_k ;=; f_{\min}\cdot r^{,k-1}\quad (k=1,\ldots,K). $$
- $f_{\min}$: 가장 느린 리듬(긴 패턴),
- $r>1$: 간격 배수(예: $r{=}2$면 두 배씩 증가),
- $K$: 사용하는 주파수 개수.

이렇게 하면 짧은 변화(급가속/급정지 등)부터 긴 흐름(여러 초에 걸친 추세)까지 한 번에 표현됩니다.

3) 왜 $\sin$과 $\cos$을 둘 다 쓰나? (위상 정보 보존)

$\sin(\cdot)$만 쓰면 “위상(phase, 파형의 시작점)”을 구분하기 어렵습니다. $\sin$과 $\cos$을 쌍으로 쓰면

$$ \sin(a)\sin(b) + \cos(a)\cos(b) = \cos(a-b) $$

가 되어, 두 시각 $t,s$의 차이에 대한 정보를 깔끔히 담습니다. 점들끼리의 유사도(내적)를 보면

$$ \langle \phi(t), \phi(s) \rangle = t,s ;+; \sum_{k=1}^K \cos!\big(2\pi f_k (t-s)\big). $$

즉, 임베딩 간 유사도는 **시간 차이 $(t-s)$**에 주로 의존합니다. 이건 모델이 “시간이 가까우면 비슷하고, 멀면 다르게” 보게 돕습니다.
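The inner-product identity above can be checked numerically. The sketch below is illustrative only: the helper names and the frequency list ($f_{\min}=0.125$ Hz, $r=2$, $K=6$) are assumptions, and the raw $t$ channel is left out so that only the sinusoidal part is compared.

```python
import math

def sincos_features(t, freqs):
    """Return [sin(2*pi*f*t), cos(2*pi*f*t)] for each f (raw t channel excluded)."""
    out = []
    for f in freqs:
        out.append(math.sin(2 * math.pi * f * t))
        out.append(math.cos(2 * math.pi * f * t))
    return out

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

# Illustrative frequencies: f_min = 0.125 Hz, r = 2, K = 6.
freqs = [0.125 * 2 ** k for k in range(6)]

# Two pairs of times with the same gap (0.7 s) at different absolute positions.
sim_a = dot(sincos_features(1.0, freqs), sincos_features(0.3, freqs))
sim_b = dot(sincos_features(5.0, freqs), sincos_features(4.3, freqs))

# Closed form from the identity: sum_k cos(2*pi*f_k*(t - s)).
expected = sum(math.cos(2 * math.pi * f * 0.7) for f in freqs)
```

Both pairs give the same similarity, which also matches the closed form, so the sinusoidal part sees only the gap between the two times.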

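4) Why does "continuous time + $dt$ parameterization" matter?

**$dt$** is the sampling interval in seconds. Whether the series is sampled at 10 Hz or 20 Hz, time is fed in seconds:

$$ t_n = n\cdot dt. $$
The earlier approach of normalizing the step index to 0–1 changes meaning whenever the sampling rate changes. (Example: the same 2 seconds is 20 steps at 10 Hz but 40 steps at 20 Hz, so the embedding differs.)
The current approach uses absolute time in seconds, so the same physical time gives the same embedding even when $dt$ differs. → This helps generalization and domain transfer.

5) Building the past/future time axes (matches the code)

For $V$ past steps (e.g. 21 steps at 0.1 s spacing):

$$ \{t_v\}_{v=1}^{V} \;=\; \{-(V-1)\,dt,\ \ldots,\ -2\,dt,\ -dt,\ 0\}. $$

(e.g. $-2.0, -1.9, \ldots, 0.0$ s)
For a future plan of $F$ steps (e.g. 80 steps):

$$ \{t_f\}_{f=1}^{F} \;=\; \{dt,\ 2\,dt,\ \ldots,\ F\,dt\}. $$

(e.g. $0.1, 0.2, \ldots, 8.0$ s)

Compute $\phi(t)$ for each $t$ and append it to the input. This corresponds to make_time_feat(times_s) in the code.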
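As a rough illustration of sections 2 and 5, the sketch below builds the two time axes and the $1+2K$ feature vector. It is not the actual make_time_feat: the function names `log_spaced_freqs`, `multi_freq_time_feat`, `past_times`, and `future_times` are hypothetical, and plain Python lists stand in for tensors.

```python
import math

def log_spaced_freqs(f_min, r, K):
    """f_k = f_min * r**(k-1) for k = 1..K."""
    return [f_min * r ** k for k in range(K)]

def multi_freq_time_feat(t, freqs):
    """phi(t) = [t, sin(2*pi*f_1*t), cos(2*pi*f_1*t), ..., sin(2*pi*f_K*t), cos(2*pi*f_K*t)]."""
    feat = [t]
    for f in freqs:
        feat.append(math.sin(2 * math.pi * f * t))
        feat.append(math.cos(2 * math.pi * f * t))
    return feat

def past_times(V, dt):
    """{-(V-1)*dt, ..., -dt, 0} in seconds."""
    return [(v - (V - 1)) * dt for v in range(V)]

def future_times(F, dt):
    """{dt, 2*dt, ..., F*dt} in seconds."""
    return [(f + 1) * dt for f in range(F)]

freqs = log_spaced_freqs(f_min=0.125, r=2.0, K=6)
past = past_times(V=21, dt=0.1)      # -2.0 ... 0.0 s
future = future_times(F=80, dt=0.1)  # 0.1 ... 8.0 s
past_feats = [multi_freq_time_feat(t, freqs) for t in past]
future_feats = [multi_freq_time_feat(t, freqs) for t in future]
```

Because both axes are in seconds, the same `multi_freq_time_feat` serves past and future tokens; only the list of times changes.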

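6) Hyperparameter selection guide (important in practice)

Horizon $T$: total future length in seconds, e.g. $T=8$ s.
$f_{\min}$: values near $1/T$ or $1/(2T)$ are usually safe.
**Maximum frequency $f_{\max}$**: exceeding the Nyquist limit (the physical limit set by the sampling rate) causes **aliasing** (distinct frequencies become indistinguishable).

$$ f_{\max} \;\le\; \frac{1}{2\,dt}. $$
With log spacing, $f_{\max} = f_{\min}\, r^{K-1}$, so

$$ K \;\le\; 1 + \log_{r}\!\Big(\frac{1}{2\,dt\,f_{\min}}\Big). $$

Example (dt=0.1s, T=8s)

Nyquist limit: $1/(2dt)=5\,\text{Hz}$.
With $f_{\min}=1/T=0.125\,\text{Hz}$ and $r=2$: $0.125\cdot 2^{K-1} \le 5 \Rightarrow 2^{K-1}\le 40 \Rightarrow K \le 6$. Recommended: $K=5$ to $6$ (a larger $K$ costs more than it gains).

Summary: match **$f_{\min}$** to the horizon, and choose **$K$** so that $f_{\max}$ stays at or below $\frac{1}{2dt}$, either computed automatically or with a safety margin.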
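The bound on $K$ can be turned into a small helper. This is a minimal sketch, assuming log-spaced frequencies as defined above; the name `nyquist_safe_k` is hypothetical. It counts frequencies in a loop instead of evaluating the logarithm, to avoid floating-point edge cases at the boundary.

```python
def nyquist_safe_k(dt, f_min, r=2.0):
    """Largest K such that f_min * r**(K-1) <= 1 / (2 * dt)."""
    if dt <= 0 or f_min <= 0 or r <= 1:
        raise ValueError("need dt > 0, f_min > 0, r > 1")
    f_nyquist = 1.0 / (2.0 * dt)
    if f_min > f_nyquist:
        return 0  # even the slowest frequency would alias
    K = 1
    # Keep adding frequencies while the next one still fits under the limit.
    while f_min * r ** K <= f_nyquist * (1 + 1e-9):
        K += 1
    return K

k_10hz = nyquist_safe_k(dt=0.1, f_min=0.125, r=2.0)   # matches the worked example
k_20hz = nyquist_safe_k(dt=0.05, f_min=0.125, r=2.0)  # finer sampling allows one more
```

For the worked example ($dt=0.1$ s, $f_{\min}=0.125$ Hz, $r=2$) this returns 6, with $f_{\max}=4$ Hz, below the 5 Hz limit.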

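7) Single frequency vs multi-frequency: why is it better?

Single frequency (previous): only one rhythm is available, so short and long changes are hard to represent at the same time.
Multi-frequency: short, medium, and long rhythms are provided together → the MLP/attention can more easily learn the combination that fits the situation.
This helps convergence speed and generalization, and the meaning is preserved when the sampling period changes.

8) What effect does it have on the model? (points for interpretation)

Temporal proximity: $\langle \phi(t),\phi(s)\rangle$ contains $\sum_k \cos(2\pi f_k(t-s))$, so nearby times generally look more similar and distant times less similar.
Smooth variation: $\sin/\cos$ are differentiable and smooth, which keeps training stable.
Past/future separation: past times have negative $t$ and future times positive $t$, so the change across the **boundary (0 s)** is distinct.
Easy to combine: fed alongside other inputs such as position and velocity, the features are easy for an MLP to mix.

9) Compute/memory cost

The extra dimension per time step is $1+2K$. With $K=6$ that is only 13 extra dimensions → very small memory/compute overhead.
The code already grows in_ch automatically, so this is a drop-in replacement.

10) Frequently asked questions (FAQ)

Q1. Why put the raw $t$ in the first channel? A. With $\sin/\cos$ alone it is hard to express the absolute flow of time (increasing/decreasing) directly. Adding $t$ brings in monotonic information such as "the value grows toward the future".

Q2. What if the sequence length (number of steps) changes? A. $dt$ and $t$ are in seconds, so the same physical time keeps the same embedding pattern even when the length differs.

Q3. What if $f_{\max}$ exceeds Nyquist? A. Different times can map to the same embedding (aliasing), which can destabilize training. Always keep $f_{\max}\le 1/(2dt)$, or set it lower with a margin.

11) One-line summary

Map time $t$ to $\phi(t) = [t, \sin(2\pi f_k t), \cos(2\pi f_k t)]$,
use $t$ in seconds and several frequencies $f_k$,
and limit the maximum frequency to $1/(2dt)$ or below;
the result is a robust time representation that tolerates sampling changes and captures short and long patterns together.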


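Note 2

Below is a step-by-step explanation, with formulas and plain language, of the single time embedder (TimeEmbed) logic pasted earlier. Jargon is kept to a minimum, and the concepts that are really needed are glossed where they appear.

1) Goal: turn "time" into a vector the model can read at a glance

The inputs mix a past interval $[-2.0\text{s},\dots,0]$ and a future interval $[0.1\text{s},\dots,8.0\text{s}]$. For the model to easily sense "which second this value belongs to" and "how close two time points are", each time $t$ must be turned into an information-rich vector and attached to its token. The time embedder (TimeEmbed) does this.

2) Core idea (gist)

A single scalar time $t$ is expanded into several sine/cosine values → several temporal patterns (slow and fast changes) are captured at once.
The normalized time $\frac{t}{T_{\max}}$ and the sign $\mathrm{sign}(t)$ are appended as cues for "position within the whole interval" and "past vs future".

One-line summary as a formula:

$$ \boxed{ \phi(t)=\big[\,\sin(\omega_0 t),\cos(\omega_0 t),\ \dots,\ \sin(\omega_{K-1} t),\cos(\omega_{K-1} t),\ \tfrac{t}{T_{\max}},\ \mathrm{sign}(t)\,\big] } $$

Here $\omega_k$ is the (angular) frequency, i.e. how fast the sinusoid oscillates, and $K$ is the number of frequencies.

3) Formulas that map 1:1 to the code

time_embed_seconds(t, d_fourier=12, Tmax=8.0) in the code computes the following:

Frequency selection (slow → fast, log scale)

$$ \omega_k \;=\; 2^{k}\cdot \frac{\pi}{T_{\max}}, \qquad k=0,1,\dots,K{-}1 $$

Why? Multiplying by $2^k$ covers waveforms from very slow to very fast evenly. (Slow change = long-term trend, fast change = abrupt event.)

Sine/cosine expansion (snapshots of time from several angles)

$$ \big[\ \sin(\omega_k t),\ \cos(\omega_k t)\ \big]_{k=0}^{K-1} $$

Sines and cosines are repeating patterns, so using a range of $\omega_k$ is like observing "how things oscillate over time" at several scales.

Normalized time and sign (coarse hints)

$$ \left[\,\frac{t}{T_{\max}},\ \mathrm{sign}(t)\,\right] $$

$\tfrac{t}{T_{\max}}$: position on the time axis normalized by the 8 s horizon ($-0.25$ at $t=-2$ s, $0$ at the present, $1$ at $t=8$ s).
$\mathrm{sign}(t)$: a one-value hint that immediately separates past (-), present (0), and **future (+)**.

The final vector dimension is $\underbrace{2K}_{\sin,\cos} + \underbrace{2}_{t/T_{\max},\ \text{sign}} = 2K+2$.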
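The three pieces above can be assembled as follows. This is a reconstruction from the formulas, not the original time_embed_seconds: the name `time_embed_seconds_sketch` is hypothetical, it handles one scalar time and returns a plain list, and it assumes $K$ equals `d_fourier` as the note states.

```python
import math

def time_embed_seconds_sketch(t, d_fourier=12, Tmax=8.0):
    """Return [sin(w_k t), cos(w_k t)] for k = 0..K-1, then t/Tmax and sign(t)."""
    K = d_fourier  # assumption: d_fourier counts frequencies, not channels
    feat = []
    for k in range(K):
        omega = (2 ** k) * math.pi / Tmax  # w_k = 2^k * pi / Tmax
        feat.append(math.sin(omega * t))
        feat.append(math.cos(omega * t))
    feat.append(t / Tmax)                   # coarse position on the time axis
    feat.append(float((t > 0) - (t < 0)))   # sign(t): -1 past, 0 present, +1 future
    return feat

emb_past = time_embed_seconds_sketch(-2.0)
emb_now = time_embed_seconds_sketch(0.0)
emb_future = time_embed_seconds_sketch(8.0)
```

With the default `d_fourier=12` each embedding has $2K+2=26$ entries, and the last two entries separate past, present, and future without relying on the periodic channels.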

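4) Why sines/cosines at several frequencies? (intuition)

Slow waveforms (small $\omega$): sensitive to long-term trends (e.g. gradual acceleration or a lane shift over 8 s)
Fast waveforms (large $\omega$): sensitive to abrupt short-term changes (e.g. hard braking or merge timing around 0.5 s)
Using several $\omega_k$ together lets the model learn while looking at several time scales.

There is also a mathematical advantage:

$$ \cos(\omega (t-\tau))=\cos(\omega t)\cos(\omega\tau)+\sin(\omega t)\sin(\omega\tau). $$

That is, the inner product of $\big[\sin(\omega t),\cos(\omega t)\big]$ and $\big[\sin(\omega \tau),\cos(\omega \tau)\big]$ reflects the time difference between $t$ and $\tau$. Attention scores (inner products) therefore tend to come out higher between nearby times, a helpful inductive bias for learning.

5) Why add $\tfrac{t}{T_{\max}}$ and $\mathrm{sign}(t)$?

$\tfrac{t}{T_{\max}}$ is a monotonically increasing linear scale. It helps the model quickly get a sense of the overall flow, "start → middle → end".
$\mathrm{sign}(t)$ separates past from future with a single real value. Sines and cosines alone can do this in principle (though periodicity can make it ambiguous), but this one extra value lets the model pick up the direction (past/future) stably early in training.

6) Why does "past/future consistency" matter?

The earlier code built a separate time scale in each module (the past normalized $[-2,0]$ on its own, the future normalized $[0.1,8.0]$ again, and so on). The same "1 second difference" then meant different numbers in different modules, which could confuse the model.

Now absolute time in seconds is used as-is, and a single embedder represents both past (-) and future (+). As a result:

Wherever it is used, one second is perceived as the same one second (consistency).
The same function handles any token length, whether the future is 80 steps or 1 step (scalability).

7) Where is this embedding actually attached?

(a) Past neighbor/ego encoder: `AgentFusionEncoder`

Past times $t\in[-2.0,\,0.0]$ are built in seconds, $\phi(t)$ is computed, and it is appended after each frame's $[x,y,\cos,\sin]$ as extra token channels.
A small MLP (linear + activation) then projects the result into the network's space, and a token Mixer mixes and summarizes it.

(b) Future plan encoder: `EgoFutureEncoder`

For future times $t\in[0.1,\,8.0]$, $\phi(t)$ is built in the same way and attached to each future step token (these tokens are kept per step, without pooling).

In short, every time-series token carries the same style of time tag.

8) Practical benefits of this design

Alignment (time matching)↑ Embeddings of similar times look similar, so attention/MLP easily learns which token matches which.
Consistency↑ The same math is applied regardless of past/future/module → behavior stays predictable when module combinations or hyperparameters change.
Generalization↑ When the sequence length $F$ changes, $\phi(t)$ only looks at seconds and can be reused unchanged.
Faster early training With low-frequency hints such as $\tfrac{t}{T_{\max}}$ and $\mathrm{sign}(t)$, the model picks up coarse patterns first and, as training deepens, the fine sine/cosine patterns.

9) Mathematical background (light reading)

A reasonably well-behaved function of time on a bounded interval (trajectory attributes, interaction strength, etc.) can be approximated closely by a sum of enough sines and cosines; this is the basic fact behind Fourier (trigonometric) series. $\Rightarrow$ Even a small MLP that takes $\phi(t)$ as input can approximate a variety of temporal patterns.
Thanks to the identity above,

$$ \cos(\omega (t-\tau))=\cos(\omega t)\cos(\omega\tau)+\sin(\omega t)\sin(\omega\tau) $$

the difference between two times $t,\tau$ is tied directly to the inner product of their vectors (i.e. the attention score). $\Rightarrow$ "Attending to nearby times" tends to emerge naturally.

10) Hyperparameters (how to choose them simply)

$K(=\texttt{d\_fourier})$: 12–16 recommended. Too small is too insensitive (fine timing is missed); too large adds noise and compute. Note that with $\omega_k = 2^{k}\pi/T_{\max}$, $T_{\max}=8$ s, and $dt=0.1$ s, the components with $k\ge 7$ have frequency $\omega_k/(2\pi) > 5$ Hz, above the Nyquist limit discussed in Note 1, so values this large are worth checking for aliasing.
$T_{\max}$: matching the prediction horizon (currently 8 s) is enough. Increase it only when handling a longer horizon.
$dt$: use the data's actual sampling interval (e.g. 0.1 s) as-is.

11) Final one-line summary

The time embedder turns a time into a rich vector via

$$ \phi(t)=\big[\sin(\omega_k t),\cos(\omega_k t)\big]_{k=0}^{K-1}\ \oplus\ \big[\tfrac{t}{T_{\max}},\mathrm{sign}(t)\big] $$

Unifying past and future in the same coordinate system (seconds) lets the model learn the relations between time points (which one lines up with which) easily and stably.
The implementation is simple, yet it improves alignment, consistency, and generalization together.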


[metadrive] DiffusionPlannerEnv

Sat, 28 Jun 2025 04:44:23 GMT

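- To do
Termination conditions
- Existing: arrival at destination / max steps exceeded / leaving the lane / collision (vehicle, object, sidewalk) / collision with a human road user, etc.
- To add: terminate when the max step count is exceeded (20 s?)
Run N scenarios, then exit
Later: DiffusionPlannerEnv should be split for the following two purposes
- RL training
- evaluation
Things to find out
- It is still unclear which scenarios appear and by what criteria
  - The following are applied at least
    - varying lane count, lane width, and traffic density
    - varying speed limits
    - varying accident probability
  - Various road layouts appear; how they are generated needs checking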

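Big picture

1. Overall role

DiffusionPlannerEnv is an environment for "diffusion planning" research, built on the MetaDrive simulator.
Put simply, it lets an autonomous vehicle predict and control the future motion of other vehicles at the same time,
- with the aim of providing a specialized test bed where the ego's own path can be planned and verified

2. Components and flow

Environment configuration (config)
- Three layers are merged into one final config: base engine settings (BaseEnv) → MetaDrive default settings → DiffusionPlanner-specific settings.
- The DiffusionPlanner-specific settings
  - randomize lane width and lane count
  - set per-segment maximum and minimum speed limits
  - set accident probability, observation range, traffic density, and so on separately; these scenario variations are switched on so that generalization can be tested.
Map and traffic management
- SpeedLimitPGMapManager builds block-based roads and assigns a speed limit to each block.
- DiffusionTrafficManager creates and manages NPC vehicles. When needed, TrafficObjectManager adds accidents/obstacles to create unexpected situations.
Observation
- Uses the dedicated sensor package provided by the DiffusionPlannerObservation class, e.g. vehicle and lane positions and past trajectories of surrounding vehicles → processed into a form the diffusion model can consume directly.
Policy
- The default is LQR (linear-quadratic regulator), which drives a safe demo.
- Swapping in a Diffusion Planner written by the researcher allows training and evaluation right away.
External NPC control interface
- With set_external_npc_actions(), an external model can inject future controls for many vehicles:
  - shape: (number of NPCs, prediction length, 4-D control) (the 4 dimensions are typically position, speed, and heading, e.g. $[x, y, v, \psi]$)
  - rows filled with zeros are ignored, so only the required NPCs are modified.
Engine shortcuts
- r reset, p pause, b top-view toggle, etc. support fast iteration and visualization during experiments.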
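The "zero rows are ignored" convention of the external NPC interface can be sketched as below. This does not call the real set_external_npc_actions(), whose exact signature is not shown in these notes; the helper name `select_active_npc_rows` and the nested-list layout (NPC, step, 4 values) are assumptions for illustration.

```python
def select_active_npc_rows(actions):
    """Return indices of NPCs whose (horizon, 4) plan contains any non-zero value.

    actions: nested list shaped (num_npc, horizon, 4), e.g. [x, y, v, psi] per step.
    All-zero plans are treated as "leave this NPC to the default traffic manager".
    """
    active = []
    for npc_index, plan in enumerate(actions):
        if any(value != 0.0 for step in plan for value in step):
            active.append(npc_index)
    return active

horizon = 3
zero_plan = [[0.0, 0.0, 0.0, 0.0] for _ in range(horizon)]
moving_plan = [[1.0 * k, 0.0, 5.0, 0.0] for k in range(1, horizon + 1)]

# Three NPCs; only the second one receives an externally computed plan.
external_actions = [zero_plan, moving_plan, zero_plan]
active_npcs = select_active_npc_rows(external_actions)
```

One consequence of this convention is that a plan that is legitimately all zeros cannot be expressed, so a separate mask may be preferable if that case matters.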

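3. Internal algorithm of one timestep (step)

Collect actions
- the ego action passed in by the researcher
- NPC actions filled in by DiffusionTrafficManager or set_external_npc_actions()
Advance the simulator
- The physical time $\Delta t$ is split into several smaller substeps (decision repeat) and vehicle states are updated.
Compute the reward. The core formula can be summarized as follows.

$$ r = r_{\text{driving}} + r_{\text{speed}} + r_{\text{termination}} $$
- $r_{\text{driving}}$ : distance progressed × how well the lane center is kept
  
  $$ r_{\text{driving}} = w_1 \, \Delta s \, \bigl(1 - 2\lvert d\rvert / w_{\text{lane}}\bigr) $$
- $r_{\text{speed}}$ : current speed ratio
  
  $$ r_{\text{speed}} = w_2 \, \frac{v}{v_{\max}} $$
- $r_{\text{termination}}$ : reward/penalty for termination events such as arrival (+) and collision or leaving the lane (−)
Termination conditions
- Arrival at destination, max steps exceeded, leaving the lane, collision (vehicle, object, sidewalk), collision with a human road user, etc.
Return observation and info
- The observation for the next step, the reward, the termination flag, and statistics are returned to the agent.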
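The reward summary above can be written out directly. This is a minimal sketch of the three-term formula only, not the environment's actual reward code: the function name `step_reward`, the default weights, and the argument names are hypothetical.

```python
def step_reward(delta_s, lateral_offset, lane_width, speed, max_speed,
                terminal_reward=0.0, w_driving=1.0, w_speed=0.1):
    """r = w1 * ds * (1 - 2|d| / w_lane) + w2 * v / v_max + r_termination."""
    # Lane-keeping factor: 1 at the lane center, 0 at the lane edge.
    lane_factor = 1.0 - 2.0 * abs(lateral_offset) / lane_width
    r_driving = w_driving * delta_s * lane_factor
    r_speed = w_speed * speed / max_speed
    return r_driving + r_speed + terminal_reward

# Centered in a 3.5 m lane, advancing 1 m at half the maximum speed.
r_center = step_reward(delta_s=1.0, lateral_offset=0.0, lane_width=3.5,
                       speed=10.0, max_speed=20.0)
# Same step but driving on the lane edge: the driving term vanishes.
r_edge = step_reward(delta_s=1.0, lateral_offset=1.75, lane_width=3.5,
                     speed=10.0, max_speed=20.0)
```

Note that the lane factor goes negative once $\lvert d\rvert$ exceeds half the lane width, so progress made outside the lane is penalized rather than merely unrewarded.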

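4. Summary of what is specific to DiffusionPlannerEnv

Multi-lane, variable-lane maps combined with speed limits reproduce a more realistic regulatory environment.
Starting from zero traffic density, the researcher can generate all NPC behavior directly, which suits multi-agent planning experiments.
Settings that vary observation range, accident probability, speed limits, etc. every episode are on by default, which makes generalization training and robustness testing easier.
The default LQR shows a "safe driving example", and it can be replaced with a diffusion-based policy in one line of code.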

Scenario Dreamer: Vectorized Latent Diffusion for Generating Driving Simulation Environments

Tue, 24 Jun 2025 04:34:05 GMT

scene initialization

scene initialization
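- The environment setup at the very start of an autonomous-driving simulation
- Road: number of lanes, shape, lane connectivity, etc.
- agent
  - initial states of vehicles/pedestrians/cyclists/static obstacles, etc.
  - here the initial state means position/speed/heading/size/type

Why solve scene initialization with diffusion?

Existing simulators (carla, waymax, metadrive, etc.)
- build road shapes by rules (simple random combinations of straight lanes and circular-arc lanes)
- place agents by rules
The closer scene initialization is to the real world, the more meaningful the simulation becomes
As in the picture below, the subtle relative placement of vehicles that depends on road shape (left turn, crosswalk) and target speed is hard to initialize by rules
A generative model (diffusion)
- learns the probability distribution of a real driving dataset, and
- can also generate diverse results that are not in the dataset but resemble the real world
So it can produce scenarios that are realistic yet far more diverse than the dataset.

Why use a latent diffusion model?

While keeping the same performance as a plain diffusion model,
- training is cheaper (fewer trainable parameters, faster training)
- inference is lighter and faster

Then why a beta-VAE? Isn't an autoencoder enough?

Why a VAE?
- Compared with an autoencoder, a VAE pushes the latent space toward a Gaussian distribution.
- A diffusion model starts from a standard Gaussian distribution and restores data
  - starting from a standard Gaussian, restoring to a Gaussian-like distribution is much easier to learn
    - training time is shortened and the same high performance is reached with fewer parameters
Why a beta-VAE?
- Beta controls the weight of the KL divergence loss
- As beta decreases
  - reconstruction quality improves
  - the latent space is regularized less strongly toward the Gaussian prior (a cleanly partitioned latent space, where attributes can be manipulated independently, is what a larger beta encourages)
  - in exchange, the latent space spreads out, so training the diffusion model takes comparatively longer.
The paper sets beta very low.

In-painting: extending the map indefinitely

At inference

In the figure above, lanes and agents are first generated for the region inside the solid black line.
- The number of lanes and the number of agents can be set freely here.
Then only the right half of the solid black region is reused.
- The lanes and agents of the right half are used as input to the latent diffusion model.
- A pre-trained classifier network outputs how many lanes the new region should contain,
  - and this sets the number of initial random noise vectors for lanes.
  - The number of vehicles seems to be set by hand here as well.
- The pre-trained classifier?
  - A classifier trained to look at the existing (left) scene and predict how many vehicles and lanes are on the right.
Now the denoising process, whose input is the right-half lanes and agents plus the initial random noise vectors, is run repeatedly.
- The lanes and agents of the right half are restored at every step.

At training

A regular training loop and an in-painting-specific training loop are both run.
An auxiliary prediction head that infers the number of lanes on the right from the left side alone is also trained.
- A learnable query is said to learn through cross attention with the lane tokens.

behaviour simulation

It is said to be built on the ctrl-sim paper.
Difficulty can be varied between a safe mode (slow, keeping headway) and a challenging mode (inducing collisions).
Computation is restricted to the range a vehicle can see (radius of about 80 m) for efficiency.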

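Quantitative evaluation of scene initialization

One-paragraph summary

Evaluation of Scene Initialization (initial scene generation) has two axes: "road graph quality" and "initial object (vehicle) placement quality". The authors draw 50 k real scenes and 50 k model-generated scenes from the Waymo Open Motion (6 US cities) and nuPlan (4 cities) datasets and compute ① Urban Planning Metrics (4) + Fréchet Distance (FD) for the Lane Graph and ② Jensen‑Shannon Divergence (JSD, 6 features) + Collision Rate for Agent placement. All metrics are based on histograms or Gaussian statistics and are scaled by 10 to 100× for readability in Tables 1 and 2. The baselines are SLEDGE (DiT‑L/XL) and DriveSceneGen (upper bound); Scenario Dreamer reduces the error substantially on most metrics and also leads in parameter count and speed.

Datasets and sampling

Waymo Open Motion: from 487 002 training and 44 097 validation scenarios, a 64 m × 64 m FOV is center- and rotation-normalized and 50 k scenes are drawn at random. A simulation-compatibility flag (absence of traffic lights) is also recorded for post-processing.
nuPlan: 1 300 h of logs are sampled at 30 s/1 s intervals to form 450 k training and 50 k validation frames, and 50 k scenes are drawn with the same FOV.
Generated samples: the Scenario Dreamer Base (B) and Large (L) models trained on each dataset each generate 50 k initial scenes.

Metric definitions

1. Lane Graph – Urban Planning Metrics

Nodes of the Lane Centerline Graph with degree ≠ 2 are defined as key‑points, and the Fréchet Distance (FD) is computed on the following one-dimensional statistics.

$$ \operatorname{FD}(X,Y)=\lVert\mu_X-\mu_Y\rVert_2^2 +\operatorname{Tr}\!\left(\Sigma_X+\Sigma_Y-2(\Sigma_X\Sigma_Y)^{1/2}\right) $$

Connectivity : key‑point degree distribution
Density : number of key‑points per graph
Reach : number of paths reachable from each key‑point
Convenience : shortest-path length over all key‑point pairs
For readability the four values are scaled by $\times\{10, 1, 1, 10\}$ respectively.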
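For a one-dimensional statistic the covariances reduce to variances, $\Sigma_X=\sigma_X^2$, so the trace term becomes $\sigma_X^2+\sigma_Y^2-2\sigma_X\sigma_Y=(\sigma_X-\sigma_Y)^2$ and FD is simply $(\mu_X-\mu_Y)^2+(\sigma_X-\sigma_Y)^2$. The sketch below computes that special case; the function name is hypothetical, and using the population standard deviation is an assumption, since the notes do not say which estimator the paper's evaluation code uses.

```python
import statistics

def frechet_distance_1d(xs, ys):
    """FD between Gaussians fitted to two 1-D samples:
    (mu_x - mu_y)^2 + (sigma_x - sigma_y)^2.
    """
    mu_x, mu_y = statistics.fmean(xs), statistics.fmean(ys)
    # Assumption: population standard deviation (divide by n).
    sd_x, sd_y = statistics.pstdev(xs), statistics.pstdev(ys)
    return (mu_x - mu_y) ** 2 + (sd_x - sd_y) ** 2

# Toy "key-points per graph" samples for real and generated scenes.
real_density = [1.0, 2.0, 3.0]
shifted_density = [3.0, 4.0, 5.0]   # same spread, mean shifted by 2

fd_same = frechet_distance_1d(real_density, real_density)
fd_shift = frechet_distance_1d(real_density, shifted_density)
```

A pure shift of the mean by 2 gives FD = 4, while identical samples give 0, which matches the "lower is better" reading of the tables.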

2. Lane Perceptual Quality FD

FD is computed on the autoencoder's penultimate-layer embeddings (penultimate features) to measure "shape similarity". ([openaccess.thecvf.com][1])

3. Route length and endpoint distance

Route Length: longest route (m) reachable by DFS from the origin.
Endpoint Distance: mean of $\lVert p^{\text{end}}_i-p^{\text{start}}_j\rVert_2$ (m) over each predicted successor edge (i, j). ([openaccess.thecvf.com][1])

4. Agent JSD Metrics

Histograms of the 50 k real and generated scenes are built with the bin settings below, and JSD is computed.

$$ \operatorname{JSD}(p\,\|\,q)=\tfrac12\!\left[ D_{\mathrm{KL}}(p\,\|\,m)+D_{\mathrm{KL}}(q\,\|\,m) \right],\; m=\tfrac12(p+q) $$

Nearest Dist. (0–50 m, 1 m bin, ×10)
Lateral Dev. (0–1.5 m, 0.1 m bin, ×10)
Angular Dev. (−200°–200°, 5° bin, ×100)
Length (0–25 m, 0.1 m bin, ×100)
Width (0–5 m, 0.1 m bin, ×100)
Speed (0–50 m/s, 1 m/s bin, ×100)
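The histogram-then-JSD procedure can be sketched in a few lines. The helper names are hypothetical, natural logarithms are assumed (so the maximum value is $\ln 2$; the notes do not state which base the paper uses), and values outside the range are simply dropped.

```python
import math

def histogram(values, lo, hi, bin_width):
    """Normalized histogram over [lo, hi] with fixed-width bins."""
    n_bins = int(round((hi - lo) / bin_width))
    counts = [0] * n_bins
    for v in values:
        if lo <= v <= hi:
            idx = min(int((v - lo) / bin_width), n_bins - 1)
            counts[idx] += 1
    total = sum(counts)
    if total == 0:
        return [0.0] * n_bins
    return [c / total for c in counts]

def kl_divergence(p, m):
    """D_KL(p || m), skipping empty bins of p (0 * log 0 = 0)."""
    return sum(pi * math.log(pi / mi) for pi, mi in zip(p, m) if pi > 0)

def jsd(p, q):
    """Jensen-Shannon divergence with m = (p + q) / 2."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)

# Toy speed samples (m/s) with the "Speed" bin setting: 0-50 m/s, 1 m/s bins.
real_speed = histogram([0.5, 1.5, 1.7], 0.0, 50.0, 1.0)
gen_speed = histogram([0.2, 1.1, 1.9], 0.0, 50.0, 1.0)
jsd_speed = jsd(real_speed, gen_speed)
```

Here both toy samples fall into the same bins with the same proportions, so the divergence is 0; completely disjoint histograms would give $\ln 2$.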

5. Collision Rate

Percentage (%) of cases in which two objects in the same sample have overlapping bounding boxes. ([openaccess.thecvf.com][1])
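A collision-rate computation along these lines might look as follows. Two points are assumptions rather than facts from the paper: boxes are treated as axis-aligned (the protocol notes mention an AABB check, which ignores heading), and the rate is counted per agent, i.e. the share of agents overlapping at least one other agent in the same scene.

```python
from itertools import combinations

def aabb_overlap(a, b):
    """a, b: (center_x, center_y, length, width) axis-aligned boxes."""
    overlap_x = abs(a[0] - b[0]) * 2 < a[2] + b[2]
    overlap_y = abs(a[1] - b[1]) * 2 < a[3] + b[3]
    return overlap_x and overlap_y

def collision_rate(scenes):
    """Percentage of agents that overlap another agent in their own scene."""
    n_agents = 0
    n_colliding = 0
    for boxes in scenes:
        hit = set()
        for (i, a), (j, b) in combinations(enumerate(boxes), 2):
            if aabb_overlap(a, b):
                hit.update((i, j))
        n_agents += len(boxes)
        n_colliding += len(hit)
    return 100.0 * n_colliding / n_agents if n_agents else 0.0

scenes = [
    [(0.0, 0.0, 4.0, 2.0), (1.0, 0.0, 4.0, 2.0), (10.0, 0.0, 4.0, 2.0)],  # first two overlap
    [(0.0, 0.0, 4.0, 2.0), (0.0, 5.0, 4.0, 2.0)],                          # no overlap
]
rate = collision_rate(scenes)
```

For rotated vehicles an oriented-box test would be more accurate; the axis-aligned check is kept here only because it is what the protocol notes describe.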

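Evaluation protocol, step by step

Build the real and generated sets
- 50 k real scenes from Waymo and nuPlan, 50 k generated scenes from Scenario Dreamer (B/L).
Lane Graph analysis
- Select only graph nodes with degree≠2 → key‑points.
- Build the four statistic vectors $f_X$, $f_Y$ and compute FD.
- Add Perceptual FD from the autoencoder's penultimate features. ([openaccess.thecvf.com][1])
Agent analysis
- For each vehicle, histograms $p,q$ of the 6 features → JSD.
- Collision Rate from a synchronous AABB overlap check. ([openaccess.thecvf.com][1])
Scaling
- Items with small standard deviation are multiplied by 10 or 100 to make the table easier to read.
Performance recording
- Generate Time (s), Param (M), and GPUh are reported for each model. ([openaccess.thecvf.com][1])
Comparison
- nuPlan: compared with retrained SLEDGE (DiT‑L/XL) results.
- Waymo: uses DriveSceneGen's GT‑Raster upper bound. ([openaccess.thecvf.com][1])

Key points for reading the results

Lane Graph: Scenario Dreamer (L) FD 0.67 ↘ (↓53 %) vs SLEDGE (DiT‑XL) 1.44 (nuPlan). Connectivity, Reach, and every other item improve together. ([openaccess.thecvf.com][1])
Agent distribution: JSD drops 4–8× on all items, and Collision Rate falls to the 5–12 % range, less than half. ([openaccess.thecvf.com][1])
Compute efficiency: 0.08–0.16 s per scene, 377 M parameters (‑50 %), training GPUh 96–256 h (‑10×). ([openaccess.thecvf.com][1])

Why this evaluation is sound

In an abstract simulator, road-structure diversity and object interrelations determine RL training difficulty, so each is verified separately with a statistical distance (FD, JSD).
Large samples (50 k) and multi-city data mitigate overfitting and bias.
Original values are preserved by the internal scripts even after scaling (so they can be re-verified).
Collision Rate directly measures simulation stability, which strengthens practicality.

Tips worth remembering

The four Urban Planning Metrics have become a widely used standard for comparing Lane Graph quality (they originate in the SLEDGE paper).
For both FD and JSD, values closer to 0 mean closer to the real data, so "↓" in the tables marks better performance.
The experiment code is to be released in the GitHub repo (see link), so the same procedure can be applied to one's own dataset. ([github.com][5])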

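Quantitative evaluation of behavior generation

Below, the quantitative evaluation of the Behaviour Simulation part of the Scenario Dreamer paper is laid out step by step, from data preparation through metric computation to interpretation of results.

Summary

On 1 000 Waymo test scenes, the IDM, Trajeglish, and CtRL-Sim (+/- Tilting) models are simulated within an 80 m × 80 m FOV centered on the ego vehicle,
motion realism (linear speed, angular speed, acceleration, nearest distance) is measured with **Jensen-Shannon Divergence (JSD)**,
and **collision rates (Agent Collision, Planner Collision)** are computed and compared.

1. Data and baselines

Test scene selection
- 1 000 scenes satisfying the simulation-compatibility condition were randomly selected from the Waymo Open Motion Dataset.
Simulation region
- Each scene is simulated only within an 80 m × 80 m FOV centered on the ego vehicle, which reduces computation.
Compared models
- IDM (rule-based), Trajeglish (data-driven), and CtRL-Sim (Scenario Dreamer's multi-agent Transformer) are evaluated.

2. Metric definitions

2.1 Motion realism (Motion Realism via JSD)

Feature selection:
- Linear Speed, Angular Speed, Acceleration, Nearest Distance
Histogram settings:
- 200 uniform bins per feature to approximate the distribution
- Ranges:
  - linear speed: [0, 30] m/s
  - angular speed: [−50, 50] °/s
  - acceleration: [−10, 10] m/s²
  - nearest distance: [0, 40] m
JSD computation:

$$ \mathrm{JSD}(p,q) = \tfrac12\bigl(D_{\mathrm{KL}}(p\parallel m)+D_{\mathrm{KL}}(q\parallel m)\bigr), \; m=\tfrac{p+q}{2}, $$
- This measures the similarity between the real and generated distributions; the paper reports it on a ×10⁻² scale.

2.2 Collision rates

Agent Collision (%): rate at which collisions between agents occur during simulation.
Planner Collision (%): rate of collisions between agents and the ego (IDM planner).

3. Simulation and measurement procedure

Initial scene setup: each model's behaviour simulator is run on top of the lane graph generated by Scenario Dreamer.
Rollout: for each scene, agent behaviour is executed sequentially for up to 100 steps (≈20 s).
Data collection: position, speed, acceleration, neighbor distance, etc. of all agents are recorded at every timestep.
Distribution building: histograms are built from the recorded values and compared with the real logs.
JSD: JSD is computed per feature following the procedure in 2.1.
Collision check: Agent and Planner collision rates are tallied with an AABB collision test.

4. Results (Supplementary Table 1)

Model	Controllable?	Linear speed↓	Angular speed↓	Acceleration↓	Nearest dist.↓	Agent Coll.(%)↓	Planner Coll.(%)↓
IDM	✗	9.2	0.4	19.8	1.6	7.2	5.8
Trajeglish†	✗	19.5	0.3	19.7	4.0	6.4	7.0
CtRL-Sim (κ=+10)	✓	4.1	0.1	20.1	1.3	6.2	4.9
CtRL-Sim (κ=−50)	✓	4.2	0.2	26.1	1.5	10.9	11.9

The Positive Tilting (κ=+10) version has the lowest JSD for linear speed, angular speed, and nearest distance, and the lowest Planner collision rate at **4.9%**.
The Negative Tilting (κ=−50) version raises the Planner collision rate to **11.9%**, while motion realism is largely maintained except for acceleration (26.1).

5. Interpretation and implications

The **data-driven behaviour model (CtRL-Sim)** is better overall than the rule-based (IDM) and simple token model (Trajeglish) in motion similarity and collision avoidance; acceleration JSD is the exception, where all models are close.
Adjusting κ gives intuitive control over **difficulty (aggressiveness/gentleness)**, which is useful for stress-testing reinforcement-learning planners.
Fixing the experimental setting to a consistent FOV and the same lane graph representation makes the comparison between behaviour models fair.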


Grhfgh

Tue, 17 Jun 2025 11:51:49 GMT

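Scenario Dreamer is a new method that can generate simulation environments for autonomous vehicles entirely from data. Its core idea and components, in plain terms:
- Problem and goal
  - Existing simulators convert road and vehicle information into pixel images (a top-down view), which wastes computation, and
  - vehicles and pedestrians move by fixed rules (e.g. slowing down, changing lanes), so behaviour is less varied than in reality.
  - Scenario Dreamer aims to overcome both limits.
- Initial Scene Generation
  - 1. The lane connectivity structure (lane graph) and the boxes around cars, pedestrians, etc. (agent boxes) are represented as "vectors".
  - 2. These vectors are compressed into a small set of numbers (latent space).
  - 3. Random noise is mixed into the compressed state and then removed step by step to sample a new road and vehicle layout.
  - This yields high-quality initial scenes quickly, even for complex roads, with less computation (model size 2×↓, processing speed 6×↑).
- Behaviour Simulation
  - 1. The initial scene generated above is taken as-is.
  - 2. The motion of each vehicle and pedestrian is produced by a Transformer model trained on real driving logs.
  - 3. The model predicts the next action from past behaviour patterns, so it handles far more natural and varied situations than simple rules.
- Strengths and results
  - Efficiency: half the model size, 6× faster computation, and 10× less GPU time for training than previous methods
  - Realism: reflects real driving data, producing more natural and diverse scenes and motion
  - Unlimited extension: no limit on simulation length; tiles can be appended to build long road environments
  - Challenging environments: RL agents trained in these environments face harder situations, which helps real driving performance
- Example uses
  - long highway driving tests
  - experiments on close vehicle interactions in complex urban environments
  - safety evaluation with mixed road users such as pedestrians and cyclists

In short, Scenario Dreamer vectorizes roads and vehicles, generates new simulation scenes through compression and diffusion, and simulates natural motion on top with a behaviour model learned from data: a fully data-driven driving simulator that offers both speed and realism.

The summary below walks through the whole paper Scenario Dreamer: Vectorized Latent Diffusion for Generating Driving Simulation Environments (CVPR 2025), from structure through formulas to experiments. It is written at length so that the core contributions, method, and results can be understood without opening the original.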

⸻

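1 Background and motivation
- Limits of data-driven driving simulators
  - Existing public simulators (e.g. Waymax, GPUDrive) replay real driving logs or apply only small perturbations.
  - Driving logs are usually 30 s or shorter and amount to only a few thousand km of map coverage, so rare and dangerous situations are not covered well.
- Inefficiency of rasterized BEV representations
  - Diffusion-based generators have appeared recently (e.g. SLEDGE), but they convert the input to a pixel grid, so
  - empty pixels are computed too and the network becomes bloated, and
  - road topology (lane connectivity, etc.) has to be recovered with post-processing heuristics.
- Monotony of behaviour (roll‑out) models
  - Rule‑based or non‑reactive agents cannot reproduce the diversity and interaction of real traffic.

Scenario Dreamer is a fully data‑driven generative simulator that addresses all three weaknesses at once.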

⸻

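2 Overview of components
- 1. Initial Scene Generator
  - Vectorized Latent Diffusion (VLD): lanes (vectors) and vehicle/pedestrian boxes (8‑D vectors) are treated directly as tokens and generated.
- 2. Behaviour Simulator
  - Extended CtRL‑Sim: a return-conditioned multi-agent Transformer.
- 3. Simulation Framework
  - Integrated with GPUDrive to provide a closed-loop environment for training and evaluating RL planners.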

⸻

3 Vectorized Latent Diffusion (VLD)

3.1 Problem definition
- Initial scene inside a 64 m × 64 m FOV
- Object set $O=\{o_i\}_{i=1}^{N_o}$ (8‑D: position, speed, heading, etc.)
- Lane graph $M=\{L,A\}$
  - $L=\{\ell_i\}_{i=1}^{N_\ell}$, where each $\ell_i$ is 20×2 center‑line points
  - adjacency matrix $A\in\{0,1\}^{N_\ell\times N_\ell\times 4}$ (succ/prev/left/right)

3.2 Autoencoder
- Encoder $E_\phi$
  - per-token MLP embedding for objects and lanes → $N_E$ Factorized Attention blocks
  - lane‑to‑lane, lane‑to‑object, object‑to‑object
  - each token is mapped to a mean and variance, giving VAE latents of size $K_\mathcal{O}$, $K_\mathcal{L}$.
- Decoder $D_\gamma$
  - $N_D$ Factorized Attention blocks, then regression (ℓ2) + lane connectivity classification (CE).
- Trained with a low‑β ELBO to minimize information loss.

3.3 Latent Diffusion
- Latent set $H=\{h_i^{\mathcal O}\}_{1:N_o}\cup\{h_i^{\mathcal L}\}_{1:N_\ell}$ with distribution $p(H)=\sum_{N_o,N_\ell} p\bigl(H\mid N_o,N_\ell\bigr)\,p(N_o,N_\ell)$
- DDPM objective $\mathcal L_{\text{dm}} =\mathbb E_{H_t,\boldsymbol\epsilon_t,t} \bigl\lVert \boldsymbol\epsilon_t-\epsilon_\theta(H_t,t)\bigr\rVert_2^2$
- Factorized Transformer $\epsilon_\theta$
  - AdaLN‑Zero conditioning, with a separate hidden dim per token type.
- Resolving Permutation Ambiguity
  - at training, recursive token ordering (x‑min → y‑min → x‑max → y‑max) + sinusoidal positional encoding.

3.4 Generation modes
- (a) Init Scene: sample $(N_o,N_\ell)$ or let the user specify it → 100 denoising steps, then decode.
- (b) Lane‑Conditioned Object: map latents fixed, only objects sampled.
- (c) Scene In‑painting: the existing $I_{FN}$ is fixed and the new forward region $I_{FP}$ starts from noise → denoise.
  - An MLP classifier $f_\varphi$ is introduced to predict $N_\ell^{FP}$.

⸻

4 Behaviour Simulation (extended CtRL‑Sim)
- Tokenization: a k‑disks scheme handles all agent types (car, pedestrian, cyclist, etc.) uniformly.
- Return Decomposition $p_\theta(A_t,G_t\mid S_t)=\pi_\theta(A_t\mid S_t,G_t)\,p_\theta(G_t\mid S_t)$
  - cumulative reward over a 2 s horizon $H$: $G_t=\sum_{k=t}^{t+H}r_k$.
- Exponential Tilting: at test time the return distribution is weighted positively or negatively to induce friendly or adversarial behaviour.

⸻

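5 Experiments

5.1 Data and setup
- Waymo Open Motion (WOMD), nuPlan
- 64 m FOV, T=100 diffusion steps, Ko/Kl both 256 (large) and 192 (base).
- Training cost comparison: Scenario Dreamer (B) 96 GPU h vs. SLEDGE (DiT‑XL) 960 GPU h.

5.2 Qualitative results
- Many in‑painting cases where lane connectivity continues naturally even at complex intersections.

5.3 Quantitative evaluation
- Lane Graph (4 Urban Planning metrics + FD)
  - nuPlan: Scenario Dreamer (L) > SLEDGE (DiT‑XL) on all metrics, 2×↓ parameters, 6× faster inference.
- Agent Boxes (6 JSD metrics + Collision)
  - Waymo: Collision around 5 % (SLEDGE ≈ 22 %, DriveSceneGen 0.2 %).
- RL planner (baseline PPO)
  - similar performance on the Waymo 55 m course and the Scenario Dreamer 55 m course → supports realism.
  - with 100 m + Negative tilt, collision and incompletion rates rise sharply ⇒ more challenging environments can be generated.

5.4 Ablation
- Removing any one of Factorized Attention, Lane Ordering, or Learned Topology degrades FD and Connectivity and slows the model.

⸻

6 Summary of strengths
- Vector-oriented design → removes unnecessary pixel computation and sharply reduces model, inference, and training cost.
- Topology learned directly → no heuristic post-processing, fewer lane connectivity errors.
- 3‑in‑1 generation modes → initial, conditional, and unbounded extension (in‑painting) are all supported.
- Controllable behaviour model → dangerous or friendly scenarios are generated on the fly by changing only the return tilt.
- Practical for RL benchmarks → GPUDrive integration; policy generalization can be evaluated by adjusting length and difficulty.

⸻

7 Limitations and future work
- Lane traffic-light states are not generated accurately, and signal-logic errors are observed at intersections.
- The map currently contains centerlines only; road edges, crosswalks, etc. are planned.

⸻

8 Conclusion and significance

Scenario Dreamer combines "pixel-free generation (fully‑vectorized)" with a "repeatable behaviour model" and can be seen as the first comprehensive platform to address data scarcity, realism, and difficulty for autonomous-driving researchers together. Modelling the tokenized latent space of lane graph + agent state with diffusion is likely to become a general technique applicable across robotics and urban traffic simulation.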

Deggre

Tue, 17 Jun 2025 11:50:55 GMT

The following is a detailed walkthrough of the CVPR 2025 paper "Scenario Dreamer: Vectorized Latent Diffusion for Generating Driving Simulation Environments", written so that the original need not be consulted.

⸻

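Background and problem definition
- Bottleneck in autonomous-driving validation
  - Public driving logs (e.g. Waymo Open Motion) are usually shorter than 30 s, and the total road distance is only on the order of 1 000 km. That is more than an order of magnitude short of the roughly 20 000 km a single car drives in a year, which limits large-scale simulation‑based safety validation.
- Limits of existing generative simulators
  - BEV (top-down) images are rasterized, so computation is wasted on the mostly empty pixels.
  - Rule-based (agent‑based) or short (≤20 s) non-reactive scenarios dominate, so realism and diversity are lacking.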

⸻

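Core contributions of the paper
- A vectorized (latent set) diffusion model directly generates the initial scene (lane graph and object boxes) → no wasted pixels, and the token count scales with scene complexity, improving compute efficiency.
- A CtRL‑Sim-based multi-agent Transformer behaviour model is introduced → a fully data-driven, closed‑loop simulator.
- Scene in‑painting extends tiles indefinitely, synthesizing driving environments with no length limit.
- Control of simulation difficulty:
  - traffic density is adjusted through the token counts (No, Nl),
  - return‑tilting selects friendly or adversarial behaviour.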

⸻

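Initial scene generation: Vectorized Latent Diffusion Model

3‑1. Problem setup
- A 64 m × 64 m FOV fixed relative to the ego's center and heading.
- Object set $$O=\{o_i\}_{i=1}^{N_o},\quad o_i\in\mathbb R^{8}$$ ($x, y, \text{speed}, \cos\theta, \sin\theta, \text{length}, \text{width}, \text{class}$)
- Lane set $$L=\{\ell_i\}_{i=1}^{N_\ell},\; \ell_i\in\mathbb R^{20\times 2},\; A\in\{0,1\}^{N_\ell\times N_\ell\times 4}$$ (successor, predecessor, left, and right adjacency).

3‑2. Two-stage training
- (1) VAE autoencoder
  - per-element MLP + Factorized Attention
  - lane→lane / lane→object / object→object blocks.
  - different latent dimensions $K_o$, $K_\ell$ for objects and lanes.
  - lane latents are designed not to depend on object information → lane-conditioned object generation becomes possible.
- (2) Latent Diffusion
  - a Transformer DDPM conditioned on the variable token counts $(N_o,N_\ell)$.
  - Factorized Attention blocks are repeated, with AdaLN‑Zero providing scale/shift conditioning.
  - training objective $\mathcal L_{\text{dm}} =\mathbb E_{\mathbf H_t,\ \boldsymbol\epsilon_t\sim\mathcal N(0,1),\ t} \Bigl[\lVert\boldsymbol\epsilon_t-\epsilon_\theta(\mathbf H_t,t)\rVert_2^2\Bigr]$
- Resolving Permutation Ambiguity: a vector set has no inherent order, so a recursive (sorting-based) sinusoidal positional encoding is applied to the tokens so that relative positions can be recovered even at noisy steps.

3‑3. Three generation modes
- Initial scene sampling: draw $(N_o,N_\ell)$ from the training distribution or let the user specify it.
- Lane-conditioned object generation: encode a given map $M$, then diffuse only the object tokens.
- Scene in-painting: the rear ($F_N$) context is held fixed and re-noised while the forward ($F_P$) region is generated → continuous tiling is possible.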

⸻

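Behaviour simulation: extended CtRL‑Sim
- Policy decomposition $p_\theta(\mathbf A_t,G_t\mid S_t) =\pi_\theta(\mathbf A_t\mid S_t,G_t)\,p_\theta(G_t\mid S_t)$
  - $G_t$: the 2-second discounted return.
- Exponential Tilting: at test time $p_\theta(G_t\mid S_t)$ is re-weighted ⇒ induces a challenging (negative tilt) or safe (positive tilt) driving style.
- k‑disks tokenization supports multiple agent types such as vehicles, pedestrians, and cyclists.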
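Exponential tilting in its standard form re-weights a distribution as $p'(G)\propto p(G)\,e^{\kappa G}$. The sketch below applies that to a discretized return distribution. It is an illustration under stated assumptions, not the paper's implementation: the function name, the return bin values, and the probabilities are made up, and $\kappa$ plays the role of the tilt values in the results table (e.g. +10, −50).

```python
import math

def tilt_return_distribution(probs, returns, kappa):
    """Re-weight p(G) as p'(G) ∝ p(G) * exp(kappa * G) over discrete return bins.

    probs:   predicted probabilities p(G | S), must sum to 1 with at least one > 0
    returns: representative return value of each bin
    kappa:   tilt; > 0 favors high returns, < 0 favors low returns
    """
    scores = [math.log(p) + kappa * g if p > 0 else -math.inf
              for p, g in zip(probs, returns)]
    top = max(scores)  # subtract the max for numerical stability
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    return [w / total for w in weights]

return_bins = [-1.0, -0.5, 0.0, 0.5, 1.0]   # hypothetical return values
predicted = [0.1, 0.2, 0.4, 0.2, 0.1]       # hypothetical p(G | S)

friendly = tilt_return_distribution(predicted, return_bins, kappa=2.0)
adversarial = tilt_return_distribution(predicted, return_bins, kappa=-2.0)
unchanged = tilt_return_distribution(predicted, return_bins, kappa=0.0)
```

A return sampled from the tilted distribution would then be fed to the return-conditioned policy $\pi_\theta(\mathbf A_t\mid S_t,G_t)$, which is how one scalar steers behaviour toward safer or more adversarial driving.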

⸻

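Experiments

5‑1. Data and metrics
- Training and testing on both Waymo Open Motion and nuPlan.
- Lane Graph quality: 4 Urban Planning metrics (Connectivity, Density, Reach, Convenience) + Frechet Distance + maximum Route Length, etc.
- Agent placement quality: 6 JSD metrics (neighbor distance, lane deviation and angle, length and width, speed) + collision rate.

5‑2. Main results
- Lane Graph
  - Scenario Dreamer‑Base beats SLEDGE‑DiT‑XL on every item while cutting parameters by 2× and inference latency by 6–7×.
- Agent Box
  - JSD and collision rate drop substantially on both Waymo and nuPlan. Example: nuPlan collision rate 22.3 % → 11.9 % (Base) / 9.3 % (Large).
- Ablation
  - Factorized Attention, Lane Ordering, and learned Topology all affect performance and efficiency. Removing the factorized design doubles latency.
- RL Planner benchmark
  - The same PPO planner (55 m route) scores similarly on Waymo logs and Scenario Dreamer scenes → supports realism.
  - Extending the route to 100 m or applying Negative Tilting lowers the success rate from 59.8 % to 38.2 % to 32.1 % → longer and adversarial scenarios really are harder.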

⸻

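Conclusion, limitations, and future work
- Conclusion
  - Scenario Dreamer is the first fully data‑driven, vectorized, closed-loop generative simulator, and it is shown to be better than prior approaches in quality, speed, and training cost.
- Limitations
  - Traffic-signal states are not sufficiently consistent.
  - Only lanes (centerlines) are generated → other HD map elements such as road boundaries and crosswalks are not supported.
- Future directions
  - Multi-modal scene generation including static structures (guardrails, signals) and dynamic weather conditions.
  - Language- and situation-conditioned generation (requests such as "rain, night, congested intersection") and joint training of policy and simulator.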

⸻

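Practical implications
- Autonomous-driving RL/IL researchers → rare and dangerous scenarios can be synthesized in bulk to widen safety margins considerably.
- Industrial validation teams → scenes can be generated as a stream and driven at multiple FPS on a single GPU, cutting the cost and time of building environments by hand.
- Traffic policy and certification bodies → extreme cases that are hard to explore with real logs alone can be inserted into a generate-and-verify loop to improve regulation and certification practice.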

⸻


ray VS concurrent.futures.ProcessPoolExecutor

Tue, 10 Jun 2025 03:45:19 GMT

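Ray
- Splits work across several machines (or several CPU cores).
- Add @ray.remote to the code and call ray.get(); the work is then distributed and executed internally.
- Pros
  - Several machines can be used at once, so throughput can be scaled up a great deal.
  - It can retry failed tasks or resume from where they stopped.
- Cons
  - Starting the management processes takes time at launch.
  - Passing complex objects between machines can make things slower.
  - Setup and maintenance are somewhat cumbersome.

SingleMachineParallelExecutor
- Runs several processes or threads concurrently within a single machine only.
- Parallel processing needs only a simple call such as worker.submit().
- Pros
  - Fast to start and simple to configure.
  - With heavy file reads and writes, everything stays on the same machine, so speed is stable.
- Cons
  - Only one machine is used, so there is a ceiling if much higher throughput is needed.
  - If one process crashes, the whole job may stop.

Which one should be used?
- For processing hundreds to thousands of scenarios on one Linux machine, as now, SingleMachineParallelExecutor is likely more convenient and faster thanks to quick startup and low overhead.
- If tens of thousands of scenarios later need to be processed across several machines, adopting Ray is worth considering then.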

NVIDIA 드라이버 / CUDA Toolkit / cuDNN

Fri, 06 Jun 2025 09:50:27 GMT

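Overview

In summary

NVIDIA driver: the required software that lets the computer recognize and use the GPU properly
CUDA Toolkit: a toolset for running general-purpose computation fast on the GPU
cuDNN: a GPU-specific library of optimized code that makes deep-learning operations faster

How they connect

To use a GPU, first install the NVIDIA driver so that the operating system and the GPU can talk to each other.
To use the GPU for computation, install the CUDA Toolkit so that programs (especially C/C++ code) can run on the GPU.
For heavy computation such as deep learning, cuDNN provides GPU-optimized code, and frameworks use it to speed things up.
With PyTorch, GPU use is handled for you once the model is placed on the GPU. Internally PyTorch sends data to the GPU through CUDA and runs operations quickly with cuDNN.

NVIDIA driver
- Software that makes the NVIDIA graphics card (GPU) attached to the computer work properly
- Put simply, it sits between the operating system (Windows or Linux) and the graphics card
  - and acts as the bridge that translates "this is how to use this graphics card".
- Without the driver the GPU may not work properly or may not be recognized at all.

CUDA Toolkit
- A collection of tools for using the GPU for computation (especially parallel computation)
- For example,
  - when a C or C++ program states in code that "this computation can run faster on the GPU",
  - the compiler and libraries in the Toolkit
    - translate that code into a form the GPU understands.
- As a result, certain kinds of computation (image processing, scientific computing, machine learning, etc.) run much faster than on the CPU alone.

cuDNN
- A small library optimized specifically for deep learning (neural networks)
- Pre-written code that quickly performs the concrete computations needed for neural-network operations such as convolution on the GPU
- So without implementing complex optimizations yourself, once cuDNN is installed
  - deep-learning frameworks (e.g. TensorFlow, PyTorch) load this code automatically and speed up operations considerably.

Matching versions

Summary

Check the official PyTorch table for which CUDA version pairs with which PyTorch version
- (e.g. "PyTorch 1.12 is compatible with CUDA 11.3")
- then check that your GPU driver supports that CUDA version.
Create a conda virtual environment and install in one line with conda install pytorch cudatoolkit=<CUDA version> -c pytorch, or
use Docker to pull an environment that is already set up; then
- there is no need to work out versions every time, and
- projects can be managed without interfering with each other.

Consult the published version table
- The PyTorch website (or each GitHub README) has a table saying which CUDA version to use.
- https://pytorch.org/get-started/locally/
- https://pytorch.org/get-started/previous-versions/
- For example, "PyTorch 1.12 is compatible with CUDA 11.3".
- First check that your GPU driver (the software that lets the operating system talk to the GPU) supports the CUDA version listed there.
Use a virtual environment such as conda
- Instead of installing system-wide, create a separate space (virtual environment) per project.
- Example: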
```
conda create -n myenv python=3.9
conda activate myenv
```
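- PyTorch and CUDA are then installed only inside this environment and do not conflict with other projects.
Use the route that does not require installing the CUDA toolkit separately
- PyTorch now ships builds that bundle CUDA, so the CUDA toolkit does not have to be installed separately.
- For example, the single command below installs PyTorch together with the CUDA it needs: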
```
conda install pytorch=1.12 cudatoolkit=11.3 -c pytorch
```
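- Here cudatoolkit=11.3 follows the official PyTorch table. It works right away as long as the system driver supports CUDA 11.3 or newer.
cuDNN is usually included in the PyTorch package
- Installing with conda as above also installs a matching cuDNN automatically.
- There is no need to install cuDNN separately; conda resolves version compatibility.
If the CUDA toolkit must be installed system-wide
- First check which CUDA version your driver supports: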
```
nvidia-smi
```
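  This command prints "CUDA Version: XX.X", the highest CUDA version the installed driver supports.
- Next, download and install the cuDNN version that matches that CUDA version from the NVIDIA website.
- Then install the PyTorch build that matches your CUDA version (PyTorch publishes separate CPU-only and per-CUDA-version builds, so pick the one for your CUDA version).
If matching versions every time is a chore, use Docker
- The Docker image already has CUDA, cuDNN, and PyTorch set up (e.g. NVIDIA's official CUDA images). The GPU driver itself stays on the host and is exposed to the container, which requires the NVIDIA Container Toolkit for the --gpus option.
- Example: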
```
docker run --gpus all -it pytorch/pytorch:1.12.0-cuda11.3-cudnn8-runtime
```
- Deep-learning code can then be run right away without worrying about version conflicts.
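The driver check described above ("does my driver support the CUDA version this PyTorch build needs?") can be scripted. The sketch below only parses text, so it runs without a GPU: the helper names are hypothetical, and `sample_header` is an illustrative line imitating the nvidia-smi banner, not captured output. In practice the text would come from running nvidia-smi.

```python
import re

def parse_driver_cuda_version(nvidia_smi_text):
    """Extract (major, minor) from the 'CUDA Version: XX.X' field, or None if absent."""
    match = re.search(r"CUDA Version:\s*(\d+)\.(\d+)", nvidia_smi_text)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))

def driver_supports(nvidia_smi_text, required):
    """True if the driver's maximum supported CUDA version is >= required (major, minor)."""
    supported = parse_driver_cuda_version(nvidia_smi_text)
    return supported is not None and supported >= required

# Illustrative banner line (made up for this example).
sample_header = "| NVIDIA-SMI 515.65.01    Driver Version: 515.65.01    CUDA Version: 11.7     |"

driver_cuda = parse_driver_cuda_version(sample_header)
ok_for_cu113 = driver_supports(sample_header, (11, 3))   # e.g. a cudatoolkit=11.3 build
ok_for_cu121 = driver_supports(sample_header, (12, 1))
```

Comparing tuples rather than strings avoids the classic mistake where "11.10" sorts before "11.3".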

[python] multi processing VS threading

Thu, 05 Jun 2025 02:30:36 GMT

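For parallel I/O work, use multithreading

When several workers need to read the same file at the same time, multithreading is the better fit.
The GIL is released while a thread waits on I/O,
- so other threads can keep doing CPU work or other I/O.

Why

Shared memory:
- Threads share memory within one program,
- so once the data has been read, several threads can use the same in-memory copy directly.
Cache use:
- Once the operating system has stored a file block it read in memory,
- another thread reading the same part uses the memory cache instead of going back to disk.
Multiprocessing, in contrast,
- gives each process its own memory, so each process opens and reads the same file itself and keeps its own copy of the data.
- The OS file cache is still shared between processes, so the extra cost is mainly duplicated memory, process start-up, and passing results between processes; parts of the file that are not yet cached may also be requested from disk by several processes at once, which lengthens I/O waits.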
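As a small illustration of the threading approach, the sketch below has several threads scan the same file for different keywords with `concurrent.futures.ThreadPoolExecutor`. The function names, the keywords, and the temporary log file are made up for the example; each task opens the file itself, which is safe to do concurrently and lets the OS file cache serve repeated reads.

```python
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor

def count_lines_matching(path, keyword):
    """Count lines of the file that contain the keyword."""
    with open(path, "r", encoding="utf-8") as f:
        return sum(1 for line in f if keyword in line)

def count_keywords_threaded(path, keywords, max_workers=4):
    """Run one scan per keyword in a thread pool and collect the counts."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        counts = list(pool.map(lambda kw: count_lines_matching(path, kw), keywords))
    return dict(zip(keywords, counts))

# Demo on a small temporary file.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "scenario.log")
    with open(path, "w", encoding="utf-8") as f:
        f.write("lane ok\ncollision car\nlane change\ncollision ped\n")
    result = count_keywords_threaded(path, ["lane", "collision", "merge"])
```

For a file this small the threads bring no speed-up; the pattern pays off when each task spends most of its time waiting on disk or network.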

입출력 작업(I/O 중심 작업)이 뭔데?

프로그램이 “외부 장치와 데이터를 주고받는 일” 때문에 대부분 시간을 보내는 경우를 말해요.
- 예를 들어:
파일 복사:
- 큰 동영상 파일을 복사할 때, 하드디스크에서 읽고 또 쓰는 데 시간이 더 오래 걸리죠.
웹에서 이미지 다운로드:
- 서버에서 이미지가 도착할 때까지 기다리는 시간이 길어요.
데이터베이스 조회:
- DB 서버에서 결과가 돌아올 때까지 멈춰 있는 시간이 많습니다.
이런 작업은 내부 계산(숫자 연산)을 빠르게 끝내더라도, 외부 I/O를 기다리느라 실제로 프로그램이 쉬는 시간이 많아요.
그래서 “CPU를 바쁘게 쓰느냐, 아니면 외부 장치를 기다리느냐”로 구분할 때, 후자에 해당하는 작업을 I/O 중심 작업이라고 부릅니다.

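GIL (Global Interpreter Lock)?

Python interpreter: the program that runs Python code. CPython compiles the source to bytecode internally and then executes it; there is no separate build step, so code can be run as soon as it is written.
GIL: a lock inside the CPython interpreter that lets only one thread execute Python bytecode at a time. CPU-bound work therefore does not run in parallel across threads, but the lock is released while a thread waits on file reads or the network, so other threads can run.
Multiprocessing and the GIL: multiprocessing runs the program as several independent processes. Each process has its own interpreter and its own GIL, so one process holding its lock does not affect the others, and they can compute in parallel.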

DiffusionDrive VS Diffusion Planner

Thu, 27 Mar 2025 02:40:06 GMT

서로의 장단점 상대적 비교

DiffusionDrive

(장점) 초기 anchors 기반의, multiple 미래 궤적s 생성

ego와 주변 agents의 multiple 미래 궤적 생성을 위해, 초기 anchors를 기반으로 생성함.
- 인간 운전자는 random distribution이 아닌, 확립된 주행 패턴을 따르므로 합리적 (좌회전 / 직진 / 우회전 / 차선 변경 등)
inference 속도를 더 빠르게 할 수 있음
TODO: 이 방법이 이론적 보증이 되는 방법인지 확인 필요

(장점) 궤적만 출력하는게 아니라, 각 궤적의 확률까지 출력

한번 생성만으로, 좌회전 / 직진 / 우회전 경로와 그 확률을 생성할 수 있음

(장점) 네트워크 아키텍쳐에서의 장점

ego의 과거 정보를 활용하여, 주변 agents 과거 정보들과 attention을 취한다.

(장점) 어려운 시나리오 위주의 데이터셋인 NAVSIM 으로 학습

직진 시나리오만 많은게 아니라, 어려운 시나리오가 많아서,
- 큰 lateral movement 가 필요한 주행 전략을 잘 학습한 것으로 기대됨

Diffusion Planner의 장점

(장점) 합리적인 network input

navigation information을 사용 (차량이 어디를 가야하는지 알고 학습함)
(장점) predicted trajectories에 yaw까지 출력함
물론 속도까지 출력하거나, 아예 accel이나 steering을 출력하는게 더 나을 수 있다.
(장점) guidance sampling 로직 제공
주변 차량의 미래 궤적과의 충돌 방지
목표 속도 준수
승차감 준수
drivable area 안에 머물도록
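(Pro) Network architecture
Attention is performed across the future trajectories of all agents.
It uses DiT, and the way navigation and timestep are handled is reasonable.
- Navigation information and timestep are applied through adaLN-Zero, which is the more sensible choice for them.
- The remaining information (surrounding vehicles, lanes, etc.) is handled with cross-attention.
(Characteristic) Network architecture
MLP-Mixer is used to extract features from the surrounding agents' history and from the lanes.
- Goal: fast inference and a small number of model parameters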

네트워크 비교

DD

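ego plus surrounding agents = all agents
Step 1
All agents derive the "super agents" information (instance_feature) through the following:
- attention over each agent's own history,
- attention between agents,
- attention with the map (road information)
Step 2
Prediction of surrounding trajectories
- The super agents information (b, 900, 6, 256) and the agents' anchor trajectories (b, 900, 256) are concatenated -> (b, 900, 6, 256)
- This is passed through FC layers to predict the surrounding trajectories (and their probabilities)
Step 3: performed with diffusion
Prediction of the ego trajectory
- The camera image pixels corresponding to the intermediate trajectory (the plan_anchor as it evolves) are gathered,
  - and DAF is performed with each trajectory.
- Self-attention among the ego future trajectories.
- Attention between the intermediate trajectory and the super agents information.
- anchor_cross_attention is also performed. (Hard to put into words.)
  - Cross-attention between the intermediate trajectory and the ego's own rich features
- Timestep modulation
- This is passed through FC layers to predict the ego trajectory (and its probability)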

장점

ego의 과거 정보를 활용하여, 주변 agents 과거 정보들과 attention을 취한다.
Deformable DETR을 활용하여, camera 정보와 attention을 수행했다.

단점

모든 agents의 미래 궤적 간 attention을 수행하지 않음

DP

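Step 1

Building the "super" information (surrounding vehicles, lanes, static obstacles, etc.)
- Each modality is passed through an MLP-Mixer.
- Self-attention is then applied.
Step 2: performed with diffusion
Self-attention among the future trajectories of all agents (ego + surrounding)
Cross-attention between the future trajectories of all agents (ego + surrounding) and the super agents information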

장점

모든 agents의 미래 궤적 간 attention을 수행함

단점

ego의 과거 정보를 활용하지 않는다.

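Characteristics

MLP-Mixer is used to extract features from the surrounding agents' history and from the lanes.
- Goal: fast inference and a small number of model parameters
It uses DiT.
- Navigation information and timestep are applied through adaLN-Zero, which is the more sensible choice for them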

학습 방법 비교

DD

$x_0$를 맞추는 방식으로 학습 했습니다.

DP

score을 맞추는 방식으로 학습 했습니다.

기타

DD

guidance 로직을 제공하지 않습니다.

DP

guidance로직을 제공합니다.

MPC VS Multi-modal trajectory IL VS Diffusion

Wed, 26 Mar 2025 08:07:49 GMT

-1. trajectory prediction + MPC VS diffusion model

-1.1. trajectory prediction + MPC

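Pros
- Safety and explainability
  - Cost functions and constraints can be imposed explicitly.
Cons
- Theoretical limit on maximizing performance
  - The trajectory prediction module for surrounding agents and the MPC optimize different cost functions,
    - so maximizing each module's performance is not necessarily the direction that maximizes the performance of the whole system.
- Compute demand grows as the situation gets more complex
  - The prediction stage of the MPC may need a larger set of candidate paths
  - Finding the optimal control input that satisfies the constraints becomes harder
  - A more complex environment -> more nonlinearity -> more compute time
    - A model that simplifies or ignores the nonlinearity may lose accuracy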

-1.2. Diffusion Model

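Strong performance can be expected in complex environments
Pros
- Because it is learned from data, high performance can be expected in complex (hard) environments.
  - Explicit modeling of the surroundings, as in trajectory prediction, may not be needed; the characteristics of the environment are learned from data.
Cons
- Dependence on expert data
- No explainability
- Low FPS used to be a problem, but the paper in question seems to have largely solved it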

why diffusion?

두괄식 결론: Multi-modal trajectory IL는 이론적으로 Diffusion 기반 플래닝처럼 다중 모달 특성을 잘 학습할 수 있는 잠재력을 가지고 있음. 다만 아래 3가지가 잘 갖춰져야 함.
- 충분하고 다양한 훈련 데이터
  - 모델이 다중 모드를 학습하기 위해서는, 다양한 상황에서 전문가가 취한 여러 행동이 데이터에 충분히 포함되어야 합니다.
- 대규모 궤적 후보 집합의 품질
  - 후보 집합이 현실적이고 다양한 궤적들을 포함한다면, 모델은 그 후보들 사이에서 올바른 확률 분포를 학습할 수 있습니다.
- 확률 분포의 정확한 캘리브레이션
  - 모델이 각 후보의 상대적 가능성을 정확하게 예측해야 하며,
  - 그렇지 않으면 하나의 모드로 확률이 집중되는 문제(모드 붕괴)가 발생할 수 있음

항목	Diffusion 기반 플래닝	VADv2 (Probabilistic Planning)
멀티모달 표현	- 연속적인 확률 분포를 암시적으로 모델링하여 이론적으로 무한한 다양성을 생성할 수 있음 - 노이즈 제거(denoising) 과정을 통해 다양한 모드를 자연스럽게 캡처	- 대규모 이산 궤적 후보 집합(trajectory vocabulary)을 사용하여, 각 후보에 대한 확률 분포를 학습 - 후보 집합 내에서 다양한 모드를 표현하려 함
확률적 샘플링 및 다양성	- 반복적인 denoising 단계에서 내재된 확률성을 통해 매 실행마다 서로 다른 궤적을 생성 - 매우 다양한 행동 모드를 자연스럽게 샘플링 가능	- 한 번의 순전파로 후보 궤적에 대한 확률 분포를 출력하고, 그 중 하나를 샘플링 - 후보 집합의 크기와 캘리브레이션에 따라 다양성 한계가 있을 수 있음
추론 속도	- 다단계 iterative 샘플링이 필요하여 계산 비용이 높고, 실시간 적용에는 다소 느림	- 단일 순전파로 확률 분포를 출력하므로 빠른 추론이 가능, 실시간 주행 플래닝에 유리
제약 조건 통합	- 물리적/안전 제약을 명시적으로 반영하기 어려워 추가적인 guidance 메커니즘이 필요할 수 있음	- 미리 정의된 후보 궤적이 물리적으로 실행 가능한 범위 내에 구성되어 있어, 제약 조건(예: 충돌 회피, 차선 준수 등)을 쉽게 반영할 수 있음
표현력 및 유연성	- 연속적이고 세밀한 행동 변화까지 모델링할 수 있어 매우 높은 표현력을 가짐 - 다양한 상황에서 미세한 차이를 반영할 수 있음	- 후보 집합의 크기와 품질에 따라 표현력이 결정됨 - 후보 집합에 포함되지 않은 미세한 변화는 표현하기 어려울 수 있음
데이터 의존성	- 전체 데이터 분포를 학습하기 위해 매우 다양한 대규모 데이터가 필요함 - 데이터가 부족하면 모드 붕괴의 위험이 있음	- 다중 모달 분포 학습 역시 충분한 다양성의 데이터에 의존 - 후보 집합이 데이터 내 모드를 충분히 반영하지 못하면, 제한된 모드만 표현될 위험이 있음
확률 분포 캘리브레이션	- 최대우도 기반 학습으로 자연스럽게 확률 분포를 캘리브레이션함	- 각 후보에 대한 확률을 정확히 학습해야 함 - 한 모드에 확률이 과도하게 집중되면, 다중 모달성이 제대로 표현되지 않을 위험이 있음

표현력 면에서는
- diffusion 모델이 이론적으로 무한히 많은 다양한 궤적을 생성할 수 있는 반면,
- VADv2는 후보 집합 내에서만 표현이 가능
- 그러나 후보 집합의 크기가 매우 크다면 VADv2도 충분한 다양성을 가질 수 있습니다.
또한,
- diffusion 모델은 내재된 확률적 샘플링으로 인해 매번 다르게 생성되는 반면,
- VADv2는 주어진 입력에 대해 고정된 확률 분포를 예측하므로, 모델이 학습한 분포의 정확한 캘리브레이션에 의존하게 됩니다.

다음 토픽

This section covers the multi-objective conflict and the lack of a recovery learning signal in imitation learning,
and compares how diffusion-based planning addresses these weaknesses.
1. IL에서의 Multi-Objective Conflict 발생 원인
여러 목표의 충돌:
- 자율주행 분야의 모방 학습(IL)은 안전성, 주행 효율성 등 여러 목표를 동시에 만족해야 합니다.
- 그러나 이러한 목표들은 상충되기 쉽습니다. 예를 들어 “빠른 주행”(효율성)과 “충돌 회피”(안전성)는 트레이드오프 관계에 있습니다.
- 이렇듯 IL 정책이 둘 다 만족하도록 학습하기는 어려워, 한 목표를 충족하면 다른 목표가 희생되는 충돌 상황이 발생합니다.

보조 손실(auxiliary loss)의 영향:
- 이러한 다목적 상황을 다루기 위해 IL 모델에 보조 손실 함수를 추가하는 접근이 있습니다.
- 예를 들어, 전문가 시연 데이터를 모방하는 기본 손실 외에 안전 위반에 대한 패널티 손실을 추가해 충돌이나 이탈을 벌점으로 주는 것입니다.
- 그러나 이러한 다중 손실 최적화는 모델 학습을 어렵게 만듭니다.
- 보조 손실이 도입되면 IL 모델은 여러 학습 목표 간의 균형을 맞춰야 하는데, 자칫하면 한쪽 손실을 줄이려다 다른 쪽 손실이 늘어나는 문제가 생깁니다
- 요약하면, IL에서 여러 목적을 동시에 최적화하려다 보면 각 목적의 요구사항이 충돌하여 학습 및 정책의 품질 저하로 이어집니다.

2. 실수로부터의 회복 학습 신호 부족 문제 분석

데이터 분포 및 일반화 한계:
- 모방 학습은 훈련 데이터 분포 내에서만 최적화되기 때문에 분포 밖(out-of-distribution, OOD) 상황에 대한 일반화가 취약
- 다시 말해, IL 에이전트는 전문가 시연이 담긴 경로만 따라 배우므로, 그 범위를 벗어난 상태나 상황에 직면하면 적절한 대응을 알지 못합니다.
- 이는 Covariate Shift 문제로도 알려져 있는데,
  - 작게 시작된 오류가 시간이 지날수록 상태 분포를 점점 전문가의 분포와 다르게 만들면서 오류가 누적되는 현상
- 예를 들어 자율주행 IL 모델이 한 번 핸들을 잘못 틀어 차선을 살짝 이탈하면, 그 이후 상태는 훈련 때 본 적 없는 이탈된 위치가 됩니다.
- 이 상태에서 모델은 전문가 데이터에 없는 동작을 해야 하지만, 어떤 행동이 바람직한지 학습된 신호가 없기 때문에 엉뚱한 출력을 내거나 추가 실수를 범하기 쉽습니다.
- 결국 한 번의 실수가 자체 교정되지 않고 연쇄적인 실패로 이어질 수 있습니다.

“회복(recovery)” 행동 학습의 부재:
- 일반적인 IL은 전문가의 올바른 행동 시퀀스만 학습하므로, 실수를 저지른 후 이를 만회하는 전략을 배울 기회가 없습니다.
- 훈련 데이터에 실수 후 복구하는 사례가 거의 없기 때문에, 에이전트는 잘못된 상태에서 무엇을 해야 다시 정상 상태로 돌아올지 알지 못합니다.
- 보조 손실을 통해 일부 잘못된 행동에 패널티를 준다고 해도, 이는 그 행동을 피하도록 벌주는 것일 뿐 어떻게 복구할지 가르쳐주는 구조적 신호는 아닙니다
- 예컨대 차선 이탈 상태 자체에 큰 벌점을 주는 손실을 추가하면, 모델은 “이탈하지 말라”는 것은 배우지만 이미 이탈한 경우 다시 차선으로 복귀하는 조향 방법은 배우지 못합니다.
- 요컨대, IL의 지도학습 구조는 실패 후 보상에 대한 정보가 없어서 “잘못된 상태에서 옳은 상태로 복귀”하는 행동을 학습시키기가 어렵습니다.

IL 대비 RL의 회복 학습:
- 강화학습(RL)은 환경으로부터 보상(reward) 신호를 받아 실수 이후의 결과까지 고려하며 학습합니다.
- RL 에이전트가 잘못된 행동으로 불이익을 받았다가도 다시 목표를 달성하면 보상을 획득하므로, 시간에 걸친 누적 보상을 극대화하는 과정에서 자연스럽게 오류 이후에 상황을 수습하는 행동까지 학습됩니다.
- 반면 IL에는 이러한 환경 상호작용 신호가 없고, 오직 정답 행동 모방만 있으므로 실수 이후 전략을 터득하기 어렵습니다
- 요약하면, IL의 한계는 “실수 예방”에만 초점이 있고 “실수한 이후 어떻게 정상으로 돌아오는지” 가르칠 방법이 없다는 점입니다.
- 그 결과 IL로 학습된 정책은 안전 장치나 추가 규칙 없이는 실수가 발생할 경우 복구하지 못하고 실패로 이어질 가능성이 높습니다.

3. Diffusion 기반 플래닝을 통한 문제 해결

(1) Multi-Modal 주행 데이터 학습:
- Diffusion 모델 기반 플래닝은 다중 모드의 주행 행동 분포를 자연스럽게 학습할 수 있습니다.
- Diffusion 모델은 데이터를 생성하는 확률 분포 자체를 모델링하므로, 주어진 상황에서 나올 수 있는 여러 가지 행동 경로(모드)를 모두 포괄적으로 학습합니다.
- For example, if the same intersection allows several reasonable choices such as turning left, turning right, or going straight, a diffusion-based model can express the likelihood of each option through sampling.
- 그 결과 모델은 인간 운전자가 보이는 복잡한 다중 양상 행동까지도 모사할 수 있고, 특정 상황에서 안전을 우선시하는 행동과 효율을 우선시하는 행동 모두를 잠재적으로 생성할 수 있습니다.
- 이는 하나의 결정만 내리는 일반적인 정책과 달리 여러 목표를 충족하는 다양한 정책 공간을 학습한 것으로 볼 수 있어, multi-objective 갈등을 줄여줍니다.

(2) 안전 및 복구 메커니즘 (Classifier Guidance):

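Diffusion Planner uses a classifier guidance mechanism to obtain safe plans.
Specifically, without any additional training, the generation process is steered at inference time by a classifier on the safety or comfort of the driving path.
The guidance classifier acts as a kind of "immediate feedback": when sampling drifts toward higher risk the cost goes up, and the gradient of the probability distribution is adjusted so that generation moves toward safe directions.
As a result, Diffusion Planner can meet safety objectives at inference time without a separate safety loss at training time.
This avoids multi-objective conflict during training while still securing safety at execution.
For example, among Diffusion Planner's output paths, one that shows a possible collision has its probability lowered by the guidance, and an alternative path that avoids the obstacle or slows down is selected instead.
Thanks to this structural mechanism, the agent can detect risk and correct its path at run time, preventing mistakes or immediately correcting small ones.
In short, even though the diffusion model has not separately learned "recovery" behavior,
- the feedback loop built into the generation process pulls a path that has drifted by mistake back toward a safe direction.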
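The idea of pushing a sample along the gradient of a cost can be sketched on a toy problem. This is a simplified illustration, not Diffusion Planner's guidance code: the Gaussian-shaped collision cost, the function names, and the single gradient step applied directly to a trajectory (instead of inside each denoising step) are all assumptions made here.

```python
import numpy as np

def collision_cost(traj, obstacle, radius=2.0):
    """Per-point cost that is high when a trajectory point is near the obstacle."""
    sq_dist = np.sum((traj - obstacle) ** 2, axis=1)
    return np.exp(-sq_dist / (2.0 * radius ** 2))

def collision_cost_grad(traj, obstacle, radius=2.0):
    """Analytic gradient of the per-point cost with respect to each point."""
    per_point = collision_cost(traj, obstacle, radius)
    return -(traj - obstacle) / radius ** 2 * per_point[:, None]

def guided_update(traj, obstacle, scale=2.0, radius=2.0):
    """One guidance step: move against the cost gradient (away from the obstacle)."""
    return traj - scale * collision_cost_grad(traj, obstacle, radius)

# A straight trajectory passing 0.5 m beside an obstacle at (5, 0).
traj = np.stack([np.linspace(0.0, 10.0, 11), np.full(11, 0.5)], axis=1)
obstacle = np.array([5.0, 0.0])
guided = guided_update(traj, obstacle)

min_before = np.min(np.linalg.norm(traj - obstacle, axis=1))
min_after = np.min(np.linalg.norm(guided - obstacle, axis=1))
print(f"closest approach before: {min_before:.3f}, after: {min_after:.3f}")
```

In an actual guided sampler a step like this would be interleaved with denoising, and `scale` plays the role of the per-objective guidance weight that can be tuned at inference time.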

(3) 학습 분포 밖 상황에 대한 유연한 대응:

Diffusion 기반 플래닝은 훈련된 분포를 약간 벗어난 행동이 필요한 경우에도 상대적으로 유연하게 대응합니다.
일반 IL 모델이 전혀 보지 못한 상황에서는 출력 품질이 급격히 저하되지만, Diffusion Planner는 생성 모델의 풍부한 표현력과 가이드 메커니즘 덕분에 드문 상황도 비교적 그럴듯한 행동을 만들어낼 수 있습니다.
분포를 벗어난 행동이 요구되는 극단 상황에서, 분류기 가이드에 새로운 제약이나 목표를 주입하면 모델은 이를 반영해 기존 데이터에는 없던 해결책도 모색합니다.
예를 들어 데이터에는 거의 없었던 긴급 차량 양보 상황이 발생해도, “정지 상태 유지”와 같은 안전 우선 목표를 분류기 신호로 주면 모델은 안전하게 정차하는 경로를 생성할 수 있습니다.
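This means that, unlike conventional IL policies, which are hard to adjust after training, a diffusion model allows its behavior to be modified when conditions change at inference time.
The Diffusion Planner work also reports robust generalization to driving styles that were not used during training.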

요약:

Diffusion 기반 플래닝은 다중모달 행동생성 능력과 유연한 제어 메커니즘을 통해 IL의 한계를 보완합니다.
훈련 시에는 보조 손실 없이도 복잡한 운전자 행동 분포를 학습하여 다목적 충돌 문제를 완화하고,
실행 시에는 분류기 지침을 통한 실시간 피드백으로 안전한 경로 선택과 오류 수정을 가능케 합니다.

Probabilistic IL Planning은 Out-of-Distribution(OOD)에 강인한가? Diffusion 기반 Planning과 비교

✅ 두괄식 결론
- VADv2와 같은 Probabilistic Planning은 단순한 BC보다 OOD에 강인하지만, Diffusion 기반 Planning보다는 OOD 일반화 능력이 낮을 가능성이 큽니다.
- Diffusion 기반 Planning은 연속적인 분포를 모델링할 수 있어 OOD 상황에서 더 유연한 대처가 가능
Probabilistic Planning의 OOD 강인성 분석:
- 이 방식은 단순한 deterministic behavior cloning보다 더 나은 일반화 성능을 보일 수 있습니다.
- 예를 들어, VADv2가 보지 못한 도로 환경에서 실행될 경우, 여러 개의 후보 궤적을 평가하고 확률적으로 가장 적절한 선택지를 샘플링할 수 있습니다.
Probabilistic Planning의 OOD 성능의 주요 한계
- VADv2는 여전히 학습 데이터 내의 후보 궤적 집합에 의존
- 따라서 학습 데이터에서 경험하지 못한 극단적인 OOD 상황에서는 비현실적인 궤적을 출력하거나 모드 붕괴(mode collapse)가 발생할 가능성이 있음
Diffusion 기반 Planning과 비교:
- Diffusion 모델은 OOD 상황에서 더 유연하게 대응
- 이는 단순한 후보 궤적 집합을 선택하는 방식보다 연속적인 일반화 능력을 제공
- Diffusion 모델은 데이터 분포를 직접 모델링하는 생성적 접근법을 사용하므로, 학습되지 않은 OOD 상황에서도 기존 데이터 분포에 가깝게 샘플링할 가능성이 높습니다.
- Diffusion 기반 Planning의 OOD 한계:
  - 학습 데이터에 포함되지 않은 완전히 새로운 환경에서는 Diffusion 모델도 여전히 제한적인 성능을 보일 수 있습니다.
  - 하지만, classifier guidance와 같은 보조 신호를 이용하면, OOD 상황에서 안전하고 일관된 행동을 강화할 수 있습니다.

2. Hydra-MDP가 Multi-Objective Conflict를 해결할 수 있는가?

✅ 두괄식 결론:
- Hydra-MDP는 Multi-Objective Conflict 문제를 완화할 수 있지만, 완전히 해결하지는 못함.
- 특히 inference 단계에서 여러 head의 출력을 적절히 결합하는 방식이 필요하며, 단순한 Multi-Head 구조만으로는 완벽한 해결책이 될 수 없음.

Hydra-MDP가 MOC를 해결하는 방식:
- Hydra-MDP는 여러 개의 학습 목표를 개별 head에 할당하여 학습하는 Multi-Target Learning 접근 방식을 사용합니다.
- 즉, 한 모델이 모든 목표를 동시에 최적화하는 것이 아니라, 각 head가 특정 목표를 전담하여 학습함으로써, 다중 목표 충돌을 완화할 수 있습니다.
Hydra-MDP의 강점:
- 각 head가 특정 목표를 담당하여 학습하기 때문에, 하나의 네트워크에서 모든 목표를 동시에 최적화하려고 할 때 발생하는 손실 함수 간의 충돌을 줄일 수 있음.
- 예를 들어, 하나의 head는 안전한 주행을 학습하고, 다른 head는 효율적인 주행을 학습하는 방식으로 학습이 진행되면,
  - 개별적으로 최적화가 이루어져 MOC 문제가 완화될 수 있음.
Hydra-MDP의 한계:
- Multi-Head 구조를 사용하더라도, inference 시에는 결국 하나의 행동을 선택해야 하므로 다중 목표 간 충돌이 완전히 해소되는 것은 아님.
- Head 간 학습이 독립적이지만, inference 시 어떤 head를 선택할지 결정하는 방식이 적절하지 않다면,
  - 여전히 모순된 목표를 반영하는 행동이 선택될 위험이 존재함.

3. Diffusion 기반 Planning이 Multi-Objective Conflict 문제를 피할 수 있는 이유

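✅ Conclusion first:
- Diffusion-based planning injects objectives at inference time rather than during training, so it avoids the conflicts that arise when several objectives are optimized together at training time, while still allowing flexible adjustment.
- Diffusion can achieve multiple objectives to some degree even when trained with only a single objective function (imitating humans).

Reason 1: Diffusion can achieve multiple objectives to some degree even when trained with only a single objective function (imitating humans). The reasons are:
- Human demonstrations are themselves multi-modal, and behavior cloning tends not to learn such a complex distribution fully.
- Unlike diffusion, IL learns a non-continuous probability distribution.
- Diffusion, in other words, learns the desirable continuous behavior distribution inherent in the data as it is.
Reason 2: guidance
- Above all, guidance is applied at inference time, so the conflicts that come from optimizing several objectives at once during training are avoided.
- Also, at inference, with no additional training,
  - the guidance strength for each objective (ride comfort, target speed, collision risk) can be set with a weight,
  - so the trade-off between objectives can be tuned finely. (This advantage can also be shared by Hydra-MDP.)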

Trajectory에 속도/가속도/곡률변화 추가하기

Wed, 19 Mar 2025 12:13:54 GMT

문제 정의

input (주어진 정보)
- delta t초 간격으로 M개 point의 궤적이 있음.
- 다만, point 당 정보는 x,y,yaw만 있는 상태
output (구하고 싶은 것)
- 위 궤적에서, 각 point당 velocity, acceleration 등을 구하고 싶음

방법 1: LQR 코드에서 reference point 잡는 방식

함수: get_velocity_curvature_profiles_with_derivatives_from_poses
1. velocity, acceleration 구하기
여기서 속도, 가속도는 각각 스칼라 값이다. x , y 방향을 따로 구하지 않는다.

1.1. _fit_initial_velocity_and_acceleration_profile

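Purpose
- A function that computes the velocity at the first point and the acceleration at every point
- so that the displacement between consecutive future points is explained well.
Assumption
- Constant acceleration within each interval (of length delta t seconds).
Method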
1.2. _generate_profile_from_initial_condition_and_derivatives

목적
- 위 함수에서 구한 output으로부터, 모든 지점의 속도를 구한다.
방식
- 등가속도 운동공식을 통해 쉽게 도출
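The two steps above (fit the initial velocity and per-interval accelerations, then roll them out into a velocity profile) can be sketched as a small least-squares problem. This is a simplified illustration and not the code of the functions named above: the displacement model d_k ≈ v_k · Δt with v_k = v_0 + Δt · Σ_{j<k} a_j, the jerk penalty, and the name `fit_velocity_and_acceleration` are assumptions made here.

```python
import numpy as np

def fit_velocity_and_acceleration(poses, dt, jerk_penalty=1e-4):
    """Fit v0 and per-interval accelerations from (x, y, yaw) poses; return (vel, acc)."""
    poses = np.asarray(poses, dtype=float)
    xy_disp = np.diff(poses[:, :2], axis=0)
    heading = poses[:-1, 2]
    # Signed displacement along the heading for each interval.
    disp = xy_disp[:, 0] * np.cos(heading) + xy_disp[:, 1] * np.sin(heading)
    n = disp.shape[0]
    # Unknowns z = [v0, a_0, ..., a_{n-2}];  disp_k = v0*dt + dt^2 * sum_{j<k} a_j.
    A = np.zeros((n, n))
    A[:, 0] = dt
    for k in range(1, n):
        A[k, 1:k + 1] = dt ** 2
    # Soft penalty on acceleration changes (jerk) to keep the fit smooth.
    num_acc = n - 1
    D = np.zeros((max(num_acc - 1, 0), n))
    for j in range(num_acc - 1):
        D[j, 1 + j] = -1.0
        D[j, 2 + j] = 1.0
    A_aug = np.vstack([A, np.sqrt(jerk_penalty) * D])
    b_aug = np.concatenate([disp, np.zeros(D.shape[0])])
    z, *_ = np.linalg.lstsq(A_aug, b_aug, rcond=None)
    v0, acc = z[0], z[1:]
    # Roll out the velocity profile with the constant-acceleration formula.
    vel = v0 + dt * np.concatenate([[0.0], np.cumsum(acc)])
    return vel, acc

# Straight-line trajectory with v0 = 5 m/s and a = 1 m/s^2, sampled every 0.1 s.
dt = 0.1
speeds = 5.0 + 1.0 * dt * np.arange(10)
x = np.concatenate([[0.0], np.cumsum(speeds * dt)])
poses = np.stack([x, np.zeros_like(x), np.zeros_like(x)], axis=1)
vel, acc = fit_velocity_and_acceleration(poses, dt)
print(vel[:3], acc[:3])
```

Velocity and acceleration come out as scalars along the heading, matching the note above that x and y components are not fitted separately.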

2. curvature와 curvature rate 구하기

2.1. _fit_initial_curvature_and_curvature_rate_profile

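Purpose
- A function that computes the curvature at the first point and the curvature rate at every point
- so that the heading change between consecutive future points is explained well.
Method
- Uses $\Delta \text{heading} \approx v \cdot \text{curvature} \cdot \Delta t$
- Assumes the curvature rate is constant within each interval (delta t)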

2.2. _generate_profile_from_initial_condition_and_derivatives

목적
- 위에서 구한 정보들을 이용해서, 모든 점의 curvature을 구한다.

At this point the velocity and curvature are known for every point of the future trajectory.

3. Extra: computing the reference point
_compute_reference_velocity_and_curvature_profile
Purpose
- Using the velocity_profile and curvature_profile computed above,
- find the following two values at the time (current time + lookahead):
  - Velocity: reference_velocity, found by interpolation (np.interp).
  - Curvature: reference_curvature_profile, also extracted with np.interp.
- Note: how the lookahead time is chosen
  - (= current_iteration.time_point.time_s + tracking_horizon(=10 point) * discretization_time)

3.1. Details
Both velocity and curvature are obtained by plain linear interpolation; nothing more.
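The lookahead lookup can be sketched in a few lines with `np.interp`. The function name `reference_at_lookahead` and the example profiles are made up for illustration; only the lookahead formula (current time + tracking_horizon × discretization_time) comes from the notes above.

```python
import numpy as np

def reference_at_lookahead(times, velocity_profile, curvature_profile,
                           current_time, tracking_horizon, discretization_time):
    """Linearly interpolate velocity and curvature at the lookahead time."""
    lookahead_time = current_time + tracking_horizon * discretization_time
    reference_velocity = float(np.interp(lookahead_time, times, velocity_profile))
    reference_curvature = float(np.interp(lookahead_time, times, curvature_profile))
    return reference_velocity, reference_curvature

# Example profiles sampled every 0.1 s over 2 s.
times = np.linspace(0.0, 2.0, 21)
velocity_profile = 5.0 + times          # m/s, increasing linearly
curvature_profile = 0.01 * times        # 1/m, increasing linearly

ref_v, ref_k = reference_at_lookahead(
    times, velocity_profile, curvature_profile,
    current_time=0.25, tracking_horizon=10, discretization_time=0.1,
)
print(ref_v, ref_k)
```

Note that `np.interp` clamps to the end values when the lookahead time falls outside the profile, so a lookahead beyond the last trajectory point simply returns the last velocity and curvature.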

Inductive Moment Matching

Tue, 18 Mar 2025 02:00:29 GMT

0. 원본 자료

논문: https://arxiv.org/pdf/2503.07565 (2025년 3월 10일 생 논문, DDIM 저자들이 쓴 논문)
서브-논문: https://arxiv.org/pdf/2503.07154
영문 블로그: https://lumalabs.ai/news/inductive-moment-matching
github : https://github.com/lumalabs/imm

Abstract

기존 연구에서는 diffusion의 추론 속도를 높이기 위해 두 가지 방법을 사용했습니다.

첫번째 방법: Distillation
- 목표: 원래 수백 단계의 생성 과정을 몇 단계(few-step)로 압축합니다.
- 문제점:
  - 한 번에 큰 비선형 변화를 예측해야 하므로, 학습이 불안정해지고 에러가 누적될 위험
  - 생성 분포와 실제 데이터 분포의 평균, 분산 등 모든 통계 정보를 맞춰야 하는데,
    - 단계가 줄어들면 이 과정이 훨씬 민감해집니다.
    - (각 단계가 담당하는 역할이 커지면서 목표 함수의 변화폭도 커짐)
  - 결과적으로 매우 세밀한 하이퍼파라미터 튜닝이 필요

두번째 방법: Consistency Model (CM)
- 목표: 한 스텝으로 생성한 결과와 여러 스텝으로 생성한 결과가 비슷하도록 만듭니다.
- 한계점:
  - CM은 주로 개별 샘플의 각 시점의 예측값 차이를 줄여 평균(첫 번째 모멘트)만 맞추게 됩니다.
  - 그래서 분산이나 왜도와 같은 고차원 통계 정보(나머지 모먼트 값들)는 충분히 반영되지 않을 수 있습니다.
  - 이로 인해 CM은 학습 도중 불안정해짐

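IMM's solution:
- Even with sampling cut down to 1~4 steps, quality is reported to be better than diffusion and flow matching.
- Unlike earlier methods for speeding up sampling,
  - a single model is trained end to end with a single objective, so no complicated two-stage training or teacher model is needed
  - convergence in distribution is guaranteed in theory
- It works well without relying on special hyperparameter design, so the model architecture and training setup can be chosen flexibly.
- Training is stable, so performance improves as model size or training budget grows (positive scaling behavior).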

구체적 동작 방식
- 생성된 데이터 분포와 실제 데이터 분포의 분포 통계 모멘트들(평균,분산, 왜도, 첨도 등)를 일치시키는 Maximum Mean Discrepancy (MMD) 기반 목표함수를 사용하여 분포를 맞추는 것이 핵심
- t (더 노이지)에서 s( 덜 노이지) 데이터 분포 를 유추하는게, 딥러닝 네트워크의 목표.
- 학습 초반에는, t와 s 차이를 적게 하다가, 갈수록 키우는 방법을 제안
- 아래 그림으로 느낌만 잡고, 넘어가도, 밑 글들에서 더 깊게 이해할 수 있습니다:)
- Consistency Model과 IMM의 관계
  - 논문에서는 Consistency Model(CM)이 사실 IMM의 특수 케이스라는 점을 수학적으로 보입니다.
  - 구체적으로, CM은 ‘단일 입자(single-particle)’ 그리고 ‘1차 모멘트(평균값만)’ 정도만 맞추는 버전이어서, 전체 분포를 폭넓게 커버하지 못합니다.
  - 이런 제한이 CM을 훈련할 때 안정성이 떨어지는 이유 중 하나라고 설명하죠.
- 즉, IMM은 더 일반적인 분포 매칭 방식(모든 모멘트까지 고려)으로 안정적으로 학습하고, CM은 그중 일부(1차 모멘트)만 맞추는 경우라서 불안정성이 클 수 있다는 얘기예요.

1. Introduction

1.1. stochastic interpolants & time-dependent marginal distributions

본 논문은 “stochastic interpolants(확률적 보간 함수)의 시간별 분포(time-dependent marginal distributions) 를 다룬다.”

stochastic interpolant(확률 보간 함수)이란?

Albergo et al. (2023)이 개념 제안
아이디어: 데이터($\mathbf{x}$)와 노이즈($\boldsymbol{\epsilon}$) 사이를 확률적으로 “연결(interpolate)”하는 방식을 통칭.
Albergo 등은 $\mathbf{x}_t$가 $\mathbf{x}$와 $\boldsymbol{\epsilon}$을 섞되, 그 섞임이 확률적 분포(가우시안)를 이룬다고 정의
- 즉, $\mathbf{x}_t \mid (\mathbf{x}, \boldsymbol{\epsilon}) \sim \mathcal{N}\big(I_t(\mathbf{x}, \boldsymbol{\epsilon}), \gamma_t^2 \mathbf{I}\big)$ 분포로 정해짐. (가장 중요)
- 여기서 $I_t(\cdot)$는 평균(보간의 중심), $\gamma_t$는 해당 시점의 잡음 크기.
- 경계조건:
  - $t = 0$에서 $I_0(\mathbf{x}, \boldsymbol{\epsilon}) = \mathbf{x}$ (즉, 데이터)
  - $t = 1$에서 $I_1(\mathbf{x}, \boldsymbol{\epsilon}) = \boldsymbol{\epsilon}$ (즉, 노이즈)
  - $\gamma_0 = \gamma_1 = 0$ (초기·최종 시점엔 추가 잡음 없이 확정적)

time-dependent marginal distributions 란?

각 시간 (t)마다의 “진짜 분포”
$\mathbf{x}_t \mid (\mathbf{x}, \boldsymbol{\epsilon}) \sim \mathcal{N}\big(I_t(\mathbf{x}, \boldsymbol{\epsilon}), \gamma_t^2 \mathbf{I}\big)$
stochastic interpolant(확률 보간 함수) 로 인해 정의됨

IMM이 어떻게 활용하나?

IMM은 노이즈 상태에서 데이터로 돌아오는 과정을 소수 단계(few-step)로 끝낼 수 있게 하려 합니다.
이를 위해, 임의의 시간 (t)에서, 모델이 목표 시간 (s)로 한 번에 점프(jump)할 수 있게 만든 뒤,
그 결과가 실제 (s) 시점 분포(time-dependent marginal distribution) 와 같아지도록(= “moment matching”) 훈련합니다.
- stochastic interpolant를 활용해 매 시점의 분포를 직접 정의해놓으면, 모델이 “현재 시간 분포→목표 시간 분포”를 맞추기 쉬워져요.
이렇게 하면 확산 모델처럼 모든 단계 하나하나를 거치지 않고도, 필요한 단계만큼만 써서 최종 이미지를 얻을 수 있어요.

1.2. construction by induction

수학적 귀납법(induction)이란?

작은(기본) 구간에서 참인 성질이 더 큰 구간에도 이어지도록 하여, 전체 구간에 대해 성립함을 보이는 논리.
예: (n=1)에 참인 것이 $n$에서 참이면 $n+1$에서도 참이라면, 모든 (n)에 대해 참이 되는 것과 같죠.

IMM에서의 ‘construction by induction’ (귀납적으로 구성함)

시간축을 data (0)부터 noise (1)까지 놓고, 예를 들어 (0 < s < r < t < 1) 구간을 잡아요. (아래 그림처럼)
시점 (r)에서 (s)로 가는 결과와, 시점 (t)에서 (s)로 가는 결과가 같은 분포를 내도록 학습해요. (즉, 둘의 차이를 최소화) (아래 그림의 주황색 형광팬)
- 시작점이 달라도, 결과는 같게
- 유식한 말로 self-consistency라고 부르기도 해요.
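Stacking this "different start, same result" condition over small intervals (that is, extending it inductively over the whole time axis),
- the noise distribution → data distribution transformation ends up consistent over the entire interval (0~1).
  - The gaps between s, r, t are kept small early in training and grow over time, so that the construction gives the same result for arbitrary s, r, t.
- In other words, construction by induction is said to guarantee convergence of the transformation from noise to the data distribution.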

2. Preliminaries

2.1 Diffusion, Flow Matching and Interpolants

In short, Flow Matching and $v$-prediction diffusion
- can each be viewed as a special case of stochastic interpolants.
VP (Variance Preserving) diffusion models and Flow Matching both
- interpolate between data and noise over time,
- train a neural network to predict the "velocity" (rate of change over time),
- and at inference generate images by solving the probability flow ODE (PF-ODE).

데이터 $\mathbf{x}$와 잡음 $\boldsymbol{\epsilon}$ 사이의 보간(Interpolation)

VP (Variance-Preserving) diffusion models and Flow Matching (FM) models mix the original data ($\mathbf{x}$) with Gaussian noise ($\boldsymbol{\epsilon}$) according to time $t$ to produce $\mathbf{x}_t$.
As a formula this is $$ \mathbf{x}_t = \alpha_t \mathbf{x} + \sigma_t \boldsymbol{\epsilon} $$ where
- at $t=0$ ($\alpha_0 = 1, \sigma_0 = 0$): $\mathbf{x}_t$ is the data $\mathbf{x}$
- at $t=1$ ($\alpha_1 = 0, \sigma_1 = 1$): $\mathbf{x}_t$ is pure noise $\boldsymbol{\epsilon}$
VP diffusion uses, for example, $$ \alpha_t = \cos\!\bigl(\tfrac{\pi t}{2}\bigr), \quad \sigma_t = \sin\!\bigl(\tfrac{\pi t}{2}\bigr) $$
and Flow Matching (FM) uses the simpler linear interpolation $$ \alpha_t = 1 - t, \quad \sigma_t = t $$
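The two schedules can be compared numerically. This is a small sketch; the function names `vp_cosine`, `fm_linear`, and `interpolate` are made up for illustration.

```python
import numpy as np

def vp_cosine(t):
    """VP schedule: alpha_t = cos(pi t / 2), sigma_t = sin(pi t / 2)."""
    return np.cos(0.5 * np.pi * t), np.sin(0.5 * np.pi * t)

def fm_linear(t):
    """Flow Matching schedule: alpha_t = 1 - t, sigma_t = t."""
    return 1.0 - t, t

def interpolate(x, eps, t, schedule):
    """x_t = alpha_t * x + sigma_t * eps."""
    alpha, sigma = schedule(t)
    return alpha * x + sigma * eps

rng = np.random.default_rng(0)
x = rng.normal(size=4)      # stand-in for a data sample
eps = rng.normal(size=4)    # Gaussian noise

for name, schedule in [("VP", vp_cosine), ("FM", fm_linear)]:
    at_data = np.allclose(interpolate(x, eps, 0.0, schedule), x)
    at_noise = np.allclose(interpolate(x, eps, 1.0, schedule), eps)
    alpha, sigma = schedule(0.5)
    print(name, at_data, at_noise, alpha ** 2 + sigma ** 2)
```

Both schedules hit the same endpoints; the difference is in between, where the VP schedule keeps $\alpha_t^2 + \sigma_t^2 = 1$ (hence "variance preserving") and the linear one does not.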

속도 $\mathbf{v}_t$를 이용한 학습

Both $v$-prediction diffusion and Flow Matching have a neural network predict how $\mathbf{x}_t$ should change over time (the "velocity" $\mathbf{v}_t$).
The velocity is defined as $$ \mathbf{v}_t = \alpha'_t\,\mathbf{x} + \sigma'_t\,\boldsymbol{\epsilon} $$ where $\alpha'_t, \sigma'_t$ are the time derivatives of $\alpha_t, \sigma_t$.
That is, the network $G_{\theta}(\mathbf{x}_t, t)$ is trained to predict $d\mathbf{x}_t/dt$: the direction in which $\mathbf{x}_t$ should move over an infinitesimal time step.

Sample generation with the probability flow ODE (PF-ODE)

After training, data is generated by solving the differential equation for $\mathbf{x}_t$ $$ \frac{d\mathbf{x}_t}{dt} = G_{\theta}(\mathbf{x}_t, t) $$
The initial value is the noise $\boldsymbol{\epsilon}$ (at $t=1$), and the state gradually moves toward $\mathbf{x}_0$ (the data).
This is how diffusion models and Flow Matching actually produce an image.
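Solving the ODE from $t=1$ down to $t=0$ can be sketched with a plain Euler integrator. To keep the example checkable, the trained network is replaced by an oracle: for the FM path and a dataset consisting of a single point, $\mathbb{E}[\mathbf{v}_t \mid \mathbf{x}_t] = (\mathbf{x}_t - \mathbf{x})/t$. The oracle and the function names are assumptions made for illustration, not part of any library.

```python
import numpy as np

def oracle_velocity(x_t, t, x_data):
    """E[v_t | x_t] on the FM path when the dataset is the single point x_data."""
    # x_t = (1 - t) x + t eps  =>  eps = (x_t - (1 - t) x) / t,  v = eps - x = (x_t - x) / t
    return (x_t - x_data) / t

def euler_sample(velocity_fn, x_start, num_steps):
    """Integrate dx/dt = velocity_fn(x, t) from t = 1 to t = 0 with Euler steps."""
    ts = np.linspace(1.0, 0.0, num_steps + 1)
    x = x_start.copy()
    for t_now, t_next in zip(ts[:-1], ts[1:]):
        # t_next - t_now is negative: we move backwards in time, toward the data.
        x = x + (t_next - t_now) * velocity_fn(x, t_now)
    return x

rng = np.random.default_rng(0)
x_data = np.array([1.0, -2.0, 0.5])
eps = rng.normal(size=3)                      # start from pure noise at t = 1

sample = euler_sample(lambda x, t: oracle_velocity(x, t, x_data), eps, num_steps=10)
print(sample)
```

With a learned $G_\theta$ in place of the oracle, the number of Euler steps becomes the sampling cost that few-step methods such as IMM try to reduce.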

왜 Stochastic Interpolants가 “diffusion & FM”을 통합한거야?

Diffusion 모델도 데이터와 잡음 사이를 점진적으로 섞으면서,
Flow Matching (FM)도 선형(혹은 간단한) 보간으로 데이터↔잡음을 연결합니다.
Stochastic Interpolants는 이 둘을 일반화/통합한 개념으로,
- “데이터와 노이즈 사이의 경로를 확률적으로 설정하고, 그 경로를 학습한다”라는 시각을 제공합니다.
- $\mathbf{x}_t \mid (\mathbf{x}, \boldsymbol{\epsilon}) \sim \mathcal{N}\big(I_t(\mathbf{x}, \boldsymbol{\epsilon}), \gamma_t^2 \mathbf{I}\big)$

[Flow Matching] $\gamma_t = 0$, $I_t(\mathbf{x}, \boldsymbol{\epsilon}) = \alpha_t\,\mathbf{x} + \sigma_t\,\boldsymbol{\epsilon}$
- The interpolation is fully determined (no extra randomness) and linear, so $\mathbf{v}_t = \alpha_t' \mathbf{x} + \sigma_t' \boldsymbol{\epsilon}$ is identical to the velocity used in Flow Matching (FM).
- Training and inference are therefore the same as in FM.
[Diffusion Model] the case $\boldsymbol{\epsilon} \sim \mathcal{N}(\mathbf{0}, \mathbf{I})$
- Using stochastic interpolants here effectively recovers $v$-prediction diffusion (the same formula structure as existing diffusion models).

Conditional Interpolant Velocity $\mathbf{v}_t$

$\mathbf{v}_t$는 시간에 따라 $\mathbf{x}_t$가 어떻게 변해야 하는지(=속도)를 나타냅니다.
식: $\mathbf{v}_t = \partial_t I_t(\mathbf{x}, \boldsymbol{\epsilon}) + \gamma_t' \mathbf{z}$
- $\partial_t I_t$는 평균(보간 중심)이 $t$에 따라 변하는 방향,
- $\gamma_t' \mathbf{z}$는 추가 잡음(무작위성)이 포함될 수 있는 방향을 의미해요.
뉴럴넷 $G_{\theta}$로 $\mathbf{v}_t$를 예측하려면, $\mathbf{x}_t$가 주어졌을 때의 평균적인 속도를 추정하면 됩니다.
- $G_{\theta}(\mathbf{x}_t, t) \approx \mathbb{E}[\mathbf{v}_t \mid \mathbf{x}_t]$

2.2. Maximum Mean Discrepancy (MMD) ?

Maximum Mean Discrepancy 이름의 유래 (재미로 보고 넘어가자)
- 두 분포를 가장 잘 구분해주는(="평균 차이(Mean Discrepancy)"를 최대화(Maximum) 해주는) Metric Function이다.

이걸 쓰는 목적: 쉬운 비유로 예시 들어보기

예) “클래스A, 클래스B 학생들의 시험점수 분포”

예시 상황
- 분포 (p): 클래스A 학생들 점수.
- 분포 (q): 클래스B 학생들 점수.
평균 점수(1차 모멘트)만 비교하면, “A반 평균 70, B반 평균 72, 비슷하다” 정도 알겠죠.
하지만 실제로는 A반은 분산이 매우 작고(점수가 고르게 60~80 사이), B반은 분산이 크고 양 극단(0점부터 100점까지)일 수도 있잖아요.
MMD는 단순 “평균값”뿐 아니라, 분포 형태 전체(모멘트 전부 포함, 평균/분산/왜도/첨도)를 커널 함수을 통해 파악하므로,
- “두 점수 분포가 얼마나 다른가”를 훨씬 자세히 알 수 있게 됩니다.
아래 그림을 꼼꼼히 이해하려고 해보자.

RBF 커널의 특징

커널 = "데이터 간 유사도 측정" 함수
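The kernel formula in the figure above (orange text), $\exp(-\|\mathbf{x}-\mathbf{y}\|^2)$, is one example of an RBF kernel.
- RBF stands for Radial Basis Function: "radial" because the value depends only on the distance $\|\mathbf{x}-\mathbf{y}\|$ between the two points.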
RBF 커널을 쓰면, 그 “무한 차원 확장” 속에 분포의 다양한 모멘트 정보가 다 들어 있다고 해석할 수 있습니다.
결과적으로 MMD를 작게 만들려면, 분포의 1차 모멘트, 2차 모멘트, 3차 모멘트... 등등이 전부 유사해야(거의 같아야) 해요.
그래서 RBF 커널 기반 MMD가 분포 전반을 잘 구분할 수 있게 됩니다.

왜 MMD는 “유용”할까?

안정적 비교: MMD는 GAN처럼 “적대적 학습(미니맥스)” 없이도, 분포 사이 차이를 측정하고 줄여나갈 수 있어요.
분포 전반: 단순히 평균이나 분산만 맞추는 게 아니고, 더 풍부한 정보(모멘트 전부)를 포함해 “거의 모든” 차이를 발견할 수 있습니다.
간단한 구현: 실제 코딩 때는 샘플들끼리 커널 함수값을 계산해 평균 내면 되므로, 비교적 구현이 쉽습니다.
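The "compute kernel values between samples and average them" recipe looks roughly like this. A minimal sketch of the (biased) squared-MMD estimator with an RBF kernel; the bandwidth value and the exam-score numbers are assumptions chosen to mirror the class A / class B example above.

```python
import numpy as np

def rbf_kernel(a, b, bandwidth=1.0):
    """k(x, y) = exp(-||x - y||^2 / (2 * bandwidth^2)) for all pairs of rows."""
    sq_dist = (np.sum(a ** 2, axis=1)[:, None]
               + np.sum(b ** 2, axis=1)[None, :]
               - 2.0 * a @ b.T)
    return np.exp(-np.maximum(sq_dist, 0.0) / (2.0 * bandwidth ** 2))

def mmd_squared(x, y, bandwidth=1.0):
    """Biased estimate of MMD^2: mean k(x,x') + mean k(y,y') - 2 mean k(x,y)."""
    return (rbf_kernel(x, x, bandwidth).mean()
            + rbf_kernel(y, y, bandwidth).mean()
            - 2.0 * rbf_kernel(x, y, bandwidth).mean())

rng = np.random.default_rng(0)
class_a = rng.normal(70.0, 5.0, size=(500, 1))     # tight scores around 70
class_a2 = rng.normal(70.0, 5.0, size=(500, 1))    # another draw from the same distribution
class_b = rng.normal(72.0, 25.0, size=(500, 1))    # similar mean, much wider spread

mmd_same = mmd_squared(class_a, class_a2, bandwidth=10.0)
mmd_diff = mmd_squared(class_a, class_b, bandwidth=10.0)
print(f"same distribution: {mmd_same:.4f}, different spread: {mmd_diff:.4f}")
```

The two classes have nearly the same mean, yet the estimate separates them clearly, which is the point of matching more than the first moment.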

3. 본문: Inductive Moment Matching

제목 해석
- 귀납적인(Inductive) "평균, 분산, 왜도, 첨도 ..."(moment) 맞추기(Matching)!
t에서의 distribution에서, s로의 distribution으로 변형하는
- implicit one-step model(one-step sampler)을 학습하는 것을 제안
one-step model은
- t=1 -> t=0 으로 한번에 가는것도 가능하고,
- any t -> r -> s 로 recursive하게 가는 것도 가능하다. (s=0이 될 때까지)
  - 이것은 bootstrapping(점진적 확장)을 통해, 모델이 생성한 sample로 부터, 모델이 스스로 학습할 수 있게 하는 것을 허용한다! (아래 글 읽다보면 이해됩니다.)

3.1. Model Construction via Interpolants

두괄식 정리

핵심 개념:
1. stochastic Interpolant (복습): 시간 $t$를 매개로 데이터 $\mathbf{x}$ ↔ 노이즈 $\boldsymbol{\epsilon}$ 간에 확률적으로 섞이는 경로.(혹은 함수)
  - time augmented interpolation $\mathbf{x}_t \mid (\mathbf{x}, \boldsymbol{\epsilon}) \sim q_t(\mathbf{x}_t|\mathbf{x}, \boldsymbol{\epsilon}) \sim \mathcal{N}\big(I_t(\mathbf{x}, \boldsymbol{\epsilon}), \gamma_t^2 \mathbf{I}\big)$
2. Generalized Interpolant: “$\mathbf{x}$, $\mathbf{x}_t$를 조건으로 시간 $s$에 어떤 확률분포로 보간할지”를 더 확장된 형태로 논문에서 정의 (0($x$)
3. stochastic Interpolant를 포함하는 더 넓은 개념

Marginal-Preserving: Generalized Interpolant로 정의된 “$s$ 시점 분포”가, 원래 우리가 의도한 $q_s(\mathbf{x}_s)$와 똑같아지도록 유지 → “1-step sampler”를 써도 정확히 $q_s$를 얻는다.

어떻게 쓰이나?

이렇게 정의해 놓으면, “주어진 $\mathbf{x}_t$ 샘플을 시간 $s$로 바꾸는 과정”이 전부 하나의 알고리즘(혹은 함수)으로 정리됩니다.
즉, $\mathbf{x}_t$를 넣으면 $\mathbf{x}_s$가 나오는 1-step 변환기, 이 변환기가 잘 만들어지면, $t=1$에서 시작해도 $\mathbf{x}_0$ (실제 데이터) 상태로 한 번에 갈 수 있음을 보장.

논문의 전반적 목표: “한 단계(1-step) 변환”으로 분포를 옮기기

배경
- “시간 $t$에서의 분포 $q_t(\mathbf{x}_t)$”를 “시간 $s$에서의 분포 $q_s(\mathbf{x}_s)$”로 정확히 옮겨주는 특별한 보간(interpolation) 개념이 필요해요.
  - 이걸 “marginal-preserving interpolant”(주변분포를 보존하는 보간)이라고 부릅니다.

설명 시작: 식 (2)에서 말하는 “시간축 보간” 개념

$$ q_t(\mathbf{x}_t) = \iint q_t(\mathbf{x}_t \mid \mathbf{x}, \boldsymbol{\epsilon})\, q(\mathbf{x})\, p(\boldsymbol{\epsilon})\, d\mathbf{x}\, d\boldsymbol{\epsilon}. \tag{2} $$
$\mathbf{x} \sim q(\mathbf{x})$: a sample from the real data distribution.
$\boldsymbol{\epsilon} \sim p(\boldsymbol{\epsilon})$: a sample from a simple predefined distribution (usually Gaussian noise).
$q_t(\mathbf{x}_t \mid \mathbf{x}, \boldsymbol{\epsilon})$: the time-augmented interpolation (built on the stochastic interpolant) that defines how a sample $\mathbf{x}_t$ at time $t$ is made from $\mathbf{x}$ and $\boldsymbol{\epsilon}$.

So $\mathbf{x}_t$ can be seen as "the state at time $t$ obtained by mixing $\mathbf{x}$ and $\boldsymbol{\epsilon}$ appropriately".

Integrating all of this out (over $d\mathbf{x}, d\boldsymbol{\epsilon}$) gives $q_t(\mathbf{x}_t)$: the distribution seen when only $\mathbf{x}_t$ is observed (the marginal).

Put together over every data sample ($\mathbf{x}$) and every noise sample ($\boldsymbol{\epsilon}$), this yields the "distribution of pictures at time t" ($q_t$).

“1-step sampler”로 $q_t(\mathbf{x}_t) \to q_s(\mathbf{x}_s)$ 전환

“$\mathbf{x}_t$ 분포를 갖고 있는 샘플을, 곧장 $\mathbf{x}_s$ 분포의 샘플로 바꿔 주는(‘transform’) 수 있는 모델을 만들자!”
“그런 모델을 만들려면, $s$ 시점의 주변분포 $q_s(\mathbf{x}_s)$와 일치하도록 하는 특별한 보간 과정이 필요하다.”
이 특별한 보간을 marginal-preserving interpolant라고 부르겠다는 것.
- 참고: 주변분포(marginal distribution)라 함은, “$\mathbf{x}_s$만 봤을 때의 분포를 의미”
  - “preserving”은 “그걸 그대로 지켜낸다”는 뜻.

식 (3) generalized interpolant 정의

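stochastic interpolant: $q_t(\mathbf{x}_t \mid \mathbf{x}, \boldsymbol{\epsilon}) = \mathcal{N}\big(I_t(\mathbf{x}, \boldsymbol{\epsilon}), \gamma_t^2 \mathbf{I}\big)$
The generalized interpolant describes the step from t to s (s ← t).

generalized interpolant: $$ q_{s|t}(\mathbf{x}_s \mid \mathbf{x}, \mathbf{x}_t) = \mathcal{N}\bigl(I_{s|t}(\mathbf{x}, \mathbf{x}_t), \gamma_{s|t}^2 \mathbf{I}\bigr) \tag{3} $$

Constraints:
$I_{t|t}(\mathbf{x},\mathbf{x}_t) = \mathbf{x}_t,\quad I_{0|t}(\mathbf{x},\mathbf{x}_t) = \mathbf{x},\quad \gamma_{t|t} = \gamma_{0|t} = 0,\quad q_{t|1}(\mathbf{x}_t \mid \mathbf{x}, \boldsymbol{\epsilon})=q_t(\mathbf{x}_t \mid \mathbf{x}, \boldsymbol{\epsilon})$

In other words, the generalized interpolant defines a Gaussian interpolation that connects the sample $\mathbf{x}_t$ at time $t$ with the original data $\mathbf{x}$, in such a way that the endpoints (start and end) are deterministic while intermediate times can carry a suitable amount of randomness.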

“When $t=1$, it reduces to regular stochastic interpolants.”

$t=1$이면, 우리가 흔히 아는 “완전 노이즈($\boldsymbol{\epsilon}$)와 데이터($\mathbf{x}$) 사이를 보간”하는 기본 확산(또는 Interpolant) 정의가 됩니다.
이때는 “$\mathbf{x}_1 \approx \epsilon$, $\mathbf{x}_0 = \mathbf{x}$” 같은 구조가 되어, ‘Stochastic Interpolant’(Albergo 등이 제안한 확률적 보간 모델)과 동일해진다는 말이에요.
즉 generalized interpolant이 더 큰 범주의 개념이고, stochastic interpolants는 그 안에 속하는 개념입니다.

Definition 1 (Marginal Preserving Interpolants)

두괄식 설명

“Marginal-Preserving”이란, “시간 축 어디서 출발해도, ‘시점 $s$’에서의 결과물이 원래 $q_s$와 완전히 같게끔 보간을 정의하는 것.”
이를 만족시키기 위해, 논문은 (4) (이론분포)와 (6) (모델분포)의 차이를 줄이는 목적함수(발산 최소화)를 제안하고, ( 수식은 아래에 나와요.)
- 그걸 통해 한 스텝/두 스텝/여러 스텝 샘플링이 모두 올바르게 진행되도록 유도하고자 하는 겁니다.
목표:
- “시간 $t$” 상태($\mathbf{x}_t$)에서 “시간 $s$” 상태($\mathbf{x}_s$)로 한 번에 이동하는 방식을 만들되, $\mathbf{x}_s$가 반드시 “진짜 $q_s(\mathbf{x}_s)$”를 만족하도록 하자.
- 즉, $t$ 시점의 샘플을 어떤 보간 과정을 거쳐도, 시점 $s$의 “정확한” 분포와 같아야 한다는 개념이 곧 “Marginal-Preserving Interpolant”.

Definition 1: Marginal-Preserving Interpolants

$$ q_s(\mathbf{x}_s) = \iint q_{s|t}(\mathbf{x}_s \mid \mathbf{x}, \mathbf{x}_t)\, q_t(\mathbf{x} \mid \mathbf{x}_t)\, q_t(\mathbf{x}_t)\, d\mathbf{x}_t\, d\mathbf{x}. \tag{4} $$

The left-hand side $q_s(\mathbf{x}_s)$ is the true (theoretical) distribution at time $s$.
The right-hand side integrates over every way of "taking $\mathbf{x}_t$, considering $\mathbf{x}$, and then producing $\mathbf{x}_s$ by interpolation".
- $q_{s|t}(\mathbf{x}_s \mid \mathbf{x}, \mathbf{x}_t)$ is the generalized interpolant of equation (3) -> the Gaussian $\mathcal{N}\bigl(I_{s|t}(\mathbf{x}, \mathbf{x}_t), \gamma_{s|t}^2 \mathbf{I}\bigr)$
Taken together, the equality says:
- "At time $t$ there can be many different $\mathbf{x}_t$, and each could have come from some unknown original $\mathbf{x}$; accounting for all of these,
- the distribution of the resulting $\mathbf{x}_s$ must equal the true $q_s(\mathbf{x}_s)$."
- That is what "the marginal (the distribution at time $s$) is preserved" means.

$\mathbf{x}_t$와 $\mathbf{x}$ 사이의 관계: 식 (5)

The expression below is probably hard to evaluate directly, so the intent seems to be to learn it with a neural network as in equation (6) (my own reading).

$$ q_t(\mathbf{x} \mid \mathbf{x}_t) = \int \frac{q_t(\mathbf{x}_t \mid \mathbf{x}, \boldsymbol{\epsilon})\, q(\mathbf{x})\, p(\boldsymbol{\epsilon})}{q_t(\mathbf{x}_t)}\, d\boldsymbol{\epsilon}. \tag{5} $$

This writes out the (conditional) probability that $\mathbf{x}_t$ came from a particular $\mathbf{x}$.
Numerator: the process by which $(\mathbf{x}, \boldsymbol{\epsilon})$ actually produces $\mathbf{x}_t$. Denominator: normalization by the total probability of $\mathbf{x}_t$.
Key point: the formula traces back where $\mathbf{x}_t$ came from (which $\mathbf{x}$ and $\boldsymbol{\epsilon}$).

3. Equation (6): the model distribution $p_{s|t}^\theta(\mathbf{x}_s)$

$$ p_{s|t}^\theta(\mathbf{x}_s) = \iint q_{s|t}(\mathbf{x}_s \mid \mathbf{x}, \mathbf{x}_t)\, p_{s|t}^\theta(\mathbf{x} \mid \mathbf{x}_t)\, q_t(\mathbf{x}_t)\, d\mathbf{x}_t\, d\mathbf{x}. \tag{6} $$

This plugs in the process defined by our model and computes the distribution of the final $\mathbf{x}_s$.
Starting from $\mathbf{x}_t \sim q_t(\mathbf{x}_t)$,
1. produce "$\mathbf{x}_t \to \mathbf{x}$" (this is called $p_{s|t}^\theta(\mathbf{x}\mid \mathbf{x}_t)$),
2. use "$\mathbf{x}$ and $\mathbf{x}_t$" to produce the state at time $s$ ($\mathbf{x}_s$) via $q_{s|t}$.
  - $q_{s|t}(\mathbf{x}_s \mid \mathbf{x}, \mathbf{x}_t)$ is the generalized interpolant of equation (3) -> the Gaussian $\mathcal{N}\bigl(I_{s|t}(\mathbf{x}, \mathbf{x}_t), \gamma_{s|t}^2 \mathbf{I}\bigr)$
Integrating over this whole process gives the model-side distribution at time $s$, $p_{s|t}^\theta(\mathbf{x}_s)$.
It is called the "noisy model distribution at $s$ given $t$".

Multi-step Sampling (2-스텝 예시)

논문에서는 2스텝으로 샘플링하는 과정을 예시로 들어 설명합니다:

초기 $\mathbf{x}_t \sim q_t(\mathbf{x}_t)$
- 시점 $t$에서의 “진짜” 분포에서 샘플을 뽑았다고 가정. (실제로는 노이즈에서 만들어졌다고 볼 수도 있고, 관측된 값일 수도…)
(Step 1) $\mathbf{x}_t \to \hat{\mathbf{x}} \to \hat{\mathbf{x}}_s$
- $\hat{\mathbf{x}} \sim p_{s|t}^\theta(\mathbf{x}\mid \mathbf{x}_t)$:
  - that is, "given $\mathbf{x}_t$, the model draws one $\mathbf{x}$",
  - which is named $\hat{\mathbf{x}}$.
- Then $\hat{\mathbf{x}}_s \sim q_{s|t}(\mathbf{x}_s\mid \hat{\mathbf{x}}, \mathbf{x}_t)$:
  - "$\hat{\mathbf{x}}$ and $\mathbf{x}_t$ are used to make the state at time $s$ ($\hat{\mathbf{x}}_s$)."
  - This part is the generalized interpolation.
(Step 2) $\hat{\mathbf{x}}_s \to \mathbf{x}$
- To finally reach $\mathbf{x}$ (time 0, clean data) from time $s$,
- sample once more: $\mathbf{x} \sim p_{0|s}^\theta(\mathbf{x}\mid \hat{\mathbf{x}}_s)$.
- This completes the two-step process from $\mathbf{x}_t$ to $\mathbf{x}_0$.
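The two-step procedure above can be sketched as a loop over a list of times. Several things here are assumptions for illustration: the Flow Matching schedule ($\alpha_t = 1-t$, $\sigma_t = t$), a deterministic DDIM-style interpolant $I_{s|t}(\mathbf{x}, \mathbf{x}_t) = (1 - s/t)\,\mathbf{x} + (s/t)\,\mathbf{x}_t$ with $\gamma_{s|t} = 0$ (it satisfies the constraints $I_{t|t} = \mathbf{x}_t$ and $I_{0|t} = \mathbf{x}$), and a stand-in `predict_x` that plays the role of the learned sampler $p_{s|t}^\theta(\mathbf{x} \mid \mathbf{x}_t)$ for a dataset with a single point.

```python
import numpy as np

def ddim_interpolant_fm(x_pred, x_t, s, t):
    """I_{s|t}(x, x_t) = (1 - s/t) x + (s/t) x_t  (assumed form, gamma_{s|t} = 0)."""
    ratio = s / t
    return (1.0 - ratio) * x_pred + ratio * x_t

def multi_step_sample(predict_x, x_start, times):
    """Walk through times[0] > times[1] > ... > times[-1], one jump per pair."""
    x_t = x_start
    for t, s in zip(times[:-1], times[1:]):
        x_hat = predict_x(x_t, s, t)                    # x_hat ~ p^theta_{s|t}(x | x_t)
        x_t = ddim_interpolant_fm(x_hat, x_t, s, t)     # x_s from the interpolant q_{s|t}
    return x_t

rng = np.random.default_rng(0)
x_data = np.array([1.0, -2.0, 0.5])
eps = rng.normal(size=3)

# Stand-in for the trained model: with a single data point the ideal prediction is x_data.
oracle = lambda x_t, s, t: x_data

midpoint = multi_step_sample(oracle, eps, times=[1.0, 0.5])
final = multi_step_sample(oracle, eps, times=[1.0, 0.5, 0.0])
print(midpoint, final)
```

In this toy case the intermediate state at $s = 0.5$ equals the FM interpolation $0.5\,\mathbf{x} + 0.5\,\boldsymbol{\epsilon}$, i.e. it lies on the true marginal at time $s$, and the second jump lands on the data point.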

Marginal-preserving이 왜 중요할까?

“$\hat{\mathbf{x}}_s$의 분포가 실제 $q_s(\mathbf{x}_s)$와 동일”해야, 2단계 과정에서 문제가 안 생깁니다.
즉, “중간 시점 $s$ 분포가 올바르면, 거기서 또 한 번 샘플링해도 최종 0 시점(데이터)로 제대로 갈 수 있다”는 논리.

Equation (4) vs. (6)

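In the end, the authors want to shrink the distance between the true distribution $q_s(\mathbf{x}_s)$ in (4) and the model distribution $p_{s|t}^\theta(\mathbf{x}_s)$ defined in (6).

That is, finding the $\theta$ that makes $$ D\left(q_s(\mathbf{x}_s),\; p_{s|t}^\theta(\mathbf{x}_s)\right) $$
small (as close to 0 as possible) brings the model close to being marginal-preserving.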

정리 및 요약

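Definition 1 defines a "marginal-preserving interpolant" as an interpolation such that, when a sample $\mathbf{x}_s$ at time $s$ is produced from $\mathbf{x}_t$ at any time $t$, its distribution equals the true $q_s$.
Equation (5) is the formula that also accounts for which $\mathbf{x}$ a given $\mathbf{x}_t$ came from (the reverse conditional).
Equation (6) defines the model distribution $p_{s|t}^\theta(\mathbf{x}_s)$ that is actually learned. (It contains the 1-step sampler $p_{s|t}^\theta(\mathbf{x}\mid \mathbf{x}_t)$.)
Multi-step (2-step) sampling can go through an intermediate $s$ to reach the final $\mathbf{x}_0$ (data), and the marginal-preserving property matters at that intermediate point.
The resulting idea: measure the gap between (4) and (6) with a divergence (e.g. MMD) and minimize it, and the $\mathbf{x}_t \to \mathbf{x}_s$ transformation can be learned well.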

Naive Objective

<복습>

$$ q_s(\mathbf{x}_s) = \iint q_{s|t}(\mathbf{x}_s \mid \mathbf{x}, \mathbf{x}_t)\, q_t(\mathbf{x} \mid \mathbf{x}_t)\, q_t(\mathbf{x}_t)\, d\mathbf{x}_t\, d\mathbf{x}. \tag{4} $$
$$ p_{s|t}^\theta(\mathbf{x}_s) = \iint q_{s|t}(\mathbf{x}_s \mid \mathbf{x}, \mathbf{x}_t)\, p_{s|t}^\theta(\mathbf{x} \mid \mathbf{x}_t)\, q_t(\mathbf{x}_t)\, d\mathbf{x}_t\, d\mathbf{x}. \tag{6} $$

한 문장으로 두괄식 요약

아래 식 (7) 같은 간단 손실로 시점 $s$ 분포만 맞추면, Marginal-Preserving 식(4) 을 달성할 수 있다.
하지만 이건 경로(조건부)를 구체적으로 맞추는 것과는 달라서, 여러 가지 해법(결정적·확률적)이 공존한다.

이게 바로 논문이 말하는 “Naïve Objective”의 핵심 메시지입니다.

1. Naïve Objective(식 (7))의 아이디어

$$ \mathcal{L}(\theta) = \mathbb{E}{s, t}\Bigl[D\bigl(q_s(\mathbf{x}_s),,p{s|t}^\theta(\mathbf{x}_s)\bigr)\Bigr]. <식 (7)> $$

여기서 $D(\cdot,\cdot)$는 두 분포의 “차이(거리)”를 재는 방법(MMD, GAN loss 등).
$\mathbb{E}_{s,t}$은 “여러 시점 $s, t$를 골라 평균을 낸다”는 뜻.
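In short, the meaning is: reduce the loss so that the distribution the model produces at time $s$, $p_{s|t}^\theta(\mathbf{x}_s)$, is as close as possible to the true distribution $q_s(\mathbf{x}_s)$.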

왜 “Naïve”(단순 접근)인가?

이 방식은 “최종 분포만” 보고 맞춥니다.
예: “시점 $t \to$ 시점 $s$”로 갈 때, 어떤 경로를 거치든 상관없이, 결국 시점 $s$ 결과가 $q_s$와 똑같아지면 OK라는 식이죠.
내부에서 “$\mathbf{x}_t \to \mathbf{x}$”가 어떻게 되느냐는 깊이 신경 안 쓴다는 점에서 단순(naïve)하다는 겁니다.

하지만 “조건부 분포”까지 똑같아야 하는 건 아님

What the paper emphasizes:

Even if $q_s(\mathbf{x}_s) = p_{s|t}^\theta(\mathbf{x}_s)$,
$\quad p_{s|t}^\theta(\mathbf{x}\mid\mathbf{x}_t) = q_t(\mathbf{x}\mid\mathbf{x}_t)$
does not automatically hold.

즉, 최종 분포(마진 분포)가 일치한다고 해서,
개별 샘플 경로(조건부: $\mathbf{x}_t$에서 어떤 $\mathbf{x}$를 거쳐 $\mathbf{x}_s$에 이르는지)까지 완전히 동일할 필요는 없다는 말.

그래서 최적해(해결책)가 여러 가지 존재

논문 말대로, $\mathcal{L}(\theta) = 0$을 만족하려면, 마진 분포만 진짜와 같으면 되니까,
- "$\mathbf{x}_t \to \mathbf{x}$"나 "$\mathbf{x}_t \to \mathbf{x}_s$"를 어떻게 설계하든 괜찮습니다.
예:
- 결정적(Deterministic) 매핑("$\mathbf{x} = f_\theta(\mathbf{x}_t)$" 형태)이라 해도, 결과적인 시점 $s$ 분포만 일치한다면 문제 없음.
- 또는 확률적(무작위) 매핑이라도, 최종 시점 $s$ 분포만 진짜 $q_s$와 똑같으면 OK.
이렇게 내부 구조(조건부 경로)를 구애받지 않으므로,
- 다양한 모델 설계가 가능,
- 즉, 해가 여러 가지 존재한다(“non-unique”).

3.2. Learning via Inductive Bootstrapping

Inductive : 귀납적인
Bootstrapping: 점진적 확장
Learning via Inductive Bootstrapping: 귀납적인 점진적 확장 방식을 통해 학습하겠다.
왜 Naïve Objective(식 (7))가 학습하기 어렵나?

멀리 떨어진 시점 $t \gg s$는 분포 차이가 크다
예: $t=1$이면 완전 노이즈, $s=0$이면 완전 데이터.
둘을 한 번에 매칭하려면, 모델이 “완전 노이즈 $\to$ 완전 데이터” 변환을 직접 학습해야 하고,
이는 분포 차이가 매우 커서 학습 난이도가 높아집니다(훈련이 불안정하거나, 수렴이 느리거나 등).

해결책의 실마리 -> 2. , 3. , 4.

2. What the boundary condition gives us: $q_s(\mathbf{x}_s) = p_{s|s}^\theta(\mathbf{x}_s)$
According to the paper, thanks to the structure of the generalized interpolant, when $s = t$
$$ p_{s|s}^\theta(\mathbf{x}_s) = q_s(\mathbf{x}_s) $$ holds automatically (see Lemma 4).
In other words, regardless of $\theta$ (the model parameters), the distributions match for free when the two times coincide.

3. 시점이 가까우면 “거의 맞춘다”

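Also, when $t$ is close to $s$ (that is, $|t - s|$ is small), the distributions of $\mathbf{x}_t$ and $\mathbf{x}_s$ differ only slightly, so
$$ p_{s|t}^\theta(\mathbf{x}_s) \approx q_s(\mathbf{x}_s). $$
Put differently, when the two times are only slightly apart, the distribution changes little, so the model can learn the mapping easily and stably.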

4. 시점 $t$와 $r \approx t$가 가까울 때도, 분포가 비슷해진다

논문 문장:

“Interpolant enforces $p_{s|t}^\theta(\mathbf{x}_s) \approx p_{s|r}^\theta(\mathbf{x}_s)$ for any $r < t$ close to $t$.”
즉, 시간 $t$보다 조금 작은 $r$에서 변환을 시도해도, $\mathbf{x}_s$ 분포가 크게 달라지지 않는다(연속성).
따라서 “$t$”에서 “$r$”로만 살짝 이동해도 거의 같은 분포이므로, 가까운 구간끼리 맞추기가 쉬워집니다.

5. Inductive Learning Algorithm: 가까운 시점부터 차근차근

저자들은 이 성질(2, 3, 4 번 성질) 을 이용해, 부트스트래핑(bootstrapping, 점진적 확장) 방식을 제안합니다.
즉, “멀리 떨어진 시점끼리 매칭”을 시도하기보단, “시점 $t$ 근처의 $r \approx t$나, 또는 $s \approx t$를 먼저 맞춰놓고, 그걸 단계적으로 확장하자”는 겁니다.
이 과정을 Inductive(귀납적)이라 부르는 이유:
- “시점이 거의 붙어있는 케이스부터 학습” → “조금 더 떨어진 케이스” → “더 멀리 떨어진 케이스” 식으로 점진적으로 확장.

6. 수열 $\theta_n$와 함수 $r(s,t)$

According to the paper, training proceeds as follows.
1. $\theta_n$: the $n$-th version of the model parameters during training.
2. $r(s,t)$: a time slightly smaller than $t$ (a "finite decrement"), with $s \le r(s,t) < t$.
3. Training rule: $$ p_{s|t}^{\theta_n}(\mathbf{x}_s) \approx p_{s|r}^{\theta_{n-1}}(\mathbf{x}_s) $$
  - That is, when the current step ($\theta_n$) learns the jump from $t$, it is matched against the previous step's ($\theta_{n-1}$) model for the jump from $r \approx t$.

7. 왜 이걸 “잘” 동작한다고 보나?

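Time $r \approx t$ → small distribution gap
- Since $\theta_{n-1}$ already fits $r$ well, extending to $t$ is not a big step.
Continuity (item 4) + boundary condition (item 2)
- the distribution does not change much between adjacent times ("continuous around $t$"),
- the boundary condition is satisfied automatically when $s = t$,
- together these make bootstrapping ("gradual extension") possible and lower the training difficulty.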

General Objectives & Theorem 1

$$ \mathcal{L}(\theta_n) = \mathbb{E}_{s,t}\Bigl[ w(s,t)\,\mathrm{MMD}^2\Bigl( p_{s|r}^{\theta_{n-1}}(\mathbf{x}_s),\; p_{s|t}^{\theta_n}(\mathbf{x}_s) \Bigr) \Bigr]. \tag{8} $$

$\mathrm{MMD}^2(\cdot,\cdot)$:
- 두 분포 간 차이를 재는 지표(“Maximum Mean Discrepancy”).
- 간단히 말해, 분포가 같아지면 MMD=0이 되고, 다르면 양수가 됩니다.
$w(s,t)$:
- 시점 쌍 $(s,t)$에 대한 가중치 함수.
- 중요한 $(s,t)$는 더 크게 가중을 줄 수도 있고, 덜 중요한 구간은 가중을 작게 줄 수도 있음(자유롭게 설정 가능).
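
To make the $\mathrm{MMD}^2$ term concrete, here is a minimal sketch of a sample-based estimate. It assumes 1-D samples and a Gaussian (RBF) kernel with a fixed bandwidth; the kernel, the bandwidth, and the names `rbf_kernel` and `mmd2` are illustrative assumptions, not the paper's implementation.

```python
import math
import random


def rbf_kernel(x, y, bandwidth=1.0):
    """Gaussian (RBF) kernel between two scalars (kernel choice is an assumption)."""
    return math.exp(-((x - y) ** 2) / (2.0 * bandwidth ** 2))


def mmd2(xs, ys, kernel=rbf_kernel):
    """Biased (V-statistic) estimate of MMD^2 between two 1-D sample sets."""
    k_xx = sum(kernel(a, b) for a in xs for b in xs) / (len(xs) ** 2)
    k_yy = sum(kernel(a, b) for a in ys for b in ys) / (len(ys) ** 2)
    k_xy = sum(kernel(a, b) for a in xs for b in ys) / (len(xs) * len(ys))
    return k_xx + k_yy - 2.0 * k_xy


rng = random.Random(0)
same_a = [rng.gauss(0.0, 1.0) for _ in range(200)]
same_b = [rng.gauss(0.0, 1.0) for _ in range(200)]
shifted = [rng.gauss(3.0, 1.0) for _ in range(200)]

mmd_same = mmd2(same_a, same_b)   # two sample sets from the same distribution
mmd_diff = mmd2(same_a, shifted)  # sample sets from clearly different distributions
```

With finite samples the estimate for two sets drawn from the same distribution is small but not exactly zero; it is exactly zero only when a set is compared with itself.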

Theorem 1

$$ \lim_{n \to \infty} \mathrm{MMD}^2\Bigl( q_s(\mathbf{x}_s),\; p_{s|t}^{\theta_n^*}(\mathbf{x}_s) \Bigr) = 0. \tag{9} $$

"$n \to \infty$": the bootstrapping step is repeated indefinitely (or sufficiently many times).
Result: the distribution the model produces at time $s$, $p_{s|t}^{\theta_n^*}(\mathbf{x}_s)$, becomes identical to the true $q_s(\mathbf{x}_s)$ ($\mathrm{MMD} = 0$).

해석

결국, “Inductive Bootstrapping + MMD 기반 목표(식 (8))”를 잘 구현하면, 무한 데이터·충분한 네트워크 용량이라는 이상적인 조건에서,
“원하는 시점 모든 $s \le t$에 대해 진짜 분포를 정확히 재현”할 수 있음을 보장한다.
즉, “노이즈~데이터 사이 전체 구간”을 하나의 one-step sampler $p_{s|t}^{\theta}$로도 학습해낼 수 있다는 이론적 뒷받침입니다.

부록: B.3. Definition of Well-Conditioned r(s, t)

“Well-Conditioned”의 기본 개념

Well-conditioned $r(s,t)$: 시점 $t$에서 부드럽고(연속, 단조 증가), 역함수 존재하도록 조금 감소시켜서 $r \approx t$를 만드는 함수.
- 부트스트래핑에서 “$r$로 갔다가 다시 $t$로 돌아오는 (또는 $\theta$ 업데이트 시)” 때 역방향이 필요할 수 있어서, 역함수가 잘 정의되어야 한다
- “$r$가 단조 증가” 형태면, 역함수도 쉽게 정의 가능.

Eq. (46): $r(s,t) = \max(s,\; t - \Delta(t))$

$$ r(s,t) = \max(s,\; t - \Delta(t)) \tag{46} $$

$\Delta(t)$:
- $\Delta(t) \ge \epsilon > 0$인 양의 함수.
- 즉, “시간 $t$에서 얼마만큼 빼(or 줄이)면 ‘가까운 시점 $r$이 되는지”를 정하는 함수라 보면 됩니다.

Constant Decrement in $t$

정의
$$ r(s,t) = \max(s,\; t - \epsilon) $$
$\epsilon$ 설정: $(T - \epsilon) / 2^k$
- 다소 혼동되는 표현이지만, 논문에서 말하는 건
  “시간 전체 범위 $T = 1$을, $2^k$개로 나눠서 $\epsilon$ 크기 정한다” 정도로 이해하면 됩니다.
- 예: $k = 10$이라면, “1을 1024분할”해서 한 번에 $1/1024 \approx 0.0009765625$씩 빼는 식일 수 있죠.
장점과 한계
- 장점: 아주 단순하고 구현하기 편함 (“그냥 $t - \epsilon$” 하면 끝).
- 한계: 확산 모델 특성상 “시간 $t$”와 실제 노이즈 비중($\alpha_t, \sigma_t$)이 비선형 관계이므로,
  한 번에 “0.1시간” 줄였다고 노이즈가 정확히 “균등하게” 줄어드는 건 아님.
- 그래도 코드 측면에선 가장 쉽게 적용 가능하다는 의미로 소개.
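
A minimal sketch of the constant decrement in $t$. It assumes the time range is $[t_{\min}, t_{\max}]$ and that the step is that range divided by $2^k$; the function name `r_constant_t` and the default values are illustrative assumptions.

```python
def r_constant_t(s, t, k=10, t_max=1.0, t_min=0.0):
    """r(s, t) = max(s, t - eps), with eps = (t_max - t_min) / 2**k (assumed step size)."""
    eps = (t_max - t_min) / (2 ** k)
    return max(s, t - eps)


r_far = r_constant_t(0.0, 1.0)      # s is far below t: step back by eps
r_near = r_constant_t(0.9999, 1.0)  # s is within eps of t: clamp to s
```

The `max` is what keeps $r$ from dropping below the target time $s$ when $t$ is already within one step of it.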

부록: C.6. Mapping Function r(s, t)

Constant Decrement in $\eta(t) = \sigma_t / \alpha_t$

$\alpha_t, \sigma_t$란?
- 흔히 확산 모델(Diffusion Model)에서 데이터 vs 노이즈 비중을 나타내는 변수들입니다.
- $\alpha_t$는 “데이터 스케일”, $\sigma_t$는 “노이즈 스케일”이라고 보시면 돼요.
- 예를 들어, $t = 0$일 땐 $\alpha_0 = 1$, $\sigma_0 = 0$ (데이터 100%, 노이즈 0%),
  $t = 1$일 땐 $\alpha_1 = 0$, $\sigma_1 = 1$ (노이즈 100%, 데이터 0%) 같은 식.
$\eta(t) = \sigma_t / \alpha_t$
- $\eta$가 클수록 “노이즈가 훨씬 크다”(노이즈 많음),
- 작을수록 “데이터가 상대적으로 많다”(노이즈 적음)라는 의미가 됩니다.
$\eta(t)$에서 $\epsilon$ 빼기
$$ r(s,t) = \max\left(s,\; \eta^{-1}\left(\eta(t) - \epsilon\right)\right) $$
- 여기서 $\eta^{-1}$는 “$\eta$-값을 입력받아 해당하는 시점 $r$”를 리턴해주는 역함수.
- 즉, “노이즈 비율($\eta$)을 $\epsilon$만큼 낮춘 뒤, 그걸 만족하는 시점 $r$를 찾는다”는 개념
- 예: 만약 $t$ 시점에서 $\eta(t) = 10$ (노이즈가 데이터의 10배)였다면, $\epsilon = 2$라면
  “노이즈가 8배 정도일 시점”을 $r$로 삼는 식.
$\epsilon$ 설정: $(\eta_{\max} - \eta_{\min}) / 2^k$
- $\eta_{\max} \approx 160$, $\eta_{\min} \approx 0$이라고 가정
  (대부분 확산 모델에서 $\sigma_t$가 최대 160배 가까이 커질 수 있다는 뜻).
- $k$는 $10 \sim 15$ 사이 숫자.
  즉, $\eta$ 범위를 $2^k$ 단계로 쪼개서 “한 번에 조금씩” 줄이는 방법입니다.
- 왜 160? 저자들이 실험상 노이즈 최대치가 그 정도라고 파악한 예시로 보시면 됩니다.
논문이 “이 방식이 가장 좋다”는 이유
- “노이즈 비율”이라는 물리적·직관적 지표를 기반으로 시점을 줄이면,
  학습이 더 매끄럽고 안정적이었다고 보고합니다.
- 즉, 노이즈를 조금 줄인 상태 → 더 조금 줄인 상태 식으로 부트스트래핑하기가 편하다는 뜻이죠.
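
A minimal sketch of the constant decrement in $\eta$. It assumes the linear schedule $\alpha_t = 1 - t$, $\sigma_t = t$ (so $\eta(t) = t/(1-t)$ and $\eta^{-1}(e) = e/(1+e)$, valid for $t < 1$); the schedule, the function names, and the default $k$ are assumptions chosen only to make the mapping invertible in closed form.

```python
def eta(t):
    """Noise-to-signal ratio sigma_t / alpha_t for the assumed schedule alpha=1-t, sigma=t."""
    return t / (1.0 - t)


def eta_inv(e):
    """Inverse of eta: the time t whose noise-to-signal ratio equals e."""
    return e / (1.0 + e)


def r_constant_eta(s, t, k=12, eta_max=160.0, eta_min=0.0):
    """r(s, t) = max(s, eta^{-1}(eta(t) - eps)), eps = (eta_max - eta_min) / 2**k."""
    eps = (eta_max - eta_min) / (2 ** k)
    target = max(eta(t) - eps, eta_min)  # never ask for a ratio below eta_min
    return max(s, eta_inv(target))


step_mid = 0.5 - r_constant_eta(0.0, 0.5)   # step in t when starting from t = 0.5
step_high = 0.9 - r_constant_eta(0.0, 0.9)  # step in t when starting from t = 0.9
```

Under this assumed schedule, the same decrement in $\eta$ corresponds to a smaller step in $t$ at high noise levels than at low ones, which is the non-uniformity that a constant decrement in $t$ ignores.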

4. Simplified Formulation and Practice

4.1. Algorithmic Considerations

Definition 2 (Self-Consistent Interpolants)

핵심 아이디어는 t에서 변환 시작할 때, “중간 시점 $r$를 거치든, 안 거치든, 최종 $\mathbf{x}_s$ 분포가 같아야 한다”는 일종의 경로 일관성(path consistency) 조건을 정의하는 것

맥락: Marginal-Preserving과 Self-Consistency의 차이

Marginal-Preserving:
- “시점 $t$에서 시점 $s$로 직접 보간해도, 결과 분포가 실제 $q_s(\mathbf{x}_s)$와 일치한다”는 조건.
Self-Consistency
- 시점 $t$에서 직접 $s$로 가든, 중간 시점 $r$을 거쳐서 $r \to s$로 가든, 결과 분포가 동일해야 한다.
The point of this part: for generalized interpolants, satisfying Self-Consistency turns out to imply Marginal-Preserving automatically.

2. 정의(Definition 2) 수식 (10)

Review: recall the generalized interpolant.
$q_{s|t}(\mathbf{x}_s \mid \mathbf{x}, \mathbf{x}_t) = \mathcal{N}\bigl(I_{s|t}(\mathbf{x}, \mathbf{x}_t), \gamma_{s|t}^2 I\bigr)$

Definition of Self-Consistency: $$ q_{s|t}(\mathbf{x}_s \mid \mathbf{x}, \mathbf{x}_t) = \int q_{s|r}(\mathbf{x}_s \mid \mathbf{x}, \mathbf{x}_r)\; q_{r|t}(\mathbf{x}_r \mid \mathbf{x}, \mathbf{x}_t)\; d\mathbf{x}_r. \tag{10} $$
왼쪽: “$\mathbf{x}_s$를 (직접) $t$에서 $s$로 보간”했을 때의 분포. (generalized interpolant)
오른쪽: “먼저 $t \to r$로 $\mathbf{x}_r$를 하나 뽑고, 그다음 $r \to s$로 $\mathbf{x}_s$를 뽑는” 과정을 적분한 결과.
- $\mathbf{x}_r$를 다양하게 뽑을 수 있으니, 전체 $\mathbf{x}_r$ 공간에 대해 적분.

직관

한 번에(직접) $t \to s$ 해도,
나눠서(먼저 $t \to r$ → 그다음 $r \to s$) 해도,
- 최종적으로 $\mathbf{x}_s$가 나오는 분포가 동일해야 한다는 요구사항입니다.

4. (중요) Self-Consistency를 만족하면, Marginal Preserving을 자동으로 만족한다!!

논문에서 Lemma 5가 말하길, “Self-Consistent $\Rightarrow$ Marginal-Preserving”을 보장해 준다고 합니다.
- 즉, 자잘한 시점을 거치든 안 거치든 결과가 같다면, 결국 시점 $s$에서의 분포가 늘 동일하게 되고,
  그것이 우리가 의도한 $q_s(\mathbf{x}_s)$와도 일치하도록 설계할 수 있다는 의미.

밑에서 다룰 내용: “DDIM”이나 “DDPM posterior”도 Self-Consistent 이고, 그러므로 "Marginal Preserving" 이기도 함.

lemma 6

DDPM posterior도, 논문이 정의한 Self-Consistent 보간(interpolant)의 예시라는 것을 증명함

DDPM Posterior와 Self-Consistency의 맥락

DDPM은, “시점 $t$”에 있는 샘플($\mathbf{x}_t$)을 “시점 $s$”로 만들 때 가우시안 형태의 식을 이용
- DDPM Posterior은 gaussian 형태
Lemma 6에서 증명하려는 핵심은:
1. 먼저 시점 $t$에서 중간 시점 $r$로 가는 식을 적용해 $\mathbf{x}_r$를 구한 뒤,
2. 다시 $\mathbf{x}_r$로부터 시점 $s$로 가는 식으로 $\mathbf{x}_s$를 구하더라도,
3. 그 최종 결과가 “$t \rightarrow s$”로 직접 한 번에 갔을 때의 결과와 정확히 동일한 가우시안 분포가 됨을 보이겠다는 것
따라서 DDPM에서 사용하는 “posterior distribution”이,
- 중간 시점($r$)으로 쪼개든 쪼개지 않든 동일한 최종 분포를 내보낸다는 것이 Self-Consistent의 정의와 일치.
Lemma 6은 이 사실을 가우시안 전개를 통해 구체적으로 보여준 거고요.

마지막 언급: Diffusion GAN vs. Our Method

논문 말미에서, “Diffusion GAN 방식은 이 조건부 분포를 직접 학습하는 반면, 우리가 제시하는 방법은 마진 분포를 학습한다”는 언급이 나옵니다.
- 즉, Diffusion GAN은 “$\mathbf{x}_t \to \mathbf{x}$”라는 조건부 분포를 맞추려고 하고, 그게 유일한 해가 되지만,
- 여기서는 “분포 수준”에서 매칭하기 때문에, 내부 조건부 과정이 여러 형태가 될 수 있어 (즉, 해가 여러 개).
한마디로, “우리 방식(마진 매칭)은 해가 유연하게 많을 수 있다”는 코멘트로 끝납니다.

DDIM interpolant

1. DDIM이란?

Denoising Diffusion Implicit Models은 기존 확산 모델(Diffusion Model)에서,
- 다단계로 샘플링해야 하는 비효율성을 줄이기 위해,
- ODE(상미분방정식) 기반으로 시점 $t$에서 시점 $s$로 한 번에 점프하는 해석 방식을 제안한 기법
쉽게 말해, “느리게 조금씩 잡음을 제거”하던 과정을, “기울기(ODE)만 잘 추적하면 적은 단계로도 가능”하게 만든 아이디어입니다.

2. 식 (11): DDIM은 결정론적 generalized interpolant

복습: generalized interpolant ?
- $q_{s|t}(\mathbf{x}_s \mid \mathbf{x}, \mathbf{x}_t) = \mathcal{N}\bigl(I_{s|t}(\mathbf{x}, \mathbf{x}_t), \gamma_{s|t}^2 I\bigr)$
DDIM

$$ \text{DDIM}(\mathbf{x}_t, \mathbf{x}, s, t) = \left(\alpha_s - \frac{\sigma_s}{\sigma_t}\,\alpha_t\right)\mathbf{x} \;+\; \frac{\sigma_s}{\sigma_t}\,\mathbf{x}_t \tag{11} $$

$\mathbf{x}$: 시점 0(원본 데이터) 쪽 정보를 나타내는 값.
- 실제론 $\mathbb{E}_{\mathbf{x}}[\mathbf{x} \mid \mathbf{x}_t]$ (깨끗한 이미지를 추정한 값)으로 대체. 네트워크가 “현재 $\mathbf{x}_t$이면 원본 $\mathbf{x}$는 이런 식으로 복원”이라고 예측해준다고 보면 됩니다.
$\alpha_s, \alpha_t$ / $\sigma_s, \sigma_t$: 확산 모델에서 흔히 쓰이는 계수들로, “데이터 vs 노이즈” 비중을 각각 시간 $s, t$에 대해 나타냄.
- 예: $\alpha_0 = 1,\ \sigma_0 = 0$ (시점 0이면 완전 데이터), $\alpha_1 = 0,\ \sigma_1 = 1$ (시점 1이면 완전 노이즈) 식으로 스케줄링.

식 (11)은 “$\mathbf{x}_t$와 $\mathbf{x}$를 적절히 섞어서 $\mathbf{x}_s$를 구한다”는 결정론적 레시피

3. DDIM 보간에서 $\gamma_{s|t} \equiv 0$란?

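A generalized interpolant has the form "mean (center) + noise" when going from time $t$ to $s$ (e.g. $\mathbf{x}_s = I_{s|t}(\mathbf{x}, \mathbf{x}_t) + \gamma_{s|t}\,\boldsymbol{\epsilon}$ with $\boldsymbol{\epsilon} \sim \mathcal{N}(0, I)$).
DDIM, however, is purely deterministic with no noise term, which is written as $\gamma_{s|t} = 0$.
That is, $\mathbf{x}_s$ is fixed to a single value by Eq. (11), so as a probability distribution it is written as a delta ($\delta$) distribution (all probability mass on that one value).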

4. 왜 DDIM Interpolant가 Self-Consistent인가?

Self-Consistent란, “$t \to s$”로 직접 가든, 중간에 시점 $r$을 거쳐 $t \to r \to s$로 가든, 결과가 동일한 분포를 이룬다는 조건입니다.
DDIM 식 (11)을 한 번 더 적용(“중간 시점 $r$에서 또 식 (11)”) 해보면,
- 최종 시점 $s$ 결과가 한 번에 $t \to s$ 적용한 것과 똑같은 수식을 만족함을 확인할 수 있습니다.
논문 본문(부록 C.1)에서 이걸 수식으로 보여주는데, 한 번 더 DDIM을 안에 집어넣으면

$$ \text{DDIM}\bigl(\text{DDIM}(\mathbf{x}_t, \mathbf{x}, r, t), \mathbf{x}, s, r\bigr) = \text{DDIM}(\mathbf{x}_t, \mathbf{x}, s, t) $$

라는 식이 성립해요.

이로써 “중간 분할 경로 = 직행 경로”, 결과가 동일 $\Rightarrow$ Self-Consistent임을 증명합니다.
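
The identity can be checked numerically. The sketch below implements Eq. (11) with the argument order $(\mathbf{x}_t, \mathbf{x}, s, t)$ (target time first, source time last) and assumes the linear schedule $\alpha_t = 1 - t$, $\sigma_t = t$; the schedule and the scalar test values are assumptions for illustration only.

```python
def alpha(t):
    """Data scale for the assumed linear schedule."""
    return 1.0 - t


def sigma(t):
    """Noise scale for the assumed linear schedule."""
    return t


def ddim(x_t, x, s, t):
    """Eq. (11): move the sample x_t at time t to time s, given the clean sample x."""
    ratio = sigma(s) / sigma(t)
    return (alpha(s) - ratio * alpha(t)) * x + ratio * x_t


x, x_t = 0.7, -1.3
t, r, s = 0.9, 0.5, 0.2

direct = ddim(x_t, x, s, t)                    # t -> s in one hop
two_hop = ddim(ddim(x_t, x, r, t), x, s, r)    # t -> r, then r -> s
```

Both routes give the same value up to floating-point error, which is what self-consistency of the DDIM interpolant means for a fixed $(\mathbf{x}, \mathbf{x}_t)$ pair.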

5. 결정론적 보간 $\Rightarrow$ “Deterministic Minimizer” 가능

복습

$$ q_s(\mathbf{x}_s) = \iint q_{s|t}(\mathbf{x}_s \mid \mathbf{x}, \mathbf{x}_t)\; q_t(\mathbf{x}\mid \mathbf{x}_t)\; q_t(\mathbf{x}_t)\; d\mathbf{x}_t\, d\mathbf{x}. \tag{4} $$
$$ p_{s|t}^\theta(\mathbf{x}_s) = \iint q_{s|t}(\mathbf{x}_s\mid \mathbf{x}, \mathbf{x}_t)\; p_{s|t}^\theta(\mathbf{x}\mid \mathbf{x}_t)\; q_t(\mathbf{x}_t)\; d\mathbf{x}_t\, d\mathbf{x}. \tag{6} $$
$$ \mathcal{L}(\theta) = \mathbb{E}_{s, t}\Bigl[D\bigl(q_s(\mathbf{x}_s),\, p_{s|t}^\theta(\mathbf{x}_s)\bigr)\Bigr]. \tag{7} $$

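What the paper calls "Eq. (7)" is the naive objective $\mathcal{L}(\theta) = \ldots$, which reduces the distance between the model marginal $p_{s|t}^\theta(\mathbf{x}_s)$ and the true marginal $q_s(\mathbf{x}_s)$.
With a deterministic choice such as DDIM, the conditional $q_{s|t}(\mathbf{x}_s \mid \mathbf{x}, \mathbf{x}_t)$ becomes a delta distribution, and in that case a deterministic $p_{s|t}^\theta(\mathbf{x}\mid \mathbf{x}_t)$ that minimizes the loss (a "Deterministic Minimizer") exists, so training can converge to it.
More generally, the marginal-matching approach proposed in the paper (IMM) only requires the final distribution to match, whatever the internal conditional structure (stochastic or deterministic), so the minimizer need not be unique.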

6. 결론

DDIM은 시점 $t$에서 $s$로 가는 “결정론적 ODE 보간” 방식이고,
이 보간이 Self-Consistent (중간 시점을 거쳐도 동일 결과), $\gamma = 0$이라 완전히 결정론적이므로, 특정 해(Deterministic Minimizer)로 볼 수 있다.
논문에서는 이를 예시로 들어, “결정론적 보간도 Self-Consistent하다는 것”을 부록 C.1에서 증명하고 있습니다.

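In one sentence: Eq. (11) expresses that DDIM builds the time-$s$ sample by mixing $\mathbf{x}$ and $\mathbf{x}_t$ without noise; this makes the interpolant Self-Consistent, and from the viewpoint of Eq. (7) it guarantees that a deterministic minimizer exists. That is the key point of this part.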

Proposition 1

<복습>

$$ q_s(\mathbf{x}_s) = \iint q_{s|t}(\mathbf{x}_s \mid \mathbf{x}, \mathbf{x}_t)\; q_t(\mathbf{x}\mid \mathbf{x}_t)\; q_t(\mathbf{x}_t)\; d\mathbf{x}_t\, d\mathbf{x}. \tag{4} $$
$$ p_{s|t}^\theta(\mathbf{x}_s) = \iint q_{s|t}(\mathbf{x}_s\mid \mathbf{x}, \mathbf{x}_t)\; p_{s|t}^\theta(\mathbf{x}\mid \mathbf{x}_t)\; q_t(\mathbf{x}_t)\; d\mathbf{x}_t\, d\mathbf{x}. \tag{6} $$
$$ \mathcal{L}(\theta) = \mathbb{E}_{s, t}\Bigl[D\bigl(q_s(\mathbf{x}_s),\, p_{s|t}^\theta(\mathbf{x}_s)\bigr)\Bigr]. \tag{7} $$

<원문>

$$ q_{s|t}(x_s|x, x_t) = \mathcal{N}\bigl(I_{s|t}(x, x_t), \gamma_{s|t}^2 I\bigr) $$

If the generalized interpolant above has $\gamma_{s|t} \equiv 0$ and $I_{s|t}(\mathbf{x}, \mathbf{x}_t)$ satisfies mild assumptions (roughly: the data and network are good enough, and the interpolant is suitably invertible),
- then there exists a deterministic $p_{s|t}^\theta(\mathbf{x} \mid \mathbf{x}_t)$ that drives the loss in Eq. (7) to 0.
  - See Appendix B.6 for formal statement and proof. This allows us to define
    $$ p_{s|t}^\theta(\mathbf{x} \mid \mathbf{x}_t) = \delta(\mathbf{x} - g_\theta(\mathbf{x}_t, s, t)) $$
    for a neural network $g_\theta(\mathbf{x}_t, s, t)$ with parameter $\theta$ by default.

1. “잡음 없이” 보간한다는 건 무슨 뜻일까?

논문에서 보간(generalized interpolant)은 원래 “평균값 + 잡음” 형태로 쓸 수 있습니다.
그런데 여기서는 “잡음($\gamma_{s|t}$)이 없다”고 가정해요.
- 즉, 시점 $t$에서 시점 $s$로 데이터를 이동할 때, 추가로 섞이는 랜덤성이 전혀 없고,
$\mathbf{x}_s$가 오직 한 가지 식에 의해 딱 정해지는 겁니다 (결정론).

2. 결정론적 분포가 뭘 의미해?

$$ \delta(\mathbf{x} - g_\theta(\mathbf{x}_t, s, t)) $$
Writing it this way says that $\mathbf{x}$ is pinned, with probability 1, to the output of the function $g_\theta$.

Eliminating stochasticity

1. 기본 아이디어: “결정론적”으로 시점 $t \to s$ 샘플링

원래 확산 모델(Diffusion Model) 샘플링이나, 다른 유사 모델들은 시점 $t$에서 시점 $s$로 이동할 때 확률적 잡음을 여러 차례 섞어야 하는 경우가 많습니다. (generalized interpolant)
그러나 여기서는 DDIM의 결정론적 보간(interpolant)을 사용합니다.

DDIM Interpolant
- 식으로 표현하면, DDIM은 시점 $t$에서 시점 $s$로 바로 가는 (ODE 기반의) 식을 제공합니다.
- 일반적으로 DDIM 보간은 $\gamma_{s|t} \equiv 0$인 형태(추가 잡음이 없음)로, 입력 $\mathbf{x}_t$와 추정된 원본 $\mathbf{x}$를 조합해서 한 번에 $\mathbf{x}_s$를 구해요.
결정론적 모델
- 실제로 “$\mathbf{x} = g_\theta(\mathbf{x}_t, s, t)$”라는 신경망(혹은 함수)로, “시점 $t$ 샘플 $\mathbf{x}_t$가 주어졌을 때, 원본 $\mathbf{x}$은 무엇인가?”를 추정합니다.
- 그 뒤, DDIM 식 안에 그 “$\mathbf{x} = g_\theta(\mathbf{x}_t,\dots)$”를 대입해 $\mathbf{x}_s$를 구하면, 전 과정이 결정론으로 구성됩니다.

3. 실제 샘플링 방식: 시점 $s$에서의 $\mathbf{x}_s$

논문 식에 따르면,
$$ \mathbf{x}_s = f_{s,t}^\theta(\mathbf{x}_t) := \text{DDIM}\left(\mathbf{x}_t,\ g_\theta(\mathbf{x}_t, s, t),\ s,\ t\right), $$
여기서

$\mathbf{x}_t$는 이미 “진짜 분포 $q_t(\mathbf{x}_t)$”에서 뽑은 시점 $t$ 샘플.
- 예: 확산 모델에서 “$t \approx 1$이라면 완전 잡음 상태”, “$t = 0$이라면 원본 데이터”.
$g_\theta(\mathbf{x}_t, s, t)$는 “$\mathbf{x}_t$와 시점 $s,t$를 입력받아 원본 데이터($\mathbf{x}$)을 복원”하려는 신경망.
$\text{DDIM}$은 DDIM 공식을 통해 “$\mathbf{x}_t$”와 “$\mathbf{x} = g_\theta(\mathbf{x}_t,\dots)$”를 적절히 섞어, 최종 $\mathbf{x}_s$를 만들어내는 결정론적 식.
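
A minimal sketch of the composition $f_{s,t}^\theta(\mathbf{x}_t) = \text{DDIM}(\mathbf{x}_t, g_\theta(\mathbf{x}_t, s, t), s, t)$. The network is replaced by a hypothetical stand-in `g_toy`: the closed-form posterior mean $\mathbb{E}[\mathbf{x} \mid \mathbf{x}_t]$ for 1-D Gaussian data $\mathcal{N}(\mu, \tau^2)$ under the assumed schedule $\alpha_t = 1 - t$, $\sigma_t = t$. A trained $g_\theta$ would not generally equal this; the stand-in only makes the wiring runnable.

```python
def alpha(t):
    return 1.0 - t   # assumed schedule


def sigma(t):
    return t         # assumed schedule


def ddim(x_t, x, s, t):
    """Eq. (11): deterministic move from time t to time s."""
    ratio = sigma(s) / sigma(t)
    return (alpha(s) - ratio * alpha(t)) * x + ratio * x_t


MU, TAU = 2.0, 0.5   # toy data distribution N(MU, TAU^2), an assumption


def g_toy(x_t, s, t):
    """Hypothetical stand-in for g_theta: posterior mean E[x | x_t]; ignores s."""
    a, sg = alpha(t), sigma(t)
    gain = a * TAU ** 2 / (a ** 2 * TAU ** 2 + sg ** 2)
    return MU + gain * (x_t - a * MU)


def one_step(x_t, s, t):
    """f_{s,t}(x_t): predict the clean sample, then plug it into the DDIM formula."""
    return ddim(x_t, g_toy(x_t, s, t), s, t)


x_t = 0.3
sample_at_0 = one_step(x_t, 0.0, 0.8)     # jump straight to the data end
sample_at_half = one_step(x_t, 0.4, 0.8)  # jump to an intermediate time
```

No random number is drawn inside `one_step`: for a given $\mathbf{x}_t$ the output is a single value, which is the sense in which stochasticity is eliminated.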

5. 결론: DDIM + Deterministic → 추가 랜덤 필요 없다

결론: “Eliminating Stochasticity”가 가능하다는 말은,
1) DDIM 보간 (추가 잡음 없음),
2) 결정론적 신경망 $g_\theta$,
3) 표준 가우시안($\sigma_d$)으로 잡음 범위만 설정,
→ 이 세 요소를 합치면, 확률적으로 샘플링해야 할 스텝 없이도 $\mathbf{x}_s$를 한 번에 만들 수 있다는 뜻.
이것이 논문에서 의도하는 “Stochasticity 제거”의 요점입니다.
따라서, “ODE sampler” (DDIM 방식을 통한 단일 단계)로도 샘플링이 끝나며, 추가로 무작위화할 이유가 사라집니다.

Re-using $x_t$ for $x_r$

<복습>

$$ p_{s|t}^\theta(\mathbf{x}_s) = \iint q_{s|t}(\mathbf{x}_s\mid \mathbf{x}, \mathbf{x}_t)\; p_{s|t}^\theta(\mathbf{x}\mid \mathbf{x}_t)\; q_t(\mathbf{x}_t)\; d\mathbf{x}_t\, d\mathbf{x}. \tag{6} $$
$$ \mathcal{L}(\theta_n) = \mathbb{E}_{s,t}\Bigl[ w(s,t)\,\mathrm{MMD}^2\Bigl( p_{s|r}^{\theta_{n-1}}(\mathbf{x}_s),\; p_{s|t}^{\theta_n}(\mathbf{x}_s) \Bigr) \Bigr]. \tag{8} $$

1. 왜 $\mathbf{x}_r \sim q_r$가 필요할까?

논문 식 (8)과 (6)을 보면 (또는 이와 유사한 구조를 보면),
- “시점 $r$”에서 샘플 $\mathbf{x}_r$가 진짜 분포 $q_r(\mathbf{x}_r)$에서 나와야 한다는 조건이 등장합니다.
“모델이 $t$에서 $s$로 가는 것과 $r$에서 $s$로 가는 것을 비교”하려면, 둘 다 진짜 분포 샘플이 있어야 평가(또는 학습) 가능하다는 식의 맥락을 상상할 수 있습니다.

2. 기존 방식: $(\mathbf{x}, \boldsymbol{\epsilon})$ 새로 뽑아 $\mathbf{x}_r$ 만들기

원래라면, “시점 0에 있던 원본 $\mathbf{x}$”와 “잡음 $\boldsymbol{\epsilon}$”을 새로 뽑아, 시점 $r$로까지 Forward (또는 역, 중간) 가는 과정을 통해 $\mathbf{x}_r$를 생성할 수 있습니다.
하지만 이렇게 하면 추가로 새 샘플을 계속 뽑아야 해서, 계산 부담도 커지고 샘플 변동 (“분산”)도 증가합니다.

3. 대안: “$\mathbf{x}_t$와 $\mathbf{x}$ 재사용해서 $\mathbf{x}_r$ 만들기”

논문에서는 “차라리 한 번 얻은 $\mathbf{x}_t$” (이미 시점 $t$에서 샘플링된 것)를 써먹자고 제안합니다.
그리고 DDIM을 이용해,
$$ \mathbf{x}_r = \text{DDIM}(\mathbf{x}_t,\ \mathbf{x},\ r,\ t). $$
$\text{DDIM}(\cdot)$은 시점 $t \to r$로 가는 결정론 보간 식.

요점

이렇게 하면, “새로운 $(\mathbf{x}, \boldsymbol{\epsilon})$ 조합을 뽑지 않아도” 시점 $r$ 샘플을 얻을 수 있음.
즉, “이미 시점 $t$에 존재하던 $\mathbf{x}_t$”에서 한 번에 $r$로 결정론적으로 이동 가능.

4. 정당화: “$\mathbf{x}_r$가 이렇게 만들어져도 $q_r(\mathbf{x}_r)$ 그대로 유지”

논문은 Appendix C.2를 인용하며,

“$\mathbf{x}_r$를 $\mathbf{x}_t$에서 DDIM 보간으로 만들었지만, 그 결과는 ‘원래의 시점 $r$ 분포 $q_r$’와 동일하게 유지된다.”
이는 Self-Consistent나 Marginal-Preserving 같은 개념과 연관되어 있습니다.
- DDIM 보간이 중간 시점을 바꾸든 직접 가든 결과 분포가 동일하게 유지된다는 성질 덕분에,
- 시점 $t$에서 만든 $\mathbf{x}_r$가 본래 $q_r(\mathbf{x}_r)$ 분포와 충돌하지 않는 거죠.

5. 최종 결론

“Re-using $\mathbf{x}_t$ for $\mathbf{x}_r$”: 1) $\mathbf{x}_r$를 $\text{DDIM}(\mathbf{x}_t,\mathbf{x},r,t)$로 계산,
2) 이로써 새 샘플링 과정 없이 시점 $r$ 샘플을 얻는다,
3) 그리고 시점 $r$ 분포가 그대로 유지되므로 분산(오차)을 줄이고 효율적.
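
A small numerical check of why the reuse is legitimate, assuming the forward process $\mathbf{x}_t = \alpha_t \mathbf{x} + \sigma_t \boldsymbol{\epsilon}$ and the linear schedule $\alpha_t = 1 - t$, $\sigma_t = t$ (both assumptions for this sketch): the DDIM move from $t$ to $r$ gives exactly the sample that the same $(\mathbf{x}, \boldsymbol{\epsilon})$ pair would have produced at time $r$.

```python
import random


def alpha(t):
    return 1.0 - t   # assumed schedule


def sigma(t):
    return t         # assumed schedule


def ddim(x_t, x, s, t):
    """Eq. (11): deterministic move from time t to time s."""
    ratio = sigma(s) / sigma(t)
    return (alpha(s) - ratio * alpha(t)) * x + ratio * x_t


rng = random.Random(0)
x = rng.gauss(0.0, 1.0)      # toy "data" sample
eps = rng.gauss(0.0, 1.0)    # one noise draw, shared by both times
t, r = 0.8, 0.6

x_t = alpha(t) * x + sigma(t) * eps
x_r_reused = ddim(x_t, x, r, t)                 # reuse x_t, no new sampling
x_r_same_noise = alpha(r) * x + sigma(r) * eps  # forward process at r with the same eps
```

Because `x_r_reused` coincides with a forward-process sample at time $r$, it is distributed according to $q_r$ whenever $(\mathbf{x}, \boldsymbol{\epsilon})$ is drawn as usual.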

Simplified Objective

식 (1)은 아래의 식과 동치이고, 우리는 아래의 식을 objective로 씁니다.
아래 그림 매우 중요. 확대해서 자세히 보자.
"Increasing M does not change the number of model calls."
구체적으로, 하나의 배치(B개의 예시)를 모델에 통과시킬 때, 이미 B개의 전파(forward pass)가 일어나므로,
- M을 조정해도 배치 내 구성만 달라지고 추가 모델 호출이 크게 늘지는 않는다는 의미예요.
즉, 배치 크기를 어떻게 분할해서 M개의 샘플씩 묶느냐의 문제지, 전체 forward pass 횟수는 여전히 “배치 B”번으로 동일하다는 이야기입니다.

Full training algorithm

아래 그림 매우 중요. 확대해서 자세히 보자.

classifier-free guidance

4.2. Other Implementation Choices

Flow trajectories

Network

Noise conditioning

Mapping function

Kernel function

Weighting $w(s,t)$ and distribution $p(s, t)$

4.3. Sampling

아래 그림 매우 중요. 확대해서 자세히 보자.
restart sampling
- sampling 과정을 확률적으로 만든다.
- 더 다양한 mode coverage
- 샘플 품질 up

5. Connection with Prior Works

7. Experiments

8. Conclusion

DriveArena (클로즈드 루프 생성 시뮬레이션)

Fri, 07 Mar 2025 14:37:14 GMT

딥 리서치 설명

데이터 기반 생성 시뮬레이션: DriveArena는 교통 시뮬레이션에 생성 모델 접근법을 도입한 혁신적인 시뮬레이터입니다.
이 시스템은 물리적 차량 시뮬레이션을 위한 Traffic Manager와, 생성 기반 비전 모델인 World Dreamer로 구성됩니다.
World Dreamer는 nuScenes 데이터셋으로 사전 학습된 확산(diffusion) 모델로, 실제 도시 주행 데이터를 바탕으로 현실적인 주행 장면(카메라 뷰)을 생성하고 이들의 변화를 예측할 수 있습니다.
그 결과, DriveArena는 nuScenes와 같은 데이터셋의 분포에 맞는 고화질, 실제와 유사한 비주얼 및 행동을 생성할 수 있습니다.
시뮬레이션 내의 모든 NPC 차량은 수천 개의 실제 궤적으로 학습된 모델에 의해 제어되므로, 단순 규칙에 의존하지 않고, 일반적인 도시 주행 패턴부터 복잡한 상호작용까지 다양한 자연스러운 움직임을 보여줍니다.
핵심 차이점은, DriveArena가 단순히 nuScenes 로그를 재생하는 것이 아니라, 이를 사용해 클로즈드 루프 생성 에이전트를 학습하여 새로운 지도나 시나리오에서도 인간과 유사하게 주행할 수 있도록 한다는 점입니다.
다중 에이전트 및 사용자 정의 행동: DriveArena에서 Traffic Manager는 다중 에이전트 메커니즘을 담당하며, 임의의 HD 지도를 불러와 도시 규모의 도로망에서 교통 흐름을 시뮬레이션할 수 있고, 서로 다른 차량 유형이나 행동을 설정할 수 있습니다.
기본적으로 모든 NPC에 대해 생성된 트래픽 모델을 사용하지만, 시스템은 모듈식으로 설계되어 있어 필요에 따라 다른 트래픽 정책으로 교체할 수 있습니다.
클로즈드 루프 설계 덕분에, ego 차량의 행동이 Traffic Manager의 동역학에 의해 다른 차량의 행동에 영향을 주며, 생성 모델은 이러한 상호작용에 부합하는 새로운 영상을 생성합니다.
이를 통해, ego 에이전트와 ego의 행동에 따라 반응하는 교통 에이전트를 동시에 평가할 수 있습니다.
또한, 이 프레임워크는 새로운 지도나 “도심, 낮, 흐림”과 같은 프롬프트를 쉽게 반영할 수 있어, 이에 맞는 현실적인 교통 비주얼을 생성합니다.
모듈성 및 사용 편의성: DriveArena는 매우 모듈화되어 있습니다.
Traffic Manager는 플러그 앤 플레이 방식의 물리 엔진 역할을 하며, 외부 시뮬레이터(CARLA 등)와 연동할 수도 있고, World Dreamer는 플러그 앤 플레이 렌더러로서, 탑다운 레이아웃에서 포토리얼리틱한 프레임을 출력합니다.
이러한 설계 덕분에, DriveArena는 비전 기반 자율주행 에이전트와 쉽게 연동되어, ego 차량은 생성된 카메라 영상을 입력받고 제어 혹은 궤적 출력을 생성하며, 이 출력이 시뮬레이션 스텝에 반영됩니다.
본질적으로, 이는 새로운 데이터를 수집하지 않고도 엔드투엔드 자율주행 모델의 클로즈드 루프 동작을 테스트할 수 있게 해줍니다.
프로젝트는 GitHub에서 ⭐ 301개 이상의 스타를 보유하고 있으며, UniAD와 같은 최신 비전 기반 자율주행 에이전트의 데모를 통해 DriveArena의 생성 환경에서 실제 주행하는 모습을 보여줍니다.
DriveArena는 최소한의 설정으로 지도(또는 CARLA 타운)를 불러와, 센서 데이터와 반응형 트래픽을 함께 제공할 수 있도록 문서와 인터페이스를 갖추고 있습니다.
또한, 학계와 산업체(예: Shanghai AI Lab)의 협력을 통해 활발히 유지보수되고 있어, 향후 nuScenes 외 다른 데이터셋으로 확장될 가능성도 있습니다.
기존 시뮬레이터 대비 고유한 장점: DriveArena의 접근 방식은 전통적인 시뮬레이터와는 완전히 다릅니다. 실제 데이터로부터 세계 모델을 학습하고, 이를 바탕으로 새로운 시나리오를 생성하기 때문에 두 가지 주요 장점이 있습니다.
첫째, 행동의 리얼리즘 – NPC 운전자들이 실제 운전자에게서 학습한 패턴에 기반하여 의사결정을 내리므로, 미세한 차이(예: 살짝 밀어내기, 서서히 정지 등)를 포착할 수 있습니다.
둘째, 센서 수준의 리얼리즘 – DriveArena는 실제 대시캠 영상처럼 보이는 카메라 뷰를 출력할 수 있어, 단순 로그 재생만으로는 새로운 시나리오에 대해 얻을 수 없는 센서 데이터를 제공합니다.
반면 CARLA는 센서 출력을 제공하지만 NPC 행동은 데이터 기반이 아니므로, DriveArena는 실제와 유사한 비주얼과 상호작용을 모두 제공한다는 점에서 우위를 점합니다.
또한, 계획 중심의 시뮬레이터(BARK나 GPUDrive 등)와 비교할 때, DriveArena는 전체 AV 스택 – 특히 비전 기반 정책 – 을 테스트할 수 있도록 설계되어, 에이전트가 생성된 환경에서 어떻게 주행하는지를 관찰할 수 있는 독특한 기능을 제공합니다.
더불어, DriveArena는 지도에 구애받지 않고(어떤 도시 지도라도 사용 가능) 생성적이어서, 데이터셋에 기록된 것 이상의 무한한 교통 장면을 생성할 수 있습니다.
이러한 생성적 다양성과 높은 정밀도는 결정론적 로그 재생이나 단순한 트래픽 시뮬레이터와 차별화되는 강력한 기능입니다.

GPUDrive

Fri, 07 Mar 2025 14:31:03 GMT

딥 리서치의 대답

https://github.com/Emerge-Lab/gpudrive

장점:

초고속 시뮬레이션: 약 10^6 스텝/초의 시뮬레이션 속도를 달성하여, 다른 시뮬레이터로는 불가능한 대규모 학습을 실행할 수 있습니다(GPUDrive 논문). 이는 강화학습이나 희귀 이벤트를 브루트포스로 샘플링하여 검증하는 데 이상적입니다.
다중 에이전트 집중: 서로 다른 정책을 사용하는 많은 에이전트를 본질적으로 지원하여, 교통 동역학, 자율주행 차량 상호작용, 또는 차량 플릿 행동 연구에 매우 유리합니다.
데이터 기반 시나리오 및 에이전트: 실제 Waymo 데이터셋을 사용하여 환경을 구성하며, 모방 학습이나 강화학습을 통해 데이터 기반 에이전트 모델을 학습하거나 배포할 수 있습니다. 이를 통해 실제 교통 상황 내에서 반응적이고 현실적인 운전자 행동을 학습할 수 있습니다.
높은 커스터마이제이션: 사용자가 동역학, 보상 함수 등 시뮬레이션 로직을 직접 작성할 수 있어, 새로운 주행 계획기나 예측기를 연구할 수 있는 완벽한 통제권을 제공합니다. 이는 이종 행동을 쉽게 삽입할 수 있도록 하여 그룹별 에이전트 커스터마이제이션 요구를 충족합니다.
병렬 실험: 단일 GPU에서 많은 시나리오를 병렬로 실행할 수 있어, 대규모 랜덤 실험을 통한 통계적 평가가 가능하며, 이는 느린 시뮬레이터에서 어려운 부분을 보완합니다.

단점:

가파른 학습 곡선 및 기술적 부담: 효과적으로 사용하려면 C++17, CUDA 등 저수준 프로그래밍 및 GPU 프로그래밍에 대한 지식이 필요합니다. Python 스크립트만으로 시나리오를 정의하는 것보다 복잡할 수 있습니다.
물리적 리얼리즘 저하: 극한 속도를 달성하기 위해 차량 동역학과 충돌 모델링에서 일부 타협이 이루어지며, 실제 서스펜션이나 타이어 마찰 등의 세밀한 물리 효과는 추상화됩니다. 이는 차량 제어 검증이나 안전 물리 시뮬레이션에는 적합하지 않을 수 있습니다.
그래픽 및 센서 출력 부재: GPUDrive는 아름다운 그래픽이나 센서 출력을 생성하지 않고, 클로즈드 루프 상태 시뮬레이션만 제공하므로, 데모용 시각화나 엔드투엔드 주행 비전 모델 테스트에는 직접적으로 활용하기 어렵습니다.
데이터 제한 및 전처리: Waymo 데이터셋에 집중함으로써, 해당 데이터셋의 한계(특정 지리, 센서 오류 등)를 그대로 물려받으며, 대용량 파일을 GPU 최적화된 커스텀 바이너리 포맷으로 변환하는 추가 작업이 필요할 수 있습니다.
기본적으로 RL 지향: 내장 예제 및 사전 학습된 에이전트가 대부분 강화학습 기반이므로, 순수 모방 학습에 초점을 맞추려면 사용자가 별도로 구현해야 합니다. 이는 SISL NGSIM 환경과 대조적이며, RL 정책이 제약 없이 학습될 경우 비인간적인 전략을 생성할 위험도 있습니다.

주요 기능 및 능력:
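GPUDrive is a new open-source simulator from 2024, introduced in the paper GPUDrive: Data-driven, multi-agent driving simulation at 1 million FPS, that targets unprecedented simulation speed and scalability for autonomous driving research. It is built on the Madrona game engine and optimized for GPU execution, reaching more than one million simulation steps per second in benchmarks (GPUDrive paper), which is several times faster than existing simulators. The core idea is to remove the simulator bottleneck so that multi-agent driving planning and learning become possible at scale. Users can define the observation space, reward function, and dynamics in C++, and this code is JIT-compiled to CUDA, so very complex agent behavior can be implemented (GPUDrive paper).
In particular, GPUDrive is designed to be data-driven in both its scenarios and its agent logic: it can ingest real-data scenarios such as the Waymo Open Motion Dataset and simulate them. The authors show an example of training RL agents on real scenarios and note that the other traffic agents can either follow the dataset as recorded or be controlled by learned policies (GPUDrive paper). Although it is mainly presented as an RL environment, GPUDrive also open-sources prebuilt "simulator agents" and some pretrained agents obtained in its experiments (GPUDrive paper). These pretrained models can be viewed as reactive driver models learned from data, most likely the result of RL policies learning to drive on real data.
The simulator is also a closed-loop multi-agent environment in which many vehicles controlled by different neural-network policies interact in real time. In summary, GPUDrive's main strength is combining real-data scenarios with massively parallel simulation, so that thousands of variations can be run quickly to explore multi-agent tactics or rare events. It does, however, prioritize speed, throughput, and experiments with heterogeneous agent behavior over high-fidelity physics simulation.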

모듈성 및 그룹 커스터마이제이션:
GPUDrive는 이종 에이전트를 명시적으로 지원합니다. 시뮬레이터는 각 에이전트가 서로 다른 업데이트 함수를 가질 수 있도록 설계되었으며, 이 모든 것이 GPU 코드로 효율적으로 컴파일됩니다(GPUDrive 논문). 예를 들어, 한 그룹의 차량은 보수적인 주행 정책을, 다른 그룹은 공격적인 정책을 사용할 수 있으며, 이는 각기 다른 신경망 가중치나 로직으로 정의됩니다. 이 모든 것이 GPU 상에서 하나의 통합된 시뮬레이션 스텝 내에서 실행되어, 에이전트들이 동기화되고 효율적으로 작동합니다. 그룹을 정의하는 것은 시나리오 초기화 시 각 에이전트에 어떤 정책 인스턴스를 할당할 것인지를 지정하는 것만큼 간단합니다. 저자들은 또한 복잡하고 이종적인 행동 – 즉, 에이전트마다 맞춤형 결정 로직을 구현 – 을 성능 저하 없이 통합할 수 있다고 언급합니다(GPUDrive 논문). 실제로, 사용자는 모방 학습된 모델, RL로 학습된 모델, 또는 휴리스틱 모델 등 다양한 운전자 모델 라이브러리를 구축하여, 시나리오에 혼합하여 적용할 수 있습니다. 그룹별로 최소한의 수정만으로 원하는 행동 함수를 각 에이전트에 할당하면 되므로, GPUDrive는 혼합 자율주행 연구에 매우 강력한 플랫폼이 됩니다. 다만, 이러한 행동 함수는 GPU 호환 코드(C++/CUDA)를 통해 구현되어야 하므로, Python보다 다소 복잡할 수 있지만 속도 면에서는 큰 이점을 제공합니다.

다른 시뮬레이터와의 비교:
GPUDrive는 엄청난 시뮬레이션 속도와 다중 에이전트 학습에 중점을 둔 점에서 돋보입니다. 전통적인 시뮬레이터인 CARLA나 MetaDrive는 실시간 또는 실시간보다 몇 배 빠른 속도로 한 인스턴스에서 실행되지만, GPUDrive는 수 초 만에 하루치 주행을 시뮬레이션할 수 있습니다(물리적 정밀도는 낮을 수 있음). 리얼리즘 측면에서 CARLA는 고화질 물리와 시각 효과를 제공하지만 트래픽 AI가 제한적(규칙 기반)인 반면, GPUDrive는 속도를 위해 동역학을 단순화하지만, 학습된 에이전트가 있다면 훨씬 더 똑똑한 행동을 할 수 있습니다. 즉, GPUDrive는 일부 물리적 세부사항을 희생하는 대신, 대규모 행동 복잡성에 초점을 맞춥니다. 실제 데이터(Waymo)를 시작 시나리오로 사용하기 때문에, 환경은 ScenarioNet이나 Waymax와 유사하게 현실적일 수 있습니다. 다만, GPUDrive의 내장 에이전트 모델은 현재 주로 RL 정책에 초점을 맞추고 있어, 모방 학습 중심은 다소 부족할 수 있으나, 사용자가 직접 모방 정책을 학습하여 삽입할 수 있습니다. Waymax와 비교하면, Waymax는 JAX 기반으로 최적화되어 있으나 GPUDrive의 맞춤형 엔진만큼의 초고속 FPS에는 미치지 못할 수 있으며, 반면 Waymax는 Python과 JAX를 통해 보다 쉽게 사용할 수 있는 반면, GPUDrive는 C++ 지식이 필요합니다. SMARTS와 비교하면, SMARTS는 다중 에이전트 지원이 있으나 Python 및 CPU 성능에 제한이 있는 반면, GPUDrive는 SMARTS보다 훨씬 뛰어난 성능을 제공하지만, SMARTS가 제공하는 다양한 관측 공간이나 통합 도구(예: SUMO를 이용한 교통 시나리오 XML 설정)는 제공하지 않습니다. 요약하면, GPUDrive의 장점은 학습 알고리즘을 위한 확장성에 있으며, 수십억 스텝에 달하는 대규모 시뮬레이션을 실행할 수 있어, 다른 시뮬레이터로는 감당하기 어려운 규모의 실험을 가능하게 합니다(GPUDrive 논문, GPUDrive 논문).

사용성, 데이터셋 통합 및 시나리오 생성:
GPUDrive는 최첨단 연구 도구로, 초보자에게는 다소 사용하기 어려울 수 있습니다. 강력한 NVIDIA GPU와 빌드 과정이 필요하며, 시나리오를 작성하려면 C++ 로직이나 제공된 래퍼를 사용해야 합니다. 다행히도, 저자들은 예제와 Dockerfile을 포함하여 설정 과정을 쉽게 할 수 있도록 오픈소스로 공개하였습니다(GPUDrive 논문). Waymo Open Motion Dataset과의 통합도 제공되며, 사용자는 데이터셋 파일을 GPUDrive가 요구하는 포맷으로 변환한 후 사용할 수 있습니다. 시나리오 생성 측면에서는, GPUDrive 자체에 시나리오 편집 UI가 내장되어 있지 않으므로, 데이터를 재생하거나 코드로 환경(차량 위치, 목표 등)을 정의해야 합니다. 많은 사용자가 평가보다는 학습을 위한 경험 생성에 GPUDrive를 활용할 것이며, Python을 통한 설정 인터페이스도 제공되어 C++로 모든 것을 작성할 필요는 없습니다(GPUDrive 논문). 모듈성은 뛰어나, 새로운 운전자 모델을 추가하려면 새로운 함수(디바이스 함수)를 작성하면 되며, 고급 사용자는 PyTorch 모델을 TorchScript나 ONNX로 내보내 CUDA 커널 내에서 호출할 수도 있습니다. 이러한 유연성은 매우 높지만, 그만큼 전문 지식이 요구됩니다. 유지관리 측면에서는, 이 프로젝트는 2025년 초 기준 300+ GitHub 스타와 활발한 이슈 및 커밋 기록을 보이고 있어, 대학 연구실과 연계된 것으로 보이며, 성능 면에서 최첨단을 달리고 있습니다. 다만, 커뮤니티 채택 측면에서는 CARLA나 Waymax보다 덜 성숙할 수 있습니다.

Waymax: An Accelerated, Data-Driven Simulator for Large-Scale Autonomous Driving Research

Fri, 07 Mar 2025 14:29:20 GMT

https://github.com/waymo-research/waymax
- 859 stars
https://proceedings.neurips.cc/paper_files/paper/2023/file/1838feeb71c4b4ea524d0df2f7074245-Paper-Datasets_and_Benchmarks.pdf
- 2023, 10월
- 75회 인용

-1. abstract

다중 에이전트 데이터 기반 시뮬레이터
유의미한 논문 기여 정리 (아래 3가지가 다임)
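- 1. Waymax is implemented as a differentiable simulator, which is said to shorten the training time of driving-policy learning algorithms and make convergence to the global optimum easier.
  - What "differentiable simulator" means is explained in detail below.
- 2. Waymax can hardware-accelerate not only network training but also the simulation environment itself (because it is written in JAX).
  - This can speed up training and inference.
  - Training and simulation can run entirely on the accelerator, without a communication bottleneck through the host machine.
- 3. The code for fast multi-agent learning on real data appears to be well written (judging from the paper's claims and the high GitHub star count; needs verification).
- 4. Built on real driving data from the Waymo Open Dataset.
  - The code is written so that it can easily be extended to other well-known datasets.
  - Multi-agent simulation scenarios can be initialized from the dataset or replayed from it (i.e. open loop).
  - Besides cars, it is said to simulate difficult urban obstacles such as pedestrians and cyclists (traffic lights included).
    - On closer reading, though, it seems these can only be replayed non-interactively from the Waymo Open Dataset.
    - The only interactive vehicle model implemented is IDM.
    - Instead, the authors claim the code is structured so that an interactive model I develop myself can be swapped in easily.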

0. 개념: 미분가능한 시뮬레이터?

0.1. 미분 불가능한 시뮬레이터 개념

전통적으로, 물리 기반 시뮬레이션은 "입력(상태와 행동) → 출력(다음 상태)" 관계를 갖는 블랙박스 형태
이러한 시뮬레이션은 주어진 초기 조건과 행동(예: 자동차의 조향각, 가속도, 외부 agent의 변동)에 따라 다음 상태(예: 차량의 위치, 속도, 외부 agent의 변동)를 계산하지만,
- 이 변환 과정은 일반적으로 미분 가능하게 설계되지 않음
그 이유는
- 일반적인 물리 시뮬레이션은 충돌, 마찰, 접지 등의 현상에서 비연속적이고 미분 불가능한 특성을 많이 가짐
- 그리고 ego 차량의 움직임에 따른 주변 agent의 움직임도 예측하기 어렵기 때문
즉, "지금 상태에서 특정 행동을 했을 때, 결과 상태가 어떻게 변하는가?"에 대한 기울기(미분값)를 쉽게 얻을 수 없는 경우가 대부분입니다.

0.2. 미분 가능한 시뮬레이션을 만드는 이유

0.2.1. 효과 : 정책 최적화와 학습 효율성 개선

기존의 RL 기법들은 "정책 파라미터 변화 → 보상 변화" 관계를 명시적으로 알 수 없으므로, 수많은 시뮬레이션을 돌려서 추정치(근사치)를 얻어야 하고, 이 추정치는 노이즈가 많아 불안정합니다.
- 이는 학습 과정에서 시간이 많이 걸리고, 때때로 수렴하기 어려운 상황을 초래
시뮬레이션이 미분 가능해지면, 학습 시간이 줄어들고, global optimum으로 수렴하기 용이해집니다.
- 정책 파라미터 변화에 대한 정확한 기울기를 바로 계산할 수 있습니다.
- 그럼 굳이 많은 에피소드를 통해 통계적 평균으로 기울기를 추정할 필요가 줄어들고, 노이즈로 인한 불안정성도 크게 완화됩니다.

0.2.1.1. 미분 불가능한 시뮬레이터에서, RL 알고리즘의 기울기 추정 방식 복습

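Both policy gradient and actor-critic collect a large amount of experience (data) by interacting with the environment and then approximate the gradient from values estimated on that experience (because the simulator cannot be differentiated).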
한 번의 에피소드에서 얻은 보상은 매우 변동성이 크고, 특정한 정책 파라미터 변화에 대한 정확한 정보를 주기 어렵습니다.
- 예를 들어, 같은 정책이라도 어떤 초기 조건, 어떤 랜덤 시드에서 시뮬레이션을 돌리느냐에 따라 결과(보상)가 크게 달라집니다.
- 결국 더 많은 에피소드를 모아 평균을 내야 기울기가 어느 정도 안정화되는데, 이것이 바로 학습 과정이 느리고 불안정해지는 원인
예시: Policy Gradient 학습 과정
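
As a toy illustration of the noisy estimate described above, here is a sketch of the score-function (REINFORCE) gradient estimator on a one-step problem. The problem itself is an assumption made for this example: the action is $a \sim \mathcal{N}(\theta, 1)$ and the reward is $-(a-3)^2$, so the exact gradient of the expected reward is $-2(\theta - 3)$.

```python
import random
import statistics


def reward(a):
    """Toy reward, highest at a = 3 (an assumption for this example)."""
    return -(a - 3.0) ** 2


def score_function_grad(theta, n, rng):
    """REINFORCE estimate: mean of reward(a) * d/dtheta log N(a; theta, 1)."""
    total = 0.0
    for _ in range(n):
        a = rng.gauss(theta, 1.0)
        total += reward(a) * (a - theta)  # d log p / d theta = (a - theta) for unit variance
    return total / n


theta = 0.0
exact = -2.0 * (theta - 3.0)  # analytic gradient of E[reward]

rng = random.Random(0)
estimates = [score_function_grad(theta, 10, rng) for _ in range(200)]
mean_est = statistics.mean(estimates)   # averages out close to the exact gradient
spread = statistics.pstdev(estimates)   # but each 10-episode estimate is noisy
```

Each estimate uses only sampled rewards, never the derivative of the reward or of the simulator, which is why many episodes are needed before the average settles.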

0.3. 그럼 어떻게 미분 가능하게 만드는가?

미분 가능한 시뮬레이터를 만들기 위해서는
- 비연속 요소(충돌, 마찰, 접지 등의 현상, 주변 agent의 움직임 모델)을 부드럽게 근사(smoothing)하거나,
- 적절한 수학적 모델(예: 미분 가능한 충돌 모델, 매끄러운 마찰 모델)을 도입해야 합니다.
TODO
- Waymax에서는 이를 어떤 방식으로 구현했는지? 그것이 합리적인지? 를 분석해야함
- 논문을 열심히 읽어봤는데, 이 부분이 자세히 설명되지 않은 것 같음

0.3.1. 방법론 예시: Reparameterization / Pass-through Gradients

To make a simulation differentiable, stochastic elements or complex operations in the environment can be put into differentiable form with reparameterization / pass-through gradient techniques.
내 생각
- 실제 세계를 잘 모델링한 미분가능한 수학적 모델을 도입해야지, 그냥 미분가능하도록 trick만 쓰는게 무슨 의미가 있나?

Reparameterization

재매개화 기법의 핵심은 "랜덤 변수의 샘플링"을 파라미터 θ와 독립적인 랜덤 변수 ϵ을 통해 표현하는 것입니다.
This turns a stochastic process that depends directly on θ into a deterministic transformation of a given noise ϵ, which makes differentiation with respect to θ straightforward.
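
A minimal sketch of the reparameterized (pathwise) gradient on a toy problem chosen for this example: the action is written as $a = \theta + \epsilon$ with $\epsilon \sim \mathcal{N}(0, 1)$ and the reward is $-(a-3)^2$, so the gradient flows through the reward's derivative $-2(a-3)$. The problem and the function name are assumptions.

```python
import random
import statistics


def pathwise_grad(theta, n, rng):
    """Reparameterized estimate of d/dtheta E[-(a - 3)^2] with a = theta + eps."""
    total = 0.0
    for _ in range(n):
        eps = rng.gauss(0.0, 1.0)   # noise drawn independently of theta
        a = theta + eps             # deterministic transform of (theta, eps)
        total += -2.0 * (a - 3.0)   # d reward / d a, times d a / d theta = 1
    return total / n


theta = 0.0
exact = -2.0 * (theta - 3.0)

rng = random.Random(0)
estimates = [pathwise_grad(theta, 10, rng) for _ in range(200)]
mean_est = statistics.mean(estimates)
spread = statistics.pstdev(estimates)
```

This estimator requires the reward (and, in a simulator, the transition) to be differentiable with respect to the action, which is exactly what a differentiable simulator provides.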

Pass-through Gradient

역전파 과정에서 특정 연산을 단순화하는 방법입니다.
That is, the forward pass applies some complex (or non-differentiable) transformation, while the backward pass ignores or simplifies that transformation and lets the gradient "pass through".
이를 통해 역전파 그래프를 간소화하고, 기울기 흐름을 원활히 할 수 있습니다.
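
A minimal hand-written sketch of a pass-through (straight-through) gradient. The forward pass rounds to the nearest integer, whose true derivative is zero almost everywhere; the backward pass pretends the rounding was the identity. The loss, learning rate, and function names are assumptions for this example.

```python
def quantize(x):
    """Forward transform: non-differentiable rounding to the nearest integer."""
    return float(round(x))


def train_with_straight_through(x0, target, lr=0.1, steps=100):
    """Gradient descent on loss = (quantize(x) - target)^2 using a pass-through gradient."""
    x = x0
    for _ in range(steps):
        y = quantize(x)                # forward: uses the real rounding
        grad_y = 2.0 * (y - target)    # d loss / d y
        grad_x = grad_y * 1.0          # backward: treat d y / d x as 1 (pass-through)
        x -= lr * grad_x
    return x


x_final = train_with_straight_through(0.2, 4.0)
```

With the true derivative of `round` the update would always be zero and `x` would never move; the pass-through substitute is biased but lets the parameter reach a value that rounds to the target.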

1. Introduction

CARLA [14], Sim4CV [33], SUMMIT [9]와 같은 시뮬레이터는 photo-realistic 렌더링에 중점을 둠
What matters in a simulator for autonomous driving PnC (planning and control): speed and realism
- 다양한 시나리오와 현실적인 에이전트(예: 차량 및 보행자)의 동작 생성
- closed-loop의 에이전트 동작 시뮬레이션 제공
- 대규모 모델과 데이터세트를 사용하는 최신 머신러닝 트렌드를 지원할 수 있는 높은 속도와 처리량

2.1. 자율주행 시뮬레이터

waymax와 가장 가까운 연구는 Nocturne[52], MetaDrive[25], nuPlan[8]
- 위 3가지는 실제 주행 데이터를 사용하여 시나리오와 기록된 행동을 초기화하는 기능이 있는, 다중 에이전트 자율 주행 시뮬레이터
이와 비교하여, Waymax 시뮬레이터가 가지는 장점은
- 주행 학습 뿐만 아니라, 시뮬레이션 환경 구동도 hardware-acceleration 할 수 있음
- 미분 가능한 시뮬레이션

2.2. 학습 기반 주행 에이전트

Waymax는 대표적인 closed loop 방법(IL 및 RL baseline 세트)의 구현을 제공하고, Waymax가 정의한 표준 메트릭 세트에 대한 성능을 보고하였음
- 논문에서 제시한 6개 비교군 논문 모두 2022년 이전에 작성되었으며, 인용수가 100을 넘지 않고 github 코드도 없음
Waymax는
- 표준 학습 및 평가 워크플로우와 오픈 및 폐쇄 루프 설정에서 신뢰할 수 있는 벤치마킹을 제공함

3. 시뮬레이터 기능

이 섹션에서 논의된 각 구성 요소는 사용자가 자신의 프로젝트 요구에 맞게 쉽게 수정하거나 교체할 수 있습니다.
3.1. 시나리오 및 데이터셋
합성(가상의) 시나리오를 생성하는 시뮬레이터(예: CARLA[14])와 달리,
- Waymax는 실제 주행 로그를 사용하여 주행 시나리오를 인스턴스화하고 고정된 수의 step 동안 실행
Waymax는 Waymo Open Motion Dataset(WOMD)[15]에 대한 기본 지원을 제공
- Each trajectory snippet is 9 seconds long, recorded at 10 Hz (one sample every 0.1 s)
- 트래젝토리에는 자율 주행 차량(AV), 다른 차량, 보행자, 자전거를 포함한 장면의 모든 객체에 대한 위치 및 속도 정보가 포함
각 시나리오에 대해 도로 그래프와 같은 정적 정보를 가져와
- 로그된 정보의 시작점을 사용하여 동적 객체를 초기화
그런 다음, 에이전트 모델(3.5절에서 설명할 것임)을 사용하여 시뮬레이션 단계를 통해 주변 동적 객체를 제어
각 모델은 여러 객체를 제어할 수 있습니다.

3.2. state / observation space

자율 주행 제어의 첫 번째 구성 요소는 state space을 정의하는 것
state에는 두 가지 유형의 데이터가 포함
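- 1. Dynamic data, which can change over the course of an episode and across scenarios
  - position, rotation, velocity, and bounding-box dimensions of every vehicle, bicycle, and pedestrian in the scene, plus traffic-light colors (red, yellow, green)
- 2. Static data, which stays the same during an episode but can differ between scenarios
  - the "roadgraph": road and lane boundaries sampled as a 3D point cloud
  - the autonomous vehicle's on-route and off-route paths (explained later)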
각 에이전트는 사용자 정의 observation 함수를 통해 시뮬레이터 state를 확인하며, 이는 부분 관측 가능성을 구현할 수 있게 함
- waymax에서 제공하는 기본 observation 함수는
  - 다른 모든 차량의 위치를 에이전트의 자체 좌표계로 변환하고
  - "roadgraph"를 거리에 따라 샘플링

3.2.1. on-Route and off-Route paths

자율 주행 차량이 초기 위치에서 취할 수 있는 실행 가능한 paths로 각 시나리오를 증강
- 각 path는 roadgraph의 점의 하위 집합인 일련의 점들로 표시
각 경로 candidate는 시작 위치에서 roadgraph의 깊이 우선 탐색을 수행하여 계산됨
만약 path candidate가
- AV’s logged trajectory와 같은 road이면 on-route path로 정의
- 그 외는 off-route path로 정의
이 2가지가 state space 에 포함된다.
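
A minimal sketch of the path-candidate idea on a toy roadgraph: enumerate paths from the start node by depth-first search, then label each one by whether it stays on roads used by the logged trajectory. The graph representation, the `road_of` mapping, and the labeling rule are simplifying assumptions, not Waymax's actual data structures.

```python
def enumerate_paths(graph, start, max_depth):
    """Depth-first enumeration of simple paths from start with at most max_depth nodes."""
    paths = []

    def dfs(node, path):
        successors = [n for n in graph.get(node, []) if n not in path]
        if len(path) == max_depth or not successors:
            paths.append(list(path))   # dead end or depth limit: record the path
            return
        for nxt in successors:
            path.append(nxt)
            dfs(nxt, path)
            path.pop()

    dfs(start, [start])
    return paths


def label_paths(paths, road_of, logged_roads):
    """'on-route' if every node lies on a road the logged trajectory used (assumed rule)."""
    labels = []
    for path in paths:
        on_route = all(road_of[node] in logged_roads for node in path)
        labels.append("on-route" if on_route else "off-route")
    return labels


# Toy roadgraph: node -> successor nodes, and node -> road id (hypothetical data).
graph = {"a": ["b", "c"], "b": ["d"], "c": ["e"]}
road_of = {"a": "R1", "b": "R1", "d": "R1", "c": "R2", "e": "R2"}

paths = enumerate_paths(graph, "a", max_depth=5)
labels = label_paths(paths, road_of, logged_roads={"R1"})
```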

3.3. Object Dynamics

Waymax는 사용자가 역학 모델을 정의할 수 있도록 하고,
시뮬레이션에서 차량의 물리적 역학을 제어하기 위한 몇 가지 사전 정의된 옵션을 제공
- 1) 보행자, 자전거 등: 모든 유형의 객체에 적합한 델타 동작 공간은 두 연속 상태 사이의 위치 차이(델타 항 ∆x, ∆y, ∆θ)를 사용
- 2) 자동차: 그리고 bicycle action space((a, κ))은 가속도와 조향 곡률을 사용
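
A minimal sketch of the two action spaces. `delta_step` applies $(\Delta x, \Delta y, \Delta\theta)$ directly; `bicycle_step` is a simplified kinematic update driven by acceleration $a$ and curvature $\kappa$. The state layouts, the 0.1 s time step, and the exact integration scheme are assumptions for illustration and may differ from Waymax's implementation.

```python
import math


def delta_step(state, dx, dy, dyaw):
    """Delta action space: add the position and heading differences to the state."""
    x, y, yaw = state
    return (x + dx, y + dy, yaw + dyaw)


def bicycle_step(state, accel, curvature, dt=0.1):
    """Simplified bicycle-style update from acceleration and steering curvature."""
    x, y, yaw, speed = state
    dist = speed * dt + 0.5 * accel * dt ** 2   # distance travelled during dt
    new_x = x + dist * math.cos(yaw)
    new_y = y + dist * math.sin(yaw)
    new_yaw = yaw + curvature * dist            # heading change = curvature * arc length
    new_speed = speed + accel * dt
    return (new_x, new_y, new_yaw, new_speed)


# Drive straight at 10 m/s for one second (10 steps of 0.1 s).
state = (0.0, 0.0, 0.0, 10.0)
for _ in range(10):
    state = bicycle_step(state, accel=0.0, curvature=0.0)
```

The delta space can represent any motion, including physically implausible jumps, while the bicycle-style space constrains the motion to what a vehicle can do.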

3.4. Metric

Waymax는
- 안전성과 행동의 정확성(교통 규칙 준수, 충돌 방지 등)뿐만 아니라
- 편안함과 진행 상황을 평가하기 위한 메트릭을 제공
Waymax의 모든 메트릭은 폐쇄 루프 방식으로 계산
중요한 내용이 없어서 세부 사항은 생략했음

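3.5. Simulated agent behavior

Realistic behavior of the agents other than the AV being trained or validated is very important
As a multi-agent simulator, Waymax lets the user control the behavior of every object in the simulation
- Agents can therefore be controlled by any model the user wants, such as a learned behavior model.
However, the only behavior model Waymax implements itself is the Intelligent Driver Model (IDM) [51].
- The Waymax IDM agent follows the path logged in the data, but
- uses IDM to adjust the speed profile so that it avoids collisions and accelerates on a free road.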
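
For reference, the standard IDM acceleration rule can be sketched as below. The formula is the textbook IDM; the default parameter values are illustrative assumptions and are not taken from Waymax.

```python
import math


def idm_acceleration(v: float, gap: float, dv: float,
                     v0: float = 15.0,    # desired speed (m/s), assumed value
                     T: float = 1.5,      # desired time headway (s), assumed value
                     a_max: float = 1.5,  # maximum acceleration (m/s^2), assumed
                     b: float = 2.0,      # comfortable deceleration (m/s^2), assumed
                     s0: float = 2.0,     # minimum standstill gap (m), assumed
                     delta: float = 4.0) -> float:
    """Standard IDM.

    v:   own speed (m/s)
    gap: bumper-to-bumper distance to the leader (m)
    dv:  closing speed, own speed minus leader speed (m/s)
    """
    # Desired dynamic gap: standstill gap + headway term + braking term.
    s_star = s0 + max(0.0, v * T + v * dv / (2.0 * math.sqrt(a_max * b)))
    free_road_term = (v / v0) ** delta
    interaction_term = (s_star / max(gap, 1e-3)) ** 2
    return a_max * (1.0 - free_road_term - interaction_term)


accel_free = idm_acceleration(v=0.0, gap=1e6, dv=0.0)     # empty road, standing still
accel_cruise = idm_acceleration(v=15.0, gap=1e6, dv=0.0)  # empty road, at desired speed
accel_close = idm_acceleration(v=10.0, gap=5.0, dv=5.0)   # closing fast on a near leader
```

On an empty road the agent accelerates at close to `a_max` until it reaches `v0`; when closing in on a nearby leader the interaction term dominates and the acceleration becomes negative, which is the "adjust the speed profile to avoid collisions" behavior described above.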

4. Software API

The main libraries are:
- (1) A set of common data structures
- (2) A distributed data loading library
- (3) Simulator components such as metrics and dynamics
- (4) A Gym-like environment interface
Each component can be modified, replaced, or used on its own.

4.1. Environment interface

state = env.reset(next(dataset))
while not done:
  action = policy(env.observe(state))
  state = env.step(state, action)
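
To make the loop above concrete, here is a self-contained toy version of the same interface. `ToyEnv`, `State`, and `policy` are hypothetical stand-ins written for this note, not Waymax classes; the point is only that the state is passed in and returned explicitly at every call.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class State:
    t: int              # current timestep
    position: float     # 1-D position of the controlled object
    horizon: int = 80   # number of steps in one episode

    @property
    def is_done(self) -> bool:
        return self.t >= self.horizon


class ToyEnv:
    """Stateless environment: every method takes the state and returns a new one."""

    def reset(self, scenario: dict) -> State:
        return State(t=0, position=scenario["start"], horizon=scenario["steps"])

    def observe(self, state: State) -> float:
        return state.position

    def step(self, state: State, action: float) -> State:
        # The input state is never mutated; a fresh State is returned.
        return replace(state, t=state.t + 1, position=state.position + action)


def policy(observation: float) -> float:
    """Move forward one unit per step until position 5 is reached, then stop."""
    return 1.0 if observation < 5.0 else 0.0


dataset = iter([{"start": 0.0, "steps": 80}])
env = ToyEnv()
state = env.reset(next(dataset))
while not state.is_done:
    action = policy(env.observe(state))
    state = env.step(state, action)
```

After the loop the episode has run for 80 steps and the object sits at position 5.0.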

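4.2. Hardware acceleration and in-graph training

Waymax supports combining training and simulation in the same computation graph (called in-graph training).
- Training and simulation can then run entirely on the accelerator, without a communication bottleneck through the host machine.
This is possible because Waymax is written entirely with the JAX [5] library

4.2.1. Why the simulation environment can be implemented with JAX

Waymax provides functionally pure initialization and transition functions
- That is, the environment is defined as "pure functions" that always return the same output for the same input
- Thanks to this stateless design, the simulation can be optimized efficiently with the JAX [5] JIT compiler and functional libraries (after compilation it runs fast on GPU)
- This approach also makes it easy to implement backtracking control algorithms (algorithms that simulate several candidate action sequences ahead of time and then pick the best one).

Existing environments such as OpenAI Gym [6] or DM Control [32] are stateful: they manage the state internally.
That is, when reset() or step() is called, the user does not pass the state explicitly; the environment itself remembers it.

In Waymax, by contrast, the user manages the state explicitly and passes it to the transition function at every simulation step to obtain the next state.
- The whole computation can therefore be expressed in a functional style, which makes automatic differentiation, optimization, and accelerator (GPU/TPU) use through JAX straightforward.

In short, Waymax's functional purity and externally managed state
- pair well with functional, autodiff-capable libraries such as JAX, and
- allow the simulation to be optimized for high performance and a wide range of algorithms to be implemented flexibly.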
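
The backtracking point is easy to see with a pure transition function: the same start state can be rolled forward under several candidate action sequences without any reset or copy. The sketch below is plain Python with a made-up 1-D `State`; it illustrates the idea only and does not use Waymax or JAX.

```python
from typing import NamedTuple, Sequence, Tuple


class State(NamedTuple):
    position: float  # metres
    speed: float     # metres per second


def transition(state: State, accel: float, dt: float = 0.1) -> State:
    """Pure function: same (state, accel) always gives the same next state."""
    speed = max(0.0, state.speed + accel * dt)
    return State(state.position + speed * dt, speed)


def rollout(state: State, accels: Sequence[float]) -> State:
    for accel in accels:
        state = transition(state, accel)
    return state


def plan(state: State, candidates: Sequence[Tuple[float, ...]],
         goal: float) -> Tuple[float, ...]:
    """Simulate every candidate from the SAME state and keep the closest to goal."""
    return min(candidates,
               key=lambda accels: abs(rollout(state, accels).position - goal))


start = State(position=0.0, speed=5.0)
candidates = [(-2.0,) * 10, (0.0,) * 10, (2.0,) * 10]  # brake, hold, accelerate
best = plan(start, candidates, goal=5.0)
```

With a stateful Gym-style environment the same search would need the environment to be saved and restored (or re-created) before each candidate; here `start` is simply reused because nothing mutates it.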

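4.3. Single and multi-agent simulation

Waymax supports both of the following
- 1. MultiAgentEnvironment
  - Multi-agent simulation in which the user can control arbitrary objects in the scenario
  - The user provides simultaneous actions for all controlled objects in the scene, plus a mask indicating which objects should be controlled
- 2. PlanningAgentEnvironment
  - A workflow for training a single AV agent, while the other vehicles in the scene are controlled by learned models, rule-based models, or log playback
  - Provides an interface for controlling only the AV
In Waymax, putting several policies (policy = behavior strategy) directly into a single MultiAgentEnvironment is possible but not flexible.
- Creating several MultiAgentEnvironment instances is the right approach.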
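
The "actions plus mask" idea can be sketched as follows. This is a plain-Python illustration with scalar actions and a hypothetical `merge_policy_outputs` helper; Waymax's real action containers are array-based and are not reproduced here.

```python
from typing import List, Sequence, Tuple

PolicyOutput = Tuple[Sequence[float], Sequence[bool]]  # (actions, control mask)


def merge_policy_outputs(default_actions: Sequence[float],
                         outputs: Sequence[PolicyOutput]
                         ) -> Tuple[List[float], List[bool]]:
    """Combine per-policy (actions, mask) pairs into one action per object.

    Every policy proposes an action for every object, but only the entries
    where its mask is True are used. Objects that no policy claims keep the
    default action (for example, the logged motion).
    """
    merged = list(default_actions)
    claimed = [False] * len(merged)
    for actions, mask in outputs:
        for i, (action, is_controlled) in enumerate(zip(actions, mask)):
            if not is_controlled:
                continue
            if claimed[i]:
                raise ValueError(f"object {i} is controlled by two policies")
            merged[i] = action
            claimed[i] = True
    return merged, claimed


logged = [0.0, 0.0, 0.0, 0.0]                           # default for 4 objects
planner = ([1.0, 1.0, 1.0, 1.0], [True, False, False, False])   # controls object 0
idm = ([2.0, 2.0, 2.0, 2.0], [False, True, True, False])        # controls objects 1, 2
actions, controlled = merge_policy_outputs(logged, [planner, idm])
```

Object 3 is claimed by neither policy, so it keeps its default (logged) action.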

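5. Experiments

5.1. Runtime benchmark

Waymax's runtime performance was tested on a CPU (Intel Xeon W-2135 @ 3.7 GHz) and a GPU (Nvidia V100)
MultiAgentEnvironment and PlanningAgentEnvironment were evaluated at various batch sizes
- Very fast.
- In particular, Waymax supports batching

rollout

The Rollout function, which rolls out a given environment with an Actor for a full episode (i.e. 80 steps for WOMD), was also benchmarked
- Rollout function:
  - Runs multiple steps back to back, processing a whole episode (scenario) in one call
  - Interaction with surrounding agents happens exactly as it would when calling step one at a time.
- The rollout function is especially useful for faster inference and evaluation
- It is faster than calling the Step function 80 times.
More importantly, running on GPU gives a consistent 2x speedup.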

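5.2. Baseline planning agents

Expert

For open-loop training

Behavior prediction model (Wayformer)

The Wayformer behavior prediction model [35] is adapted to the planning setting
The original Wayformer predicts multiple 8-second future trajectories given 1 second of context history
To apply it to the planning setting,
- instead of predicting a long horizon in one shot, the model predicts a short segment at a time and feeds the result (the most likely future path) back in as input for the next prediction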
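
The "predict a short segment, then feed it back" loop can be sketched as below. The constant-velocity `predict` function is a deliberately trivial stand-in for the prediction model, and the `horizon` and `commit` parameters are assumptions for the example, not values from the paper.

```python
from typing import List, Sequence


def predict(history: Sequence[float], horizon: int) -> List[float]:
    """Stand-in predictor: extrapolate the last observed velocity."""
    velocity = history[-1] - history[-2]
    return [history[-1] + velocity * (k + 1) for k in range(horizon)]


def closed_loop_plan(history: Sequence[float], total_steps: int,
                     horizon: int = 5, commit: int = 1) -> List[float]:
    """Repeatedly predict `horizon` steps, keep the first `commit`, and re-predict.

    The committed steps are appended to the history, so each new prediction is
    conditioned on what the planner itself just did.
    """
    history = list(history)
    executed: List[float] = []
    while len(executed) < total_steps:
        segment = predict(history, horizon)[:commit]
        segment = segment[: total_steps - len(executed)]  # do not overshoot
        executed.extend(segment)
        history.extend(segment)
    return executed


plan_one_step = closed_loop_plan([0.0, 1.0], total_steps=8)
plan_three_step = closed_loop_plan([0.0, 1.0], total_steps=8, commit=3)
```

With a learned model the two settings would generally give different trajectories, because a smaller `commit` lets the model correct itself more often; with this toy predictor they coincide.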

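Behavior cloning

The encoder of the Wayformer [35] model is reused as is, with a 4-layer residual MLP attached after it, and
- the model is trained to match the expert (real driver) trajectories as closely as possible (maximizing the log-likelihood).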
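
"Maximizing the log-likelihood" means minimizing a negative log-likelihood loss on the expert data. A minimal sketch, assuming an independent Gaussian output per dimension (the notes do not say which output distribution the baseline actually uses):

```python
import math
from typing import Sequence


def gaussian_nll(pred_mean: Sequence[float], pred_log_std: Sequence[float],
                 expert: Sequence[float]) -> float:
    """Mean negative log-likelihood of expert values under N(mean, std^2)."""
    total = 0.0
    for mean, log_std, target in zip(pred_mean, pred_log_std, expert):
        variance = math.exp(2.0 * log_std)
        total += (0.5 * math.log(2.0 * math.pi)
                  + log_std
                  + 0.5 * (target - mean) ** 2 / variance)
    return total / len(expert)


expert_waypoints = [1.0, 2.0, 3.0]
loss_good = gaussian_nll([1.0, 2.0, 3.0], [0.0, 0.0, 0.0], expert_waypoints)
loss_bad = gaussian_nll([0.0, 0.0, 0.0], [0.0, 0.0, 0.0], expert_waypoints)
```

A prediction that matches the expert exactly gives the lowest loss for a fixed standard deviation, so gradient descent on this loss pulls the model output toward the logged driving.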

Model-free RL - DQN

We used the Acme [19] implementation of prioritized replay double DQN [45].

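5.3. Planning benchmark results

Summary: when the driving policy is trained with IDM as the surrounding agents, it learns to exploit IDM's tendency to yield.
- The RL agent has little incentive to learn to avoid collisions on its own.
- So it actually works better to train non-interactively, with the surrounding agents played back from the real dataset.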

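Deep Research's answer

Key features and capabilities:
Waymax is an open-source simulator released in late 2023 and introduced in "Waymax: An Accelerated, Data-Driven Simulator for Large-Scale Autonomous Driving Research"; it is designed for large-scale agent behavior research on the Waymo Open Motion Dataset

Waymax focuses on acceleration and scalability, and is written entirely in JAX (a high-performance machine learning framework) so that many simulations can run in parallel on GPU/TPU hardware.
The simulator loads real scenarios (maps and agents) from Waymo's motion dataset, which contains more than 100,000 realistic driving situations.
Users can then simulate closed-loop interaction within those scenarios.
Within a scenario, the ego or other agents can deviate from their originally recorded trajectories, and the simulator propagates their state over time through a dynamics model
For background agents, Waymax ships a log playback mode (other vehicles ignore the ego's behavior and follow their recorded trajectories exactly) and an IDM-based reactive model mode
This IDM "traffic engine" injects more realistic reactions than plain log playback; for example, if the ego vehicle stops suddenly, surrounding vehicles slow down to avoid a collision.
Importantly, Waymax allows a user-defined behavior model to be plugged in for any agent
That is, the user can replace an agent's log playback or IDM behavior with their own learned policy (imitation learning or RL) and implement data-driven control.
Waymax itself does not provide a library of pretrained driver models, but
- it is a powerful sandbox that handles scenario data, dynamics, and evaluation metrics (collision, off-road, etc.), so researchers can focus on developing decision logic.
- In short, Waymax combines real scenario data with a fast simulation core and supports both passive playback and interactive agent behavior.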

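Modularity and group-based customization:

Waymax is designed with modular control in mind.
Each object in a scenario (vehicle, pedestrian, etc.) can be assigned a policy function that decides its action at every timestep.
By default, all non-controlled agents use the log player or a parameterized IDM controller, but behavior can be overridden per agent through the API.
For example, n of the m vehicles in a scenario can be assigned a user-defined learned policy (algorithm A), while the remaining m-n vehicles use the standard IDM model (algorithm B).
This is done by injecting the user's policy for specific agent IDs before the simulation rollout.
Such group assignment is simple: keep several policy objects and map each one to a set of agents.
The simulator's step loop calls each agent's policy to determine its action.
For scenarios extracted from Waymo data, the user can also choose which specific recorded agents to replace (e.g. replace only the truck that was blocking the intersection with a learned aggressive truck policy and let the other vehicles use IDM), which allows fine-grained control.
This fine-grained control takes minimal changes thanks to Waymax's functional design, and since the ego is treated as just another agent, multi-ego simulation is also possible.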

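Comparison with other simulators:

Waymax's closest competitors are nuPlan and ScenarioNet, both of which also build scenarios from real motion datasets.
Waymax is specific to the Waymo dataset, but it provides a modern JAX-based engine (GPU-optimized for parallel processing);
- this speed (with an interface similar to, for example, Google's Brax physics engine) is a distinctive advantage for large-scale testing and training.
Waymax is designed to handle many agents at once.
- This means the scale of the Waymo dataset, with its complex multi-agent interactions, can be used as is.
Waymax is designed to be used together with learned models, which matches how the Waymo research team often swaps in learned planners or predictors for testing.
Waymax does not simulate sensors or provide photorealism; it operates on abstract state such as bounding boxes and waypoints, which suits behavior research.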

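Usability, dataset integration, and scenario generation:

Waymax is an official Waymo Research release, is well documented, and comes with tutorials.
It requires access to the Waymo Open Motion Dataset, which is free for researchers but must be requested for download.
- Dataset integration is smooth enough that raw tfrecord files can be loaded directly, and
- streaming from Google Cloud Storage is also supported, which helps with very large datasets.
Writing a custom agent model requires familiarity with JAX, although a prototype can also be written as a plain Python function.
- Evaluation metrics are also included (log divergence, collision, off-road, etc.) to quantify the difference between the simulation and the original logs, or between different controllers (Waymax Metrics).
On the scenario generation side, Waymax itself does not create new maps or traffic flows; it relies on the scenarios provided in the Waymo dataset.
However, since the dataset covers a wide range of conditions, the user can select only the scenarios of interest (e.g. unprotected left turns).
There is no built-in random scenario generator, but variations can be created by modifying the initial conditions of a loaded scenario (e.g. a vehicle's starting speed or position) to test robustness.
Modularity is strong: adding a new model requires no changes to the simulator core, only passing a function for the relevant agents.
Thanks to its functional, vectorized design, Waymax stays deterministic and fast as complexity grows.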

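Pros:

Truly data-driven scenarios: uses Waymo's real driving scenarios (highly detailed, with many agents) directly, giving high realism without manual authoring (Waymax).
High performance (JAX): JIT-compiled dynamics and parallelism on GPU/TPU make simulation extremely fast, so millions of steps or thousands of scenarios can run in parallel.
Custom model support: custom agent models integrate easily, and assigning a different controller to each vehicle enables experiments with heterogeneous driver populations (e.g. a mix of human-like and robot policies).
Built-in metrics and safety checks: standardized metrics such as collision and off-road are provided, so simulation results can be compared quantitatively against real driver data.
Actively developed by experts: built by the Waymo research team to fit real AV evaluation needs and released as open source; the user base is growing steadily and continued updates are expected.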

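Cons:

Dataset specificity: only the Waymo Open Motion Dataset format is supported out of the box, so using scenarios from other datasets (nuScenes, Argoverse) requires writing conversion code or an extension
No pretrained imitation model: built-in traffic behavior relies on log playback or a simple IDM model, so truly human-like reactive traffic requires users to train and plug in their own imitation learning model.
Abstract simulation (no sensors): Waymax does not generate camera images or lidar scans and only provides abstract state such as bounding boxes and waypoints, so it is not suited to end-to-end driving tests.
JAX learning curve: getting full performance requires familiarity with JAX and a functional programming style, which can be difficult for developers used to conventional Python OOP interfaces.
Limited scenario editing: scenarios are based on real situations from the logs, so maps and initial conditions come from the logs, and designing arbitrary road networks or traffic flows from scratch is limited.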

MetaDrive

Fri, 07 Mar 2025 07:17:21 GMT

https://github.com/metadriverse/metadrive

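MetaDrive simulator

1. "Procedural scene generation": what exactly does it mean?

In MetaDrive, procedural scene generation means automatically creating new road maps/environments through an algorithmic process instead of using fixed, pre-built maps.
MetaDrive is highly configurable: it has predefined road "blocks" (e.g. straight segments, curves, intersections, roundabouts) and composes complete driving scenarios by combining these blocks at random.
For each episode (specifically, whenever a MetaDrive environment is reset or started), the simulator connects these building blocks with random parameters (length, curvature, etc.) to generate a new road layout, producing an unlimited variety of driving maps.
The ego vehicle can therefore face a town or highway structure it has never seen before on every run, which helps generalization in reinforcement learning.
It is like having an unlimited variety of track layouts.

The term "procedural" contrasts with the fixed-map approach: CARLA and other simulators offer only a limited set of maps such as Town01 and Town02, whereas MetaDrive's procedural generation mode provides a different layout each time. The goal, as the MetaDrive paper states, is to be able to generate an infinite number of diverse driving scenarios for training and testing.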
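
The block-composition idea can be illustrated with a toy generator. Everything here (`BLOCK_TYPES`, the parameter ranges, `generate_map`) is invented for the example and is not MetaDrive's API; it only shows how a seed plus a block library yields a reproducible but varied layout.

```python
import random
from typing import Dict, List, Tuple

# Hypothetical block library: block type -> parameter name -> (min, max) range.
BLOCK_TYPES: Dict[str, Dict[str, Tuple[float, float]]] = {
    "straight": {"length_m": (40.0, 120.0)},
    "curve": {"radius_m": (20.0, 60.0), "angle_deg": (30.0, 90.0)},
    "intersection": {"arm_length_m": (20.0, 50.0)},
    "roundabout": {"radius_m": (10.0, 30.0)},
}

Block = Tuple[str, Dict[str, float]]


def generate_map(seed: int, num_blocks: int = 5) -> List[Block]:
    """Chain `num_blocks` randomly chosen blocks with randomly sampled parameters."""
    rng = random.Random(seed)  # local RNG so the same seed gives the same map
    blocks: List[Block] = []
    for _ in range(num_blocks):
        kind = rng.choice(sorted(BLOCK_TYPES))
        params = {name: round(rng.uniform(low, high), 1)
                  for name, (low, high) in BLOCK_TYPES[kind].items()}
        blocks.append((kind, params))
    return blocks


map_a = generate_map(seed=0)
map_a_again = generate_map(seed=0)
map_b = generate_map(seed=1)
```

Resetting with a new seed each episode gives a fresh layout, while reusing a seed reproduces the same layout for debugging or evaluation.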

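2. Is non-reactive traffic (limited interaction with the ego) similar to log replay?

For the most part, yes. When MetaDrive traffic is described as non-reactive to the ego vehicle, it means the background vehicles do not respond dynamically to the ego's behavior and instead move along predetermined paths or simple rules.
This is similar to log replay or scripted traffic scenarios. In fact, MetaDrive can import real driving logs such as Waymo and nuScenes and replay them as scenarios.

Even in purely synthetic procedural mode, traffic vehicles by default do not actively react to the ego; they follow fixed traffic rules or routes.
In particular, MetaDrive previously introduced a "trigger" mechanism in which NPC vehicles stay stopped until the ego vehicle gets close and then start a predetermined behavior once the ego approaches.
Once moving, they do not change their driving policy in response to the ego vehicle; for example, they may not take evasive action if the ego suddenly cuts in, even when the vehicle ahead is slow.
This can be seen as limited interaction, because they do not perceive the ego as a separate agent to react to.
In essence, MetaDrive's default traffic (especially in single-agent mode) behaves as if the ego vehicle did not exist (apart from collision physics), which is similar to log replay or scripted traffic.
So by default, MetaDrive traffic vehicles are mostly non-reactive to the ego.
In dataset-based scenarios, it is effectively the same as log replay
In procedural mode, they follow a rule-based autopilot that does not take the ego's behavior into account (apart from a few simple triggers).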

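3. Can reactive traffic vehicles be set up easily in MetaDrive?

Yes, MetaDrive provides a simple option for switching traffic vehicles to reactive mode.
Recent versions (in particular the MetaDrive-Scenario add-on) have a flag that enables IDM-controlled traffic.
IDM stands for Intelligent Driver Model, a common car-following model in which a vehicle adjusts its speed to keep a safe distance from the vehicle ahead and avoid collisions.
MetaDrive's scenario runner accepts an --idm_traffic argument to "turn these vehicles into reactive vehicles".

In practice, when loading a Waymo-based scenario (which by default replays the Waymo trajectories exactly), adding --idm_traffic makes those vehicles use the Waymo trajectories as a reference while adjusting their motion to avoid collisions, so the scenario becomes interactive.
This acts as a toggle between "replay mode" and "interactive traffic mode"; in synthetic procedural scenarios too, enabling IDM makes all traffic vehicles react to the ego and to each other.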

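4. What most distinguishes MetaDrive from other open-source simulators:

MetaDrive stands out in several main respects.

Procedural generation and diversity:
- MetaDrive is designed for generalizability. A key feature is its ability to automatically generate an unlimited supply of new driving scenarios.
- It can also import real scenarios (e.g. Waymo, nuPlan) and combine them with synthetic ones, effectively mixing real and synthetic scenarios.
Lightweight and high performance:
- MetaDrive is designed to be lightweight and efficient, and can run at very high frame rates of 500 to 1000 FPS on a modest PC.
- Compared with simulators that aim for high visual fidelity, such as CARLA or AirSim, it is faster and more efficient.
- It installs with a simple pip install and supports multiple platforms (Windows, Linux, Mac), so no high-end GPU or Unreal Engine setup is needed to get started.
Built-in multi-agent and RL environment support:
- MetaDrive ships Gym-style environments for both single-agent and multi-agent settings
- They are packaged so researchers can plug them straight into reinforcement learning algorithms
- For example, scenarios such as multi-agent intersections or safe driving come with matching reward functions.
- Where other simulators leave reward computation and multi-agent handling to the user,
  - MetaDrive's API returns observations (lidar point clouds, camera images, bird's-eye-view state, etc.) and reward signals directly on reset and step, a convenient "batteries included" approach.
Integration of real data and reactive control:
- Real scenarios can be imported and turned into interactive simulations (as described for the IDM flag above); this is a modern data-driven plus simulation approach that allows planners to be tested on real scenarios without an expensive closed-loop simulator.
Sensor modalities and observations:
- MetaDrive offers several sensor modalities (lidar point clouds, RGB images, depth maps, semantic maps, etc.) and can provide data from a bird's-eye or first-person view
- CARLA and similar simulators are more photorealistic, but MetaDrive strikes a balance by quickly delivering sensor data that is sufficient for machine learning training.

In short, MetaDrive's main distinguishing feature is its focus on generalizable autonomous driving research built on efficiency and diversity; it prioritizes speed and variety over extreme realism. This makes it well suited to training and evaluating learning-based driving agents across a very large number of scenarios.
Competing simulators may offer higher visual fidelity or specific features, but MetaDrive combines unlimited variety, a lightweight footprint, and a research-oriented design (multi-agent support, simple API), which gives it a distinctive position among open-source driving simulators.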

SMARTS

Fri, 07 Mar 2025 05:18:19 GMT

https://github.com/huawei-noah/SMARTS

Questions (still need to ask these)
Which driver model specifically controls the surrounding NPC vehicle agents inside a bubble? (IDM? MOBIL? Or a learning-based algorithm?)
Can I replace the driver model of the surrounding NPC vehicle agents inside a bubble with my own deep-learning-based planner? How exactly? Explain in detail.
You said: "Advanced users can go further and implement a new TrafficProvider to integrate a fully custom driver policy ([Help Request] Wanna to know about traffic model · Issue #2119 · huawei-noah/SMARTS · GitHub)." If I want to use my own deep-learning-based planner as the behavior model for surrounding NPC vehicle agents while they are outside the bubble, how exactly should I do it? Explain in detail.

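GPT's explanation

Driver models for surrounding vehicles in closed-loop reactive traffic, and customization:

In the SMARTS simulator, surrounding vehicles (social vehicles) are by default controlled using SUMO's vehicle driving models for realism
In SUMO,
- the default longitudinal model, the Krauss model (safe-distance based, with randomness), keeps gaps between vehicles while driving, and
- lane changes are handled by a rule-based algorithm that accounts for cooperative behavior and similar factors ([Help Request] Wanna to know about traffic model · Issue #2119 · huawei-noah/SMARTS · GitHub).
These models can reflect driving style (conservative, aggressive, etc.) through user-set parameters such as acceleration, deceleration, time headway (tau), and minimum gap (minGap) ([Help Request] Wanna to know about traffic model · Issue #2119 · huawei-noah/SMARTS · GitHub).
SMARTS also has a modular structure, so swapping in your own driver model is easy.
For example, to switch to the built-in IDM (Intelligent Driver Model) algorithm, set something like car_follow_model="IDM" in the scenario's TrafficActor settings, or apply IDM at vehicle creation time through the SUMO TraCI interface ([Help Request] Wanna to know about traffic model · Issue #2119 · huawei-noah/SMARTS · GitHub).
Advanced users can go further and implement a new TrafficProvider to integrate a fully custom driver policy ([Help Request] Wanna to know about traffic model · Issue #2119 · huawei-noah/SMARTS · GitHub).
In short, surrounding vehicle behavior in SMARTS is provided by rule-based models (IDM, Krauss, etc.) that support closed-loop interaction, and it can be customized through parameter tuning or code extension.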

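Reactive models for pedestrians and cyclists:

SMARTS is designed so that, when defining a TrafficActor in the scenario DSL, Pedestrian or Cyclist types can be included in addition to vehicles.
- So pedestrians and cyclists can appear in a scenario, but their behavior is also simulated through the SUMO engine.
- SUMO uses relatively simple rule-based models for pedestrians and bicycles.
- For example, pedestrians move at a constant speed along sidewalks or crosswalks and react to traffic lights, and
- bicycles are likewise modeled as driving at a constant speed or stopping on the road, much like vehicles.
- The official SMARTS documentation does not emphasize the specific pedestrian/bicycle algorithms, though, and the default implementation focuses on cars.
Users can add custom policies for these agents if needed, but no learning-based pedestrian/cyclist model is built in.
In short, SMARTS can handle pedestrian and cyclist simulation through SUMO's logic, but
- it does not ship any special behavior AI for them, and their behavior is closer to following a fixed path than to closed-loop interaction.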

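What NGSIM is and how it relates to SMARTS:

NGSIM (Next Generation Simulation) is a large traffic dataset released by the US Department of Transportation (DOT),
- consisting of high-precision vehicle trajectories collected on real roads.
NGSIM is real driving data widely used for driver model research and simulator validation, and SMARTS supports using it in simulation.
Through the NGSIM support in SMARTS's Ecosystem modules, users can download the raw NGSIM data and integrate it into SMARTS scenarios.
Specifically, after setting the data file path in each NGSIM scenario folder's scenario.py and running scl scenario build-all, the NGSIM vehicle trajectories are converted to a SMARTS-specific format (.shf) and loaded.
In short, NGSIM is a real-world traffic dataset, and SMARTS loads it and moves the vehicles in open-loop fashion to build realistic scenarios.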

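What the "bubbles" mechanism means:

A bubble is a spatio-temporal zone concept introduced by SMARTS, used to emphasize or precisely handle interactions among specific vehicles
Controlling every vehicle in the whole simulation world with a high-complexity model consumes a lot of compute, so
- SMARTS applies special control only to vehicles that enter a zone designated as a bubble.
A social vehicle entering the bubble region first passes through an airlock zone,
- during which the agent that will control the vehicle (a predefined policy) starts observing in the background.
When the vehicle enters the core bubble zone, SMARTS hands its control from the original traffic policy over to the bubble agent,
- so that the vehicle actively interacts with the ego vehicle.
For example, by placing a bubble around the ego vehicle and making the surrounding cars inside it cut in aggressively or yield,
- specific interactions with the ego (cut-in scenarios, etc.) can be simulated intensively.
When the vehicle leaves the bubble, the bubble agent releases control and returns the vehicle to the default traffic provider (e.g. SUMO or history replay)
A vehicle that deviated from its original trajectory inside the bubble cannot return to the recorded path, so
- a simple SMARTS driving algorithm drives it along the remaining route to continue the scene.
In short, the bubble mechanism temporarily applies special control to surrounding vehicles in order to emphasize interactions in a specific region,
- concentrating simulation resources on the important interactions and achieving both efficiency and realism.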
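
The handover sequence described above can be sketched as a small state function. This is a simplified, distance-based illustration with made-up names and radii (`Controller`, `controller_for`, 30 m bubble, 10 m airlock); it is not SMARTS code and ignores the actual zone geometry.

```python
from enum import Enum
from typing import List, Sequence, Tuple


class Controller(Enum):
    TRAFFIC_PROVIDER = "traffic_provider"  # SUMO or history replay drives
    AIRLOCK_SHADOW = "airlock_shadow"      # provider drives, bubble agent observes
    BUBBLE_AGENT = "bubble_agent"          # bubble agent drives


def controller_for(distance_to_center: float,
                   bubble_radius: float = 30.0,
                   airlock_width: float = 10.0) -> Controller:
    """Pick who controls a social vehicle from its distance to the bubble center."""
    if distance_to_center <= bubble_radius:
        return Controller.BUBBLE_AGENT
    if distance_to_center <= bubble_radius + airlock_width:
        return Controller.AIRLOCK_SHADOW
    return Controller.TRAFFIC_PROVIDER


def trace_handover(distances: Sequence[float]) -> List[Tuple[Controller, bool]]:
    """Return (controller, has_deviated) per step for one vehicle.

    `has_deviated` becomes True once the bubble agent has driven the vehicle;
    from then on recorded history can no longer be replayed for it.
    """
    deviated = False
    trace: List[Tuple[Controller, bool]] = []
    for distance in distances:
        controller = controller_for(distance)
        deviated = deviated or controller is Controller.BUBBLE_AGENT
        trace.append((controller, deviated))
    return trace


# A vehicle approaching the ego-centered bubble, passing through, and leaving.
trace = trace_handover([60.0, 35.0, 20.0, 35.0, 60.0])
```

The last entry shows the vehicle back under the traffic provider but flagged as deviated, which corresponds to the case where a simple driving algorithm has to take it along the remaining route instead of replaying the log.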

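Imitation learning / behavior cloning support (training pipeline):

As its name suggests, SMARTS is a platform designed as a multi-agent reinforcement learning (MARL) environment, and its official documentation and paper focus on reinforcement learning research.
The provided examples and integrations (e.g. RLlib integration, MARL scenarios) all concentrate on making RL training and evaluation of agents easier.
By contrast, no dedicated module or automated pipeline is provided for imitation learning or behavior cloning.
That is, to do imitation learning with SMARTS, the user has to collect data in the environment and implement separate training code.
This is because SMARTS was developed for RL experiments in complex interaction scenarios within autonomous driving research.
In short, SMARTS is an RL-oriented training environment and does not include a dedicated imitation learning pipeline.
(Users can of course log agent experience in a SMARTS environment and run behavior cloning on it themselves, but that goes beyond what the platform provides directly.)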
