jun-yong.log

OpenCV build

Sat, 04 Mar 2023 08:25:40 GMT

To use OpenCV, the OpenCV library has to be set up inside the CMake workspace.

ORB_SLAM2$ mkdir Thirdparty && cd Thirdparty
Thirdparty$ mkdir -p OpenCV/build && mkdir -p OpenCV/install
Thirdparty$ cd OpenCV
OpenCV$ git clone https://github.com/opencv/opencv.git
OpenCV$ git -C opencv checkout "your_version"
OpenCV$ cd build
build$ cmake -DCMAKE_BUILD_TYPE=Release -DCMAKE_INSTALL_PREFIX=../install/ ../opencv/
build$ make -j4
build$ sudo make install

usb_cam build error

Fri, 17 Feb 2023 06:48:24 GMT

[ 50%] Built target usb_cam
[ 75%] Linking CXX executable /home/jun/xycar_ws/devel/lib/usb_cam/usb_cam_node
/home/jun/xycar_ws/devel/lib/libusb_cam.so: undefined reference to `av_packet_from_data'
/home/jun/xycar_ws/devel/lib/libusb_cam.so: undefined reference to `sws_scale'
/home/jun/xycar_ws/devel/lib/libusb_cam.so: undefined reference to `av_frame_alloc'
/home/jun/xycar_ws/devel/lib/libusb_cam.so: undefined reference to `avcodec_close'
/home/jun/xycar_ws/devel/lib/libusb_cam.so: undefined reference to `av_frame_get_buffer'
/home/jun/xycar_ws/devel/lib/libusb_cam.so: undefined reference to `av_new_packet'
/home/jun/xycar_ws/devel/lib/libusb_cam.so: undefined reference to `avcodec_alloc_context3'
/home/jun/xycar_ws/devel/lib/libusb_cam.so: undefined reference to `av_log_set_level'
/home/jun/xycar_ws/devel/lib/libusb_cam.so: undefined reference to `avcodec_free_context'
/home/jun/xycar_ws/devel/lib/libusb_cam.so: undefined reference to `av_free'
/home/jun/xycar_ws/devel/lib/libusb_cam.so: undefined reference to `avcodec_open2'
/home/jun/xycar_ws/devel/lib/libusb_cam.so: undefined reference to `avcodec_send_packet'
/home/jun/xycar_ws/devel/lib/libusb_cam.so: undefined reference to `sws_getContext'
/home/jun/xycar_ws/devel/lib/libusb_cam.so: undefined reference to `av_image_get_buffer_size'
/home/jun/xycar_ws/devel/lib/libusb_cam.so: undefined reference to `av_parser_close'
/home/jun/xycar_ws/devel/lib/libusb_cam.so: undefined reference to `av_parser_init'
/home/jun/xycar_ws/devel/lib/libusb_cam.so: undefined reference to `avcodec_register_all'
/home/jun/xycar_ws/devel/lib/libusb_cam.so: undefined reference to `sws_freeContext'
/home/jun/xycar_ws/devel/lib/libusb_cam.so: undefined reference to `avcodec_receive_frame'
/home/jun/xycar_ws/devel/lib/libusb_cam.so: undefined reference to `av_image_copy_to_buffer'
/home/jun/xycar_ws/devel/lib/libusb_cam.so: undefined reference to `avcodec_find_decoder'
collect2: error: ld returned 1 exit status
usb_cam/CMakeFiles/usb_cam_node.dir/build.make:170: recipe for target '/home/jun/xycar_ws/devel/lib/usb_cam/usb_cam_node' failed
make[2]: *** [/home/jun/xycar_ws/devel/lib/usb_cam/usb_cam_node] Error 1
CMakeFiles/Makefile2:1326: recipe for target 'usb_cam/CMakeFiles/usb_cam_node.dir/all' failed
make[1]: *** [usb_cam/CMakeFiles/usb_cam_node.dir/all] Error 2
Makefile:140: recipe for target 'all' failed
make: *** [all] Error 2
Invoking "make -j12 -l12" failed

This error means the linker cannot find the libraries that define these symbols. To fix it, check and edit the CMakeLists.txt under ~/catkin_ws/src/usb_cam/.

Use "locate libavcodec", "locate libavutil", and "locate libswscale" to find the library paths, then add the following lines to CMakeLists.txt.

set(avcodec_LINK_LIBRARIES /usr/lib/x86_64-linux-gnu/libavcodec.so)
set(avutil_LINK_LIBRARIES /usr/lib/x86_64-linux-gnu/libavutil.so)
set(swscale_LINK_LIBRARIES /usr/lib/x86_64-linux-gnu/libswscale.so)

[230106] CNN - Shallow CNN

Sat, 07 Jan 2023 13:49:36 GMT

Shallow CNN

Shallow Neural Network

Conv1 : 3x3 Convolution + sigmoid
Max Pool : Max Pooling
FC : Fully Connected Layer + sigmoid

- Backpropagation in Shallow Neural Network

Update the weights in backpropagation

Backpropagation in Shallow Neural Network (W2)

W2 equation

Backpropagation in Shallow Neural Network (W1)

W1 equation

- Max Pooling backpropagation

Max Pooling backpropagation
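
Max pooling has no weights to update, so its backward pass only routes gradients: each upstream gradient goes to the input position that held the window's maximum, and every other position receives zero. The following is a minimal pure-Python sketch for a single 2D channel, assuming non-overlapping windows (kernel equals stride) and ties resolved in favor of the first maximum. The names `max_pool_forward` and `max_pool_backward` are illustrative and do not come from the original repository.

```python
def max_pool_forward(x, kernel=2):
    """Max-pool a 2D list with stride == kernel; also return argmax positions."""
    out_h, out_w = len(x) // kernel, len(x[0]) // kernel
    out = [[0.0] * out_w for _ in range(out_h)]
    argmax = [[None] * out_w for _ in range(out_h)]
    for oh in range(out_h):
        for ow in range(out_w):
            # all (row, col) input positions covered by this window
            window = [(oh * kernel + kh, ow * kernel + kw)
                      for kh in range(kernel) for kw in range(kernel)]
            best = max(window, key=lambda p: x[p[0]][p[1]])
            argmax[oh][ow] = best
            out[oh][ow] = x[best[0]][best[1]]
    return out, argmax


def max_pool_backward(grad_out, argmax, in_h, in_w):
    """Route each upstream gradient to the position that produced the max."""
    grad_in = [[0.0] * in_w for _ in range(in_h)]
    for oh, row in enumerate(grad_out):
        for ow, g in enumerate(row):
            j, i = argmax[oh][ow]
            grad_in[j][i] += g
    return grad_in


x = [[1, 3, 2, 0],
     [4, 2, 1, 5],
     [0, 1, 9, 8],
     [2, 6, 7, 3]]
out, argmax = max_pool_forward(x)
grad_in = max_pool_backward([[1.0, 2.0], [3.0, 4.0]], argmax, 4, 4)
print(out)
print(grad_in)
```

Storing the argmax positions during the forward pass is a common design choice because the backward pass then needs no access to the input values.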

[230106] CNN - Pooling

Sat, 07 Jan 2023 08:03:48 GMT

CNN - Pooling

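Pooling is used to reduce the spatial resolution of a feature map: it resizes the feature map and lowers its resolution.

Max Pooling vs Average Pooling
- Max -> sharper result; the features of individual pixels stand out; the output is not smooth
- Avg -> each value is influenced by its neighbors, so the result looks smoothed toward the local mean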
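
To make the difference concrete, here is a small pure-Python sketch that applies both reductions to the same 4x4 feature map with 2x2 windows. The helper `pool2d` and the sample values are made up for illustration; the sketch assumes stride equals kernel size and no padding.

```python
def pool2d(x, kernel, reduce_fn):
    """Pool a 2D list with stride == kernel, reducing each window with reduce_fn."""
    out = []
    for j in range(0, len(x) - kernel + 1, kernel):
        row = []
        for i in range(0, len(x[0]) - kernel + 1, kernel):
            window = [x[j + kh][i + kw]
                      for kh in range(kernel) for kw in range(kernel)]
            row.append(reduce_fn(window))
        out.append(row)
    return out


def mean(values):
    return sum(values) / len(values)


feature_map = [[1, 1, 0, 0],
               [1, 9, 0, 0],
               [2, 2, 4, 4],
               [2, 2, 4, 4]]

max_pooled = pool2d(feature_map, 2, max)   # keeps the single strong response (9)
avg_pooled = pool2d(feature_map, 2, mean)  # spreads it over the window (3.0)
print(max_pooled)
print(avg_pooled)
```

In the top-left window the isolated value 9 survives max pooling unchanged, while average pooling dilutes it to 3.0, which matches the "sharp" versus "smoothed" description above.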

Fully Connected Layer

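Unlike the convolution operation, the fully connected layer reshapes the feature map, which carries 2D geometric information, into a 1D vector. The FC weights are built so that each element is matched with exactly one weight; the products are then summed to produce the result.
- Reshape the feature maps from 2D to 1D
- Each weight is matched with one feature map pixel

Activation

An activation function makes the meaningful values in the feature map produced by convolution and pooling stand out, and replaces values that are not meaningful with 0 or negative values. Widening this gap lets the meaningful values show clearly. An activation function is a non-linear function.

Practice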

main.py (forward_net() -> Conv2d, Pooling, fully connected layer)

from function.convolution import Conv
from function.pool import Pool
from function.fc import FC
import numpy as np
import time

def forward_net():
    """'Conv - Pooling - FC' model inference code."""
    #define
    batch = 1
    in_c = 3
    in_w = 6
    in_h = 6
    k_h = 3
    k_w = 3
    out_c = 1

    X = np.arange(batch*in_c*in_h*in_w, dtype=np.float32).reshape([batch,in_c,in_h,in_w])  # layout [b, c, h, w]
    W1 = np.array(np.random.standard_normal([out_c,in_c,k_h,k_w]), dtype=np.float32)

    Convolution = Conv(batch = batch,
                        in_c = in_c,
                        out_c = out_c,
                        in_h = in_h,
                        in_w = in_w,
                        k_h = k_h,
                        k_w = k_w,
                        dilation = 1,
                        stride = 1,
                        pad = 0)

    L1 = Convolution.gemm(X,W1)

    print("L1 shape : ", L1.shape)
    print(L1)

    Pooling = Pool(batch=batch,
                   in_c = 1,
                   out_c = 1,
                   in_h = 4,
                   in_w = 4,
                   kernel=2,
                   dilation=1,
                   stride=2,
                   pad = 0)

    L1_MAX = Pooling.pool(L1)
    print("L1_MAX shape : ", L1_MAX.shape)
    print(L1_MAX)

    #fully connected layer
    W2 = np.array(np.random.standard_normal([1, L1_MAX.shape[1] * L1_MAX.shape[2] * L1_MAX.shape[3]]), dtype=np.float32)
    Fc = FC(batch = L1_MAX.shape[0],
            in_c = L1_MAX.shape[1],
            out_c = 1,
            in_h = L1_MAX.shape[2],
            in_w = L1_MAX.shape[3])

    L2 = Fc.fc(L1_MAX, W2)

    print("L2 shape : ", L2.shape)
    print(L2)

if __name__ == "__main__":
    forward_net()

pool.py

import numpy as np

# 2D Pooling
class Pool:
    def __init__(self, batch, in_c, out_c, in_h, in_w, kernel, dilation, stride, pad):
        self.batch = batch
        self.in_c = in_c
        self.out_c = out_c
        self.in_h = in_h
        self.in_w = in_w
        self.kernel = kernel
        self.dilation = dilation
        self.stride = stride
        self.pad = pad
        self.out_w = (in_w - kernel + 2 * pad) // stride + 1
        self.out_h = (in_h - kernel + 2 * pad) // stride + 1

    def pool(self, A):
        C = np.zeros([self.batch, self.out_c, self.out_h, self.out_w], dtype=np.float32)
        for b in range(self.batch):
            for c in range(self.in_c):
                for oh in range(self.out_h):
                    a_j = oh * self.stride - self.pad
                    for ow in range(self.out_w):
                        a_i = ow * self.stride - self.pad
                        # index batch b explicitly; A[:, c, ...] would take the max across the whole batch
                        max_value = np.amax(A[b, c, a_j:a_j+self.kernel, a_i:a_i+self.kernel])
                        C[b, c, oh, ow] = max_value
        return C

fc.py

import numpy as np

class FC:
    def __init__(self, batch, in_c, out_c, in_h, in_w):
        self.batch = batch
        self.out_c = out_c
        self.in_c = in_c
        self.in_h = in_h
        self.in_w = in_w 

    def fc(self, A, W):
        #A shape : [b,in_c, in_h, in_w]
        a_mat = A.reshape([self.batch, -1])
        B = np.dot(a_mat, np.transpose(W, (1,0)))
        return B

main.py (activation)

import matplotlib.pyplot as plt
import numpy as np
from function.activation import *

def plot_activation():
    """Plot the activation outputs for inputs in [-10, 10).

    activations : relu, leaky_relu, sigmoid, tanh
    """
    x = np.arange(-10,10,1)

    out_relu = relu(x)
    out_leaky = leaky_relu(x)
    out_sigmoid = sigmoid(x)
    out_tanh = tanh(x)

    #print(out_relu, out_leaky, out_sigmoid, out_tanh)

    plt.plot(x, out_relu, 'r', label='relu')
    plt.plot(x, out_leaky, 'b', label='leaky')
    plt.plot(x, out_sigmoid, 'g', label='sigmoid')
    plt.plot(x, out_tanh, 'bs', label='tanh')
    plt.ylim([-2,2])
    plt.legend()
    plt.show()

activation.py

import numpy as np

def relu(x):
    x_shape = x.shape
    x = np.reshape(x,[-1])
    x = [max(v,0) for v in x]
    x = np.reshape(x, x_shape)
    return x

def leaky_relu(x):
    x_shape = x.shape
    x = np.reshape(x, [-1])
    x = [max(0.1*v,v) for v in x]
    x = np.reshape(x, x_shape)
    return x

def sigmoid(x):
    x_shape = x.shape
    x = np.reshape(x,[-1])
    x = [ 1 / (1 + np.exp(-v)) for v in x]
    x = np.reshape(x, x_shape)
    return x

def tanh(x):
    x_shape = x.shape
    x = np.reshape(x, [-1])
    x = [np.tanh(v) for v in x]
    x = np.reshape(x,x_shape)
    return x

https://github.com/Jun-yong-lee/pytorch_study/tree/Convolution

[230106-1] CNN

Fri, 06 Jan 2023 12:59:33 GMT

CNN

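CNN stands for Convolutional Neural Network, a network specialized for image processing. A brief look at the CNN pipeline next to the human perception process is given below.

Convolution is the operation of sliding a small kernel (e.g. 3x3 or 5x5), also called a convolution matrix, over the input.

Padding : Add zero values at the boundary of the input image
Stride : Step size by which the convolution kernel's sliding window moves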

Output shape
- oh = (ih - kh + padding*2)//stride + 1
- ow = (iw - kw + padding*2)//stride + 1
  - b : batch
  - ic : in_channel
  - ih : in_height
  - iw : in_width
  - oc : out_channel
  - kh : kernel_h
  - kw : kernel_w

A shape : [b, ic, ih, iw]
W shape : [oc, kc, kh, kw]
B shape : [b, oc, oh, ow]
- Weight sharing between batch of A
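
The output-shape formula and the three tensor shapes above can be checked with a few lines of Python. This is a small sketch; the helper names `conv_output_size` and `conv_shapes` are illustrative, and the example values are the 6x6, 3-channel input with a 3x3 kernel and 16 output channels used in the code later in this post.

```python
def conv_output_size(in_size, kernel, pad=0, stride=1):
    """Output length along one spatial axis: (in - k + 2*pad) // stride + 1."""
    return (in_size - kernel + 2 * pad) // stride + 1


def conv_shapes(b, ic, ih, iw, oc, kh, kw, pad=0, stride=1):
    """Return the shapes of A (input), W (weights) and B (output)."""
    oh = conv_output_size(ih, kh, pad, stride)
    ow = conv_output_size(iw, kw, pad, stride)
    # kc (kernel channels) always equals ic (input channels)
    return [b, ic, ih, iw], [oc, ic, kh, kw], [b, oc, oh, ow]


a_shape, w_shape, b_shape = conv_shapes(b=1, ic=3, ih=6, iw=6, oc=16, kh=3, kw=3)
print("A", a_shape, "W", w_shape, "B", b_shape)
print("same-size output with pad=1:", conv_output_size(6, 3, pad=1))
```

Note that W has no batch dimension, which is what "weight sharing between batch of A" means: the same kernels are applied to every sample in the batch.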

Convolution Operation
MAC (Multiply-Accumulate operation)
- Convolution MAC : kw x kh x kc x oc x ow x oh x b ① : kw x kh x oc, ② : kw x kh x kc x oc
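
As a quick sanity check of the MAC formula, the sketch below evaluates it for the example used later in this post (batch 1, 3 input channels, 3x3 kernel, 16 output channels, 4x4 output) and compares it with a brute-force count of the innermost iterations of the seven nested loops. The function names are illustrative.

```python
from itertools import product


def conv_macs(b, oc, oh, ow, kc, kh, kw):
    """Multiply-accumulate count of a direct convolution."""
    return kw * kh * kc * oc * ow * oh * b


def count_macs_by_loops(b, oc, oh, ow, kc, kh, kw):
    """Count one MAC per innermost iteration of the 7 nested loops."""
    return sum(1 for _ in product(range(b), range(oh), range(ow), range(oc),
                                  range(kc), range(kh), range(kw)))


macs = conv_macs(b=1, oc=16, oh=4, ow=4, kc=3, kh=3, kw=3)
looped = count_macs_by_loops(b=1, oc=16, oh=4, ow=4, kc=3, kh=3, kw=3)
print(macs, looped)
```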

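# 7 loops in convolution operation (stride 1, no padding, B initialized to zeros)
for b in range(batch):
    for oh in range(out_height):
        for ow in range(out_width):
            for oc in range(out_channel):
                for kc in range(kernel_channel):
                    for kh in range(kernel_height):
                        for kw in range(kernel_width):
                            B[b, oc, oh, ow] += A[b, kc, oh + kh, ow + kw] * W[oc, kc, kh, kw]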

IM2COL & GEMM
- IM2COL
  - Transform n-dimension data into 2D matrix data
  - more efficient operation!!
- GEMM
  - General Matrix-Matrix Multiplication

Sliding window Convolution

Timing result -> L1 time : 0.0365447998046875

IM2COL GEMM convolution

Timing result -> L2 time : 0.0050013065338134766

pytorch convolution

Timing result -> L3 time : 0.0

            Sliding window        IM2COL GEMM             pytorch
Time (s)    0.0365447998046875    0.0050013065338134766   0.0
The measured times are shown above (each covers 5 runs). Inspecting the output tensors shows that all three convolutions produce the same result.
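
An L3 time of exactly 0.0 most likely means the five PyTorch calls finished within one tick of `time.time()`, whose resolution can be coarse on some platforms. That is an assumption, since the post does not say which OS was used. `time.perf_counter()` is the standard-library clock intended for short intervals. Below is a minimal timing helper as a sketch; `time_call` and `busy_work` are hypothetical names, and `busy_work` merely stands in for `Convolution.conv(X, W)` and the other two variants.

```python
import time


def time_call(fn, *args, repeats=5):
    """Return the average wall-clock seconds per call of fn(*args)."""
    start = time.perf_counter()
    for _ in range(repeats):
        fn(*args)
    return (time.perf_counter() - start) / repeats


def busy_work(n):
    # stand-in workload; in the post this would be one of the convolutions
    return sum(i * i for i in range(n))


elapsed = time_call(busy_work, 10_000)
print(f"busy_work: {elapsed:.6f} s per call")
```

For very fast operations, raising `repeats` gives a more stable average.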

Code

main

import numpy as np
import time
import torch
import torch.nn as nn

from function.convolution import Conv

def convolution():
    print("convolution")

    # define the shape of input & weight

    in_w = 6
    in_h = 6
    in_c = 3
    out_c = 16
    batch = 1
    k_w = 3
    k_h = 3

    X = np.arange(in_w * in_h * in_c * batch, dtype=np.float32).reshape([batch, in_c, in_h, in_w])
    W = np.array(np.random.standard_normal([out_c, in_c, k_h, k_w]), dtype=np.float32)

    Convolution = Conv(batch=batch,
                in_c=in_c,
                out_c=out_c,
                in_h=in_h,
                in_w=in_w,
                k_h=k_h,
                k_w=k_w,
                dilation=1,
                stride=1,
                pad=0)

    # print(f"X = {X}")
    # print(f"W = {W}, W.shape = {W.shape}")

    L1_time = time.time()

    for i in range(5):
        L1 = Convolution.conv(X, W)            
    print(f"L1 time : {time.time() - L1_time}")
    print(f"L1 : {L1}")

    L2_time = time.time()
    for i in range(5):
        L2 = Convolution.gemm(X, W)
    print(f"L2 time : {time.time() - L2_time}")
    print(f"L2 : {L2}")

    torch_conv = nn.Conv2d(in_c,
                           out_c,
                           kernel_size=k_h,
                           stride=1,
                           padding=0,
                           bias=False,
                           dtype=torch.float32)
    torch_conv.weight = torch.nn.Parameter(torch.tensor(W))

    L3_time = time.time()
    for i in range(5):
        L3 = torch_conv(torch.tensor(X, requires_grad=False, dtype=torch.float32))
    print(f"L3 time : {time.time() - L3_time}")
    print(f"L3 : {L3}")

if __name__ == "__main__":
    convolution()

convolution.py

import numpy as np

class Conv:
    def __init__(self, batch, in_c, out_c, in_h, in_w, k_h, k_w, dilation, stride, pad):
        self.batch = batch
        self.in_c = in_c
        self.out_c = out_c
        self.in_h = in_h
        self.in_w = in_w
        self.k_h = k_h
        self.k_w = k_w
        self.dilation = dilation
        self.stride = stride
        self.pad = pad

        self.out_h = (in_h - k_h + 2 * pad) // stride + 1
        self.out_w = (in_w - k_w + 2 * pad) // stride + 1

    def check_range(self, a, b):
        return a > -1 and a < b

    # naive convolution, sliding window method
    def conv(self, A, B):
        C = np.zeros((self.batch, self.out_c, self.out_h, self.out_w), dtype=np.float32)

        # seven loop
        for b in range(self.batch):
            for oc in range(self.out_c):
                # each channel of output
                for oh in range(self.out_h):
                    for ow in range(self.out_w):
                        # one pixel of output shape
                        a_j = oh * self.stride - self.pad
                        for kh in range(self.k_h):
                            a_i = ow * self.stride - self.pad
                            for kw in range(self.k_w):
                                # taps that land in the zero padding contribute nothing
                                if self.check_range(a_j, self.in_h) and self.check_range(a_i, self.in_w):
                                    C[b, oc, oh, ow] += np.dot(A[b, :, a_j, a_i], B[oc, :, kh, kw])
                                a_i += self.dilation  # next kernel column
                            a_j += self.dilation  # next kernel row
        return C

    # IM2COL. Change n-dim input to 2-dim matrix
    def im2col(self, A):
        # output
        mat = np.zeros((self.in_c * self.k_h * self.k_w, self.out_w * self.out_h), dtype=np.float32)

        mat_j = 0
        mat_i = 0
        for c in range(self.in_c):
            for kh in range(self.k_h):
                for kw in range(self.k_w):
                    in_j = kh * self.dilation - self.pad
                    for oh in range(self.out_h):
                        if not self.check_range(in_j, self.in_h):
                            for ow in range(self.out_w):
                                mat[mat_j, mat_i] = 0
                                mat_i += 1
                        else:
                            in_i = kw * self.dilation - self.pad
                            for ow in range(self.out_w):
                                if not self.check_range(in_i, self.in_w):
                                    mat[mat_j, mat_i] = 0
                                    mat_i += 1
                                else:
                                    mat[mat_j, mat_i] = A[0, c, in_j, in_i]
                                    mat_i += 1
                                in_i += self.stride
                        in_j += self.stride
                    mat_i = 0
                    mat_j += 1
        return mat

    # gemm. 2D matrix multiplication
    def gemm(self, A, B):
        a_mat = self.im2col(A)
        b_mat = B.reshape(B.shape[0],-1)
        c_mat = np.matmul(b_mat, a_mat)
        c = c_mat.reshape([self.batch, self.out_c, self.out_h, self.out_w])
        return c

https://github.com/Jun-yong-lee/pytorch_study/tree/Convolution

[230105-2] Perception in self driving car

Fri, 06 Jan 2023 11:41:59 GMT

Pytorch LeNet5 MNIST 학습

main.py

import argparse
import sys, os
import torch
import torch.nn as nn
from torchvision.datasets import MNIST
import torchvision.transforms as transforms
from torch.utils.data.dataloader import DataLoader
import torch.optim as optim

from model.models import *
from loss.loss import *
from util.tools import *

def parse_args():
    parser = argparse.ArgumentParser(description="MNIST")
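    parser.add_argument('--mode', dest='mode', help="train / eval / test",
                        default=None, type=str)
    # type=bool would turn any non-empty string (even "0") into True
    parser.add_argument('--download', dest='download', help="download MNIST dataset (1 or 0)",
                        default=False, type=lambda s: s.lower() in ("1", "true", "yes"))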
    parser.add_argument('--output_dir', dest='output_dir', help="output directory",
                        default='./output', type=str)
    parser.add_argument('--checkpoint', dest='checkpoint', help="checkpoint trained model",
                        default=None, type=str)

    if len(sys.argv) == 1:
        parser.print_help()
        sys.exit()
    args = parser.parse_args()
    return args

def get_data():
    my_transform = transforms.Compose([
        transforms.Resize([32, 32]),
        transforms.ToTensor(),
        transforms.Normalize((0.5,), (1.0,))
    ])
    download_root = "./mnist_dataset"
    train_dataset = MNIST(root=download_root,
                          transform=my_transform,
                          train=True,
                          download=args.download)
    eval_dataset = MNIST(root=download_root,
                         transform=my_transform,
                         train=False,
                         download=args.download)
    test_dataset = MNIST(root=download_root,
                         transform=my_transform,
                         train=False,
                         download=args.download)

    return train_dataset, eval_dataset, test_dataset

def main():
    print(torch.__version__)

    if not os.path.isdir(args.output_dir):
        os.mkdir(args.output_dir)

    if torch.cuda.is_available():
        print("gpu")
        device = torch.device("cuda")
    else:
        print("cpu")
        device = torch.device("cpu")

    # Get MNIST Dataset
    train_dataset, eval_dataset, test_dataset = get_data()

    # Make DataLoader
    train_loader = DataLoader(train_dataset,
                              batch_size=8,
                              num_workers=0,
                              pin_memory=True,
                              drop_last=True,
                              shuffle=True)
    eval_loader = DataLoader(eval_dataset,
                            batch_size=1,
                            num_workers=0,
                            pin_memory=True,
                            drop_last=False,
                            shuffle=False)
    test_loader = DataLoader(test_dataset,
                        batch_size=1,
                        num_workers=0,
                        pin_memory=True,
                        drop_last=False,
                        shuffle=False)

    _model = get_model('lenet5')

    # LeNet5

    if args.mode == "train": # python main.py --mode "train" --download 1 --output_dir ./output
        model = _model(batch=8, n_classes=10, in_channel=1, in_width=32, in_height=32, is_train=True)
        model.to(device)
        model.train() # train mode

        # optimizer & scheduler
        optimizer = optim.SGD(model.parameters(), lr=0.01, momentum=0.9)
        scheduler = optim.lr_scheduler.StepLR(optimizer, step_size=5, gamma=0.1)

        criterion = get_criterion(crit='mnist', device=device)

        epoch = 15
        iter = 0
        for e in range(epoch):
            total_loss = 0
            for i, batch in enumerate(train_loader):
                img = batch[0]
                gt = batch[1]

                img = img.to(device)
                gt = gt.to(device)

                out = model(img)

                loss_val = criterion(out, gt)

                # backpropagation
                loss_val.backward()
                optimizer.step()
                optimizer.zero_grad()

                total_loss += loss_val.item()

                if iter % 100 == 0:
                    print(f"{e} epoch {iter} iter loss : {loss_val.item()}")
                iter += 1

            mean_loss = total_loss / (i + 1)  # i is the last 0-based batch index
            scheduler.step()

            print(f"->{e} epoch mean loss : {mean_loss}")
            torch.save(model.state_dict(), args.output_dir + "/model_epoch" + str(e)+".pt")
        print("Train end")


    elif args.mode == "eval":
        # python main.py --mode "eval" --download 1 --output_dir ./output \ 
        # --checkpoint ./output/model_epoch2.pt
        model = _model(batch=1, n_classes=10, in_channel=1, in_width=32, in_height=32)
        # load trained model
        checkpoint = torch.load(args.checkpoint, map_location=device)
        model.load_state_dict(checkpoint)
        model.to(device)
        model.eval() # not train()

        acc = 0
        num_eval = 0

        for i, batch in enumerate(eval_loader):
            img = batch[0]
            gt = batch[1] # ground truth

            img = img.to(device)

            # inference
            out = model(img)

            out = out.cpu()

            if out == gt:
                acc += 1
            num_eval += 1

        print(f"Evaluation Score : {acc} / {num_eval}")

    elif args.mode == "test":
        # python main.py --mode "test" --download 1 --output_dir ./output \
        # --checkpoint ./output/model_epoch2.pt
        model = _model(batch=1, n_classes=10, in_channel=1, in_width=32, in_height=32)
        checkpoint = torch.load(args.checkpoint, map_location=device)
        model.load_state_dict(checkpoint)
        model.to(device)
        model.eval() # not train()

        for i, batch in enumerate(test_loader):
            img = batch[0]
            img = img.to(device)

            # inference
            out = model(img)
            out = out.cpu()

            print(out)

            # show result
            show_img(img.cpu().numpy(), str(out.item()))

if __name__ == "__main__":
    args = parse_args()
    main()

# image classification sequence
# 1. Get dataset
# 2. Make Dataloader (build the dataset used for training)
# 3. design model
# 4. training
# 5. optimizer & scheduler
# 6. loss function
# 7. forward -> loss_val
# 8. loss_val -> backpropagation -> optimizer.step(), optimizer.zero_grad()
# 9. save model

model

lenet5.py

import torch
import torch.nn as nn

class Lenet5(nn.Module):
    def __init__(self, batch, n_classes, in_channel, in_width, in_height, is_train=False):
        super().__init__()
        self.batch = batch
        self.n_classes = n_classes
        self.in_width = in_width
        self.in_height = in_height
        self.in_channel = in_channel
        self.is_train = is_train

        # convolution output : [(W - K + 2P)/S] + 1

        # [(32 - 5 + 2*0) / 1] + 1 = 28
        self.conv0 = nn.Conv2d(self.in_channel, 6, kernel_size=5, stride=1, padding=0)
        self.pool0 = nn.AvgPool2d(2, stride=2)
        self.conv1 = nn.Conv2d(6, 16, kernel_size=5, stride=1, padding=0)
        self.pool1 = nn.AvgPool2d(2, stride=2)
        self.conv2 = nn.Conv2d(16, 120, kernel_size=5, stride=1, padding=0)

        # fully-connected layer
        self.fc3 = nn.Linear(120, 84)
        self.fc4 = nn.Linear(84, self.n_classes)

    def forward(self, x):
        # x' shape : [B, C, H, W]
        x = self.conv0(x)
        x = torch.tanh(x)
        x = self.pool0(x)
        x = self.conv1(x)
        x = torch.tanh(x)
        x = self.pool1(x)
        x = self.conv2(x)
        x = torch.tanh(x)
        # change format from 4dim -> 2dim ( [B, C, H, W] -> [B, C*H*W] )
        x = torch.flatten(x, start_dim=1)
        x = self.fc3(x)
        x = torch.tanh(x)
        x = self.fc4(x)
        x = x.view(self.batch, -1)

        # return raw logits while training: nn.CrossEntropyLoss applies
        # log-softmax itself, so a softmax here would be applied twice
        if self.is_train is False:
            x = torch.argmax(x, dim=1)
        return x

models.py

from model.lenet5 import Lenet5

def get_model(model_name):
    if (model_name == "lenet5"):
        return Lenet5
    else:
        print("unknown model")

loss

loss.py

import torch
import torch.nn as nn
import sys

class MNISTloss(nn.Module):
    def __init__(self, device=torch.device('cpu')):
        super(MNISTloss, self).__init__()
        self.loss = nn.CrossEntropyLoss().to(device)

    def forward(self, out, gt):
        loss_val = self.loss(out, gt)
        return loss_val

def get_criterion(crit = "mnist", device=torch.device('cpu')):
    if crit == "mnist":
        return MNISTloss(device=device)
    else:
        print("unknown criterion")
        sys.exit(1)
        return

util

tools.py

from PIL import Image, ImageDraw
import numpy as np
import matplotlib.pyplot as plt

def show_img(img_data, text):
    _img_data = (img_data + 0.5) * 255  # undo Normalize((0.5,), (1.0,)) before scaling to 0..255

    # 4D -> 2D

    _img_data = np.array(_img_data[0, 0], dtype=np.uint8)

    img_data = Image.fromarray(_img_data)
    draw = ImageDraw.Draw(img_data)

    cx, cy = int(_img_data.shape[0] / 2), int(_img_data.shape[1] / 2)

    # draw text in image
    if text is not None:
        draw.text((cx, cy), text)

    plt.imshow(img_data)
    plt.show()

https://github.com/Jun-yong-lee/pytorch_study/tree/pytorch_MNIST

[230105-1] Perception in self driving car

Fri, 06 Jan 2023 05:59:47 GMT

Basic PyTorch usage

Code

import torch
import numpy as np

# print(torch.__version__)

def make_tensor():
    # int16
    a = torch.tensor([[1, 2], [3, 4]], dtype=torch.int16)
    # float
    b = torch.tensor([2], dtype=torch.float32)
    # double
    c = torch.tensor([3], dtype=torch.float64)

    # print(a, b, c)

    tensor_list = [a, b, c]

    for t in tensor_list:
        print(f"shape of tensor {t.shape}")
        print(f"datatype of tensor {t.dtype}")
        print(f"device tensor is stored on {t.device}")

def sumsub_tensor():
    a = torch.tensor([3, 2])
    b = torch.tensor([5, 3])

    print(f"input {a}, {b}")

    # sum
    sum = a + b
    print(f"sum : {sum}")
    # sub
    sub = a - b
    print(f"sub : {sub}")

    sum_element_a = a.sum()
    print(f"sum_element_a : {sum_element_a}")

def muldiv_tensor():
    a = torch.arange(0, 9).view(3, 3)
    b = torch.arange(0, 9).view(3, 3)
    print(f"input tensor :\n {a} \n {b}")

    # mat_mul
    c = torch.matmul(a, b) # matrix multiplication
    print(f"mat_mul : {c}")

    # elementwise multiplication
    d = torch.mul(a, b)
    print(f"elementwise mul : {d}")

def reshape_tensor():
    a = torch.tensor([2, 4, 5, 6, 7, 8])
    print(f"input tensor : \n {a}")

    # view
    b = a.view(2, 3)
    print(f"view \n {b}")

    # transpose
    bt = b.t()
    print(f"transpose \n {bt}")

def access_tensor():
    a = torch.arange(1, 13).view(4, 3)
    print(f"input : \n {a}")

    # first col
    print(a[:, 0])
    # first row
    print(a[0, :])
    # [1, 1]
    print(a[1, 1])

def transform_numpy():
    a = torch.arange(1, 13).view(4, 3)
    print(f"input : \n {a}")

    a_np = a.numpy()
    print(f"numpy : {a_np}")

    b = np.array([1, 2, 3])
    bt = torch.from_numpy(b)
    print(bt)

def concat_tensor():
    a = torch.arange(1, 10).view(3, 3)
    b = torch.arange(10, 19).view(3, 3)
    c = torch.arange(19, 28).view(3, 3)

    abc = torch.cat([a, b, c], dim=0)

    print(f"input tensor : \n {a} \n {b} \n {c}")
    print(f"concat : \n {abc}")
    print(abc.shape)

def stack_tensor():
    a = torch.arange(1, 10).view(3, 3)
    b = torch.arange(10, 19).view(3, 3)
    c = torch.arange(19, 28).view(3, 3)

    abc = torch.stack([a, b, c], dim=0)

    print(f"input tensor : \n {a} \n {b} \n {c}")
    print(f"stack : \n {abc}")
    print(abc.shape)

def transpose_tensor():
    a = torch.arange(1, 10).view(3, 3)
    print(f"input tensor : \n {a}")

    # transpose
    at = torch.transpose(a, 0, 1)
    print(f"transpose : \n {at}")

    b = torch.arange(1, 25).view(4, 3, 2)
    print(f"input b tensor : \n {b}")

    bt = torch.transpose(b, 0, 2)
    print(f"transpose : \n {bt}")
    print(bt.shape)

    bp = b.permute(2, 0, 1) # 0, 1, 2
    print(f"permute : \n {bp}")
    print(bp.shape)

if __name__ == "__main__":
    # make_tensor()
    # sumsub_tensor()
    # muldiv_tensor()
    # reshape_tensor()
    # access_tensor()
    # transform_numpy()
    # concat_tensor()
    # stack_tensor()
    transpose_tensor()

Results of each function

- make_tensor()

- sumsub_tensor()

- muldiv_tensor()

- reshape_tensor()

- access_tensor()

- transform_numpy()

- concat_tensor()

- stack_tensor()

- transpose_tensor()

https://github.com/Jun-yong-lee/pytorch_study/tree/pytorch_prac

[230103-4] Deep Learning: Neural Network Fundamentals - Machine Learning III

Thu, 05 Jan 2023 15:03:39 GMT

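Deep Learning: Neural Network Fundamentals - Machine Learning III

1.4 A simple machine learning example

Elements of machine learning
- Card approval example and its elements
- Card approval as a supervised learning example
- Machine learning setup
  - In the supervised learning case,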

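1.5.1 Underfitting and overfitting

The first-order model in [Figure 1-13] is underfitting
- The model's capacity is too small, so the error is inevitably large
Alternative: use a nonlinear model
- The 2nd-, 3rd-, 4th-, and 12th-order curves in [Figure 1-13] are examples of choosing polynomial curves
- The error drops substantially compared with the first-order (linear) model
Overfitting
- A 12th-order polynomial curve approximates the training set almost perfectly
- But it causes serious problems when predicting 'new' data
  - At x₀ it should predict a value near the red bar, but it predicts the red point
- The reason is that the model's capacity is so large that it absorbs even the noise during training -> overfitting
  - It over-focuses on the training set and simply memorizes it
- A model selection step is needed to choose a model with appropriate capacity
Comparing 1st- to 12th-order polynomial models
- 1st-2nd order: low performance on both the training set and the test set
- 12th order: high performance on the training set but low performance on the test set -> poor generalization
- 3rd-4th order: lower than 12th order on the training set, but high performance on the test set -> good generalization
Relationship between a model's generalization ability and its capacity
Three examples of model fit to the training set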

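1.5.2 Bias and variance

An experiment that collects the training set several times and fits 1st- to 12th-order models
- 2nd order: large error every time -> high bias. But the fitted models are similar -> low variance
- 12th order: small error every time -> low bias. But the fitted models differ greatly -> high variance
- In general, low-capacity models have high bias and low variance; complex models have low bias and high variance
- Bias and variance are in a trade-off relationship
Goal of machine learning
- The goal is a predictive model with low bias and low variance (lower left)
- However, a model's bias and variance trade off against each other
- So a strategy is needed that keeps bias as low as possible while also reducing variance as much as possible
Relationship between bias and variance
- Capacity increases -> bias tends to decrease, variance tends to increase
- Generalization error (= bias + variance) follows a U-shaped curve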
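
The "bias + variance" shorthand corresponds to the standard decomposition of expected squared error. Written out, with notation that is an assumption here and not taken from the book: $y = f(x) + \varepsilon$ with zero-mean noise of variance $\sigma^2$, and $\hat f_D$ the model trained on a randomly drawn training set $D$.

```latex
\mathbb{E}_{D,\varepsilon}\!\left[\big(y - \hat f_D(x)\big)^2\right]
  = \underbrace{\big(\mathbb{E}_D[\hat f_D(x)] - f(x)\big)^2}_{\text{bias}^2}
  + \underbrace{\mathbb{E}_D\!\left[\big(\hat f_D(x) - \mathbb{E}_D[\hat f_D(x)]\big)^2\right]}_{\text{variance}}
  + \underbrace{\sigma^2}_{\text{irreducible noise}}
```

As capacity grows the first term shrinks and the second grows, which is what produces the U-shaped curve; the noise term does not depend on the model.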

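1.5.3 Model selection using a validation set and cross-validation

Model selection using a validation set
- A separate validation set, distinct from the training and test sets, is available (when there is plenty of data)
Cross-validation
- A model selection technique useful when no separate validation set is available because of cost (when data is scarce)
- Split the training set into equal parts, repeat training and evaluation several times, and use the average
Example of 10-fold cross-validation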
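
The splitting step of k-fold cross-validation can be sketched in pure Python as follows. `k_fold_indices` and `cross_validate` are illustrative names, and `train_and_score` is a hypothetical callback that would train on the training indices and return a validation score. The sketch uses contiguous folds; in practice the indices are usually shuffled first.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_idx, val_idx) pairs for k contiguous folds."""
    # the first (n_samples % k) folds get one extra sample
    fold_sizes = [n_samples // k + (1 if f < n_samples % k else 0) for f in range(k)]
    start = 0
    for size in fold_sizes:
        val_idx = list(range(start, start + size))
        train_idx = list(range(0, start)) + list(range(start + size, n_samples))
        yield train_idx, val_idx
        start += size


def cross_validate(n_samples, k, train_and_score):
    """Average the validation score over k folds."""
    scores = [train_and_score(tr, va) for tr, va in k_fold_indices(n_samples, k)]
    return sum(scores) / len(scores)


folds = list(k_fold_indices(n_samples=25, k=10))
for train_idx, val_idx in folds[:2]:
    print(len(train_idx), "train /", val_idx, "validation")
```

Every sample is used for validation exactly once and for training in the other k-1 rounds, which is why the method suits small datasets.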
Bootstrap
- Repeated random sampling with replacement
  - Applied when the data distribution is imbalanced
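
Sampling with replacement is a one-liner with the standard library. A minimal sketch, where `bootstrap_sample` is an illustrative name and the fixed seed is only there to make the run repeatable:

```python
import random


def bootstrap_sample(data, rng):
    """Draw len(data) items from data with replacement."""
    return rng.choices(data, k=len(data))


rng = random.Random(0)  # fixed seed so the run is repeatable
data = list(range(10))
samples = [bootstrap_sample(data, rng) for _ in range(3)]
for s in samples:
    # duplicates are expected; items missing from s are "out-of-bag"
    print(sorted(s), "out-of-bag:", sorted(set(data) - set(s)))
```

Because items are drawn with replacement, each resample typically repeats some samples and omits others, so every resample gives a slightly different view of the same dataset.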

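1.5.4 Limits of model selection and a practical solution

Strategy of modern machine learning
- Choose a model with sufficiently large capacity, then apply various regularization techniques so that the chosen model does not stray from normal behavior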

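1.6 Regularization

1.6.1 Data augmentation

Collecting more data improves generalization

Data collection is expensive
- People must label the ground truth by hand, one sample at a time
Artificial data augmentation
- Transform the samples in the training set
- ex) slight rotation or warping (take care not to change intrinsic properties of the original data, such as its class membership)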

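1.6.2 Weight decay

A technique that keeps the weights small
- The 12th-order curve in [Figure 1-18(a)] has very large weights
- Weight decay is a regularization technique that keeps the weights small by using a modified objective function
  - The second term of Eq. (1.11) is the regularization term, which keeps the weight magnitudes small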
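
Eq. (1.11) itself is not reproduced in these notes. Assuming it has the common L2 form, objective = data error + λ · (sum of squared weights), the effect of the second term can be sketched as follows. The function names, the toy data, and λ = 0.01 are all made up for illustration.

```python
def mse(weights, xs, ys):
    """Mean squared error of the polynomial sum(w_k * x**k)."""
    preds = [sum(w * x ** k for k, w in enumerate(weights)) for x in xs]
    return sum((p - y) ** 2 for p, y in zip(preds, ys)) / len(xs)


def l2_penalty(weights, lam):
    """Regularization term: lam times the squared L2 norm of the weights."""
    return lam * sum(w * w for w in weights)


def regularized_objective(weights, xs, ys, lam):
    """Data term plus weight-decay term."""
    return mse(weights, xs, ys) + l2_penalty(weights, lam)


xs, ys = [0.0, 1.0, 2.0], [1.0, 3.0, 5.0]   # three points on y = 1 + 2x
small = [1.0, 2.0]                          # the line itself
large = [1.0, 22.0, -30.0, 10.0]            # cubic through the same points

for name, w in (("small", small), ("large", large)):
    print(name, mse(w, xs, ys), regularized_objective(w, xs, ys, lam=0.01))
```

Both weight vectors fit the three training points with zero error, so the data term alone cannot tell them apart; only the penalty term makes the objective prefer the small-weight solution.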

1.7.1 Types by form of supervision

Supervised learning
- Both the feature vectors 𝕏 and the target values 𝕐 are given
- Divided into regression and classification problems
Unsupervised learning
- The feature vectors 𝕏 are given but the target values 𝕐 are not (no correct answers)
- Clustering tasks (e.g. targeted marketing based on customer preferences)
- Density estimation, feature space transformation tasks (PCA)

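Reinforcement learning
- A (relative) target is given, but in a different form from supervised learning (namely a reward)
- ex) Go
  - Each move is a sample, and a single target value is assigned when the game ends: 1 for a win, -1 for a loss
  - The target value must be distributed among the individual samples that made up the game
Semi-supervised learning
- Some samples have both 𝕏 and 𝕐, while the rest have only 𝕏
- Increasingly important because, for most data, collecting 𝕏 is easy but 𝕐 requires manual work

1.7.2 Types by other criteria

Offline learning and online learning
- Offline learning is the usual setting
- Online learning performs incremental learning with samples that arrive later, e.g. from IoT devices
Deterministic learning and stochastic learning
- In deterministic learning, retraining on the same data produces the same predictor
- Stochastic learning uses random numbers during training, so retraining on the same data produces a different predictor. The prediction process usually uses random numbers as well
Discriminative models and generative models
- A discriminative model is only concerned with class prediction, i.e. estimating P(y|x)
- A generative model estimates P(x) or P(x|y)
  - It can therefore 'generate' new samples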

[230103-3] Deep Learning: Neural Network Fundamentals - Machine Learning II

Thu, 05 Jan 2023 13:12:01 GMT

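Deep Learning: Neural Network Fundamentals - Machine Learning II

1.3 Understanding data

How science and technology develop
- e.g. Tycho Brahe chose the wrong model, geocentrism, and so could not explain the data he had collected. Kepler adopted the heliocentric model and completed his first, second, and third laws
Machine learning
- The problems machine learning solves are far more complex, e.g. the wide variation in the '8' digit pattern and the 'button' pattern in [Figure 1-2]
- They cannot be expressed with a simple mathematical formula
- Machine learning is the process of finding a model that can explain the data, so a procedure that finds the model automatically is essential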

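1.3.1 The data generating process

An artificial example where the data generating process is fully known
- e.g., a game where x is the sum of two thrown dice and the player receives y=(x-7)²+1 points
  - This situation is described as 'knowing the data generating process completely'
  - Knowing x, y can be predicted exactly -> if actual throws give 𝕏={3,10,8,5}, then 𝕐={17,10,2,5}
- The probability P(x) of each x is known exactly
  - Because P(x) is known, new data can be generated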
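
The dice game is small enough to write out in full. The sketch below enumerates all 36 outcomes to get P(x) exactly, reproduces the 𝕏 -> 𝕐 mapping from the example, and then samples new data from the known process. The names `score`, `p_x`, and `generate` are made up for this illustration.

```python
import random
from collections import Counter
from fractions import Fraction
from itertools import product


def score(x):
    """Points awarded for a dice sum x: y = (x - 7)^2 + 1."""
    return (x - 7) ** 2 + 1


# Exact distribution of the sum of two fair dice, from all 36 outcomes.
counts = Counter(a + b for a, b in product(range(1, 7), repeat=2))
p_x = {x: Fraction(c, 36) for x, c in counts.items()}

# The observed throws from the notes and their exactly predicted targets.
X = [3, 10, 8, 5]
Y = [score(x) for x in X]


def generate(n, seed=0):
    """Sample n new (x, y) pairs from the fully known generating process."""
    rng = random.Random(seed)
    xs = rng.choices(list(p_x), weights=[float(p) for p in p_x.values()], k=n)
    return [(x, score(x)) for x in xs]


new_data = generate(5)
```

Here nothing has to be learned: both the mapping from x to y and the distribution P(x) are given, which is the situation real machine learning problems lack.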
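Real machine learning problems, unlike the example above
- The data generating process is unknown
- It can only be approximated through a hypothesis model from the given training set 𝕏, 𝕐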

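1.3.2 The importance of databases

Database quality
- Collecting sufficiently diverse data, in sufficient quantity, for the given application -> higher estimation accuracy
  - e.g., a model trained on a database of frontal faces only performs very poorly on tilted faces
Examining the target application environment closely and securing a database that fits it is very important
- Compare the trend between the amount of data and the performance of the learned model
Public databases
- Three representative machine learning databases: Iris, MNIST, ImageNet
- UCI repository (394 databases as of November 2017)
The Iris database was published by the statistician Fisher in 1936 and covers 50 flowers each of three iris species (setosa, versicolor, virginica) found on the Gaspé Peninsula on the east coast of Canada [Fisher1936]. For each of the 150 samples, the sepal length, sepal width, petal length, and petal width were measured and recorded. This gives a four-dimensional feature space, and the target value is one of 1, 2, 3, with each species encoded as a number.
The MNIST database is a handwritten digit database collected by the US National Institute of Standards and Technology (NIST), providing a training set of 60,000 digits and a test set of 10,000 digits. It can be downloaded for free from http://yann.lecun.com/exdb/mnist, which also lists the record of the recognition-rate competition going back to 1998. As of August 2017, the paper [Ciresan2012] holds the top spot with an error rate of 0.23%, meaning only 23 of the 10,000 test samples were misclassified.

The ImageNet database follows the word hierarchy of WordNet, which was built in the information retrieval field, and collects hundreds to thousands of images per class [Deng2009]. It holds 14,197,122 images across 21,841 classes. A subset of 1,000 classes has been used for ILSVRC, an image recognition competition held annually since 2010.
- Small amount of data -> related to the curse of dimensionality
  - MNIST: with 28*28 black-and-white bitmaps there are 2⁷⁸⁴ distinct possible samples, yet MNIST has only 60,000 samples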
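
To get a feel for how sparse 60,000 samples are in that space, the arithmetic can be checked directly with Python's arbitrary-precision integers. The variable names below are only illustrative.

```python
n_pixels = 28 * 28                 # 784 binary pixels per image
n_possible = 2 ** n_pixels         # number of distinct black-and-white bitmaps
n_samples = 60_000                 # size of the MNIST training set

n_digits = len(str(n_possible))    # how many decimal digits 2**784 has
coverage = n_samples / n_possible  # fraction of the space the samples could cover
```

`n_digits` is in the hundreds, so `coverage` is vanishingly small: the training set touches essentially none of the possible bitmaps.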

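1.3.3 Database size and machine learning performance

How can high performance be reached with such a small database?
- Within the vast space, real data occurs only in a very small subspace
- Data sparsity assumption: the probability that data like the above would occur is close to 0
- Manifold (many + fold) assumption, also called the manifold hypothesis
  - High-dimensional data is concentrated near an associated lower-dimensional manifold
  - It changes smoothly according to a regular rule, as shown below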

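1.3.4 Data visualization

A hyperspace of four or more dimensions cannot be visualized all at once
Various visualization techniques
- Draw several plots, each combining two of the features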

1.4 A simple machine learning example

Linear regression
- [Figure 1-4]: uses the straight-line model (hypothesis) of Eq. (1.2), so there are two parameters Θ=(𝑤,𝑏)^T

Objective function (or cost function)
- Eq. (1.8) is the objective function for linear regression
  - Eq. (1.8) is called the mean squared error (MSE)
  - 𝑓_Θ(𝐱_i) is the output of the prediction function, and y_i is the actual target it should match
  - 𝑓_Θ(𝐱_i) - y_i is the error, or loss
  - The optimal parameter values are unknown at first, so start from random values Θ₁=(𝑤₁,b₁)^T -> improve to Θ₂=(𝑤₂,b₂)^T -> improve to Θ₃=(𝑤₃,b₃)^T -> Θ₃ is the optimal solution Θ̂
  - 𝐽(Θ₁)>𝐽(Θ₂)>𝐽(Θ₃)
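
The "start somewhere, then keep improving Θ" loop can be sketched with gradient descent on the MSE objective. The notes do not say which update rule is used at this point, so gradient descent, the learning rate, the four data points, and the function names are all assumptions of this example.

```python
import numpy as np


def mse(w, b, x, y):
    """Objective J(Θ): mean squared error of the line y = w*x + b."""
    return float(np.mean((w * x + b - y) ** 2))


def gradient_step(w, b, x, y, lr):
    """One gradient-descent update of Θ = (w, b)."""
    err = w * x + b - y
    grad_w = 2.0 * np.mean(err * x)
    grad_b = 2.0 * np.mean(err)
    return w - lr * grad_w, b - lr * grad_b


# Hypothetical training set of four observed points.
x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([2.1, 3.9, 6.2, 7.8])

# The notes start from random values; a fixed start keeps the run reproducible.
w, b = 0.0, 0.0
history = [mse(w, b, x, y)]
for _ in range(2000):
    w, b = gradient_step(w, b, x, y, lr=0.02)
    history.append(mse(w, b, x, y))
```

`history` records J(Θ) after every update, so the chain J(Θ₁) > J(Θ₂) > J(Θ₃) from the notes corresponds to this list decreasing until it flattens out near the optimum.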

[230103-2] Deep Learning: Neural Network Basics - Machine Learning I

Wed, 04 Jan 2023 02:45:57 GMT

Deep Learning: Neural Network Basics - Machine Learning I

1.1.1 Definition of machine learning

What is artificial intelligence?

Technology that realizes intelligent human abilities such as learning, reasoning, perception, and natural language understanding in a machine

What is learning?

"A relatively lasting change in behavior or in its potential that results from experience, or the process of acquiring knowledge"

What is machine learning?

Early AI-era definition: "Programming computers to learn from experience should eventually eliminate the need for much of this detailed programming effort." [Samuel1959]
Modern definition: "A computer program is said to learn from experience E with respect to some class of tasks T and performance measure P, if its performance at tasks in T, as measured by P, improves with experience E." [Mitchell1997]

-> The act of finding the optimal program (algorithm)
- through experience E
- for a given task T
- improving performance P
"Programming computers to optimize a performance criterion using example data or past experience." [Alpaydin2010]
"Computational methods using experience to improve performance or to make accurate predictions." [Mohri2012]

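Machine learning compared with traditional programming

1.1.2 The great shift from knowledge-based approaches to machine learning

The birth of artificial intelligence == the birth of computing devices
- The computer's outstanding ability: it performs complex calculations better than people
  - e.g., 80932.4321575*0.152367512
  - e.g., differentiation and integration of complicated functions

The shift in AI leadership
- Knowledge-based -> machine learning -> deep learning (representation learning)
- A shift to a data-centric approach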

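1.1.3 Machine learning concepts

A simple machine learning example
- The horizontal axis is time and the vertical axis is the position of a moving object <- all data is expressed in quantified form (vectors)
- The four points are the observed data
Task: prediction
- Given an arbitrary time, where is the moving object?
- Prediction splits into regression and classification problems
  - In regression the target is a real number; in classification it is a class or category
Training set
- The horizontal axis is the feature and the vertical axis is the target
- The four observed points make up the training set
  - Training set: X = {x_1, x_2, ..., x_n}, Y = {y_1, y_2, ..., y_n}
How should the observed data be explained?
- Hypothesis: by eye the data looks like a straight line -> assume and choose a straight-line model
- Formula of the hypothesized straight-line model
  - Two parameters, w and b: y = wx + b
- Training in machine learning
  - The job of finding the optimal parameters that make the given prediction task as accurate as possible
  - Start from arbitrary parameter values and improve them until a quantitatively optimal performance is reached
- After training, perform inference
  - Used to predict the target for a new (unknown) feature
- The ultimate goal of machine learning
  - Minimize the error on new data not in the training set (new data = test set)
  - High performance on the test set is called generalization ability
  - Prerequisites for machine learning
    - There must be data to learn from
    - The data must follow some rule
    - The rule cannot be written down mathematically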

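1.2.1 One- and two-dimensional feature spaces

All data is expressed quantitatively and lives in a feature space
One-dimensional feature space
Two-dimensional feature space: x=(x1,x2)^T
- x=(weight, height)^T, y=slugging percentage
- x=(body temperature, headache)^T, y=whether the person has a cold
Examples of multi-dimensional feature spaces
Curse of dimensionality
- Practical problems that arise as the dimensionality increases
- Illustrated in one, two, and three dimensions
- e.g., if the pixels of a d=784 MNIST sample take only the values 0 and 1, the space has 2⁷⁸⁴ cells, and scattering a mere 60,000 samples over it gives a very sparse distribution
- The higher the dimensionality, the more data is needed (exponentially) to find a meaningful representation
The original feature space, which is not linearly separable
- A straight-line model is limited to 75% accuracy
The new feature space after the transformation of Eq. (1.6)
- After the space transformation, a straight-line model reaches 100% accuracy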
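
The 75% -> 100% jump can be reproduced on a small XOR-style dataset. Neither the figure's data nor Eq. (1.6) appears in these notes, so this sketch assumes four XOR points and swaps in a different, hypothetical transformation (appending the product x1*x2 as a third feature). The brute-force search over a coarse grid of linear rules is also just for illustration.

```python
import itertools

import numpy as np

# Assumed XOR-style data: the class is 1 when exactly one coordinate is 1.
X = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
y = np.array([0, 1, 1, 0])


def best_linear_accuracy(features, labels):
    """Best accuracy of any rule 'w . x + b > 0' on a coarse parameter grid."""
    grid = np.linspace(-2.0, 2.0, 9)  # -2.0, -1.5, ..., 2.0
    best = 0.0
    for params in itertools.product(grid, repeat=features.shape[1] + 1):
        *w, b = params
        pred = (features @ np.array(w) + b > 0).astype(int)
        best = max(best, float(np.mean(pred == labels)))
    return best


def transform(features):
    """Hypothetical feature map (not Eq. (1.6)): append the product x1*x2."""
    return np.column_stack([features, features[:, 0] * features[:, 1]])


acc_original = best_linear_accuracy(X, y)
acc_transformed = best_linear_accuracy(transform(X), y)
```

In the original space no straight line gets more than three of the four points right, while in the transformed space a rule such as x1 + x2 - 2*x1*x2 - 0.5 > 0 separates the classes completely.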

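Representation learning
- The task of finding a good feature space automatically
- Deep learning uses neural networks with many hidden layers to find hierarchical feature spaces
Deep learning
- A form of representation learning that learns optimal hierarchical features with neural networks that have many hidden layers
- Stages of artificial intelligence
  - Super AI: AI whose progress has accelerated until it surpasses the combined intellect of all humanity
  - Strong AI (= artificial general intelligence): the intelligence of a (hypothetical) machine that can successfully perform any intellectual task a human can
  - Weak AI: works only within the bounds of human instructions, so it is easy to predict and manage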

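[230103-1] Deep Learning: Neural Network Basics - Introduction to AI and Machine Learning

Tue, 03 Jan 2023 13:59:00 GMT

Deep Learning: Neural Network Basics - Introduction to AI and Machine Learning

Dictionary definition of artificial intelligence: technology that realizes human abilities such as learning, reasoning, perception, and natural language understanding as a computer program. Everyday examples of AI include speech recognition (Siri), recommender systems (eBay, Netflix), autonomous driving (Waymo), real-time object recognition (Face ID), robots (HUBO), and translation (papago).

The history of AI is shown below: machine learning flourished from the 1980s through the 2000s, and the period from the 2010s to the present can be seen as the golden age of AI thanks to the innovations of deep learning.

Classification of artificial intelligence

Machine learning and deep learning

- Machine Learning
  - Narrow sense: AI in which a computer learns by itself from large amounts of data and derives statistical results
  - Broad sense: a sub-concept of AI, but one that covers most AI studied with computers, e.g., SVM
- Deep Learning
  - Narrow sense: machine learning that solved the vanishing gradient problem of backpropagation, making it possible to train deep multi-layer networks
  - Broad sense: AI that learns faster and more efficiently by applying neural network algorithms built to resemble the human brain, e.g., CNN, RNN

Going forward, I plan to study deep learning for autonomous driving, with the goal of running it on the Xycar!!

Sources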

https://itwiki.kr/w/%EC%9D%B8%EA%B3%B5%EC%A7%80%EB%8A%A5

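[Programmers] - Next Number (C++)

Tue, 03 Jan 2023 13:09:08 GMT

📌 Problem description

Given an arithmetic or geometric sequence common as a parameter, complete the solution function so that it returns the number that comes after the last element.

📌 Constraints

2 < length of common < 1,000
-1,000 < each element of common < 2,000
The input is always an arithmetic or a geometric sequence.
The common ratio is never 0.

📌 Input/output examples

common	result
[1, 2, 3, 4]	5
[2, 4, 8]	16

📌 Explanation of the examples

Example #1

[1, 2, 3, 4] is an arithmetic sequence with common difference 1, so the next number is 5.

Example #2

[2, 4, 8] is a geometric sequence with common ratio 2, so the next number is 16.

📌 Solution

Because the constraints guarantee that common has more than two elements, I used the values of common[0], common[1], and common[2].

(common[1] - common[0] == common[2] - common[1]) // arithmetic sequence
(common[1]/common[0] == common[2]/common[1]) // geometric sequence
Watch out for division by zero!!

📌 Code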

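#include <vector>

// Returns the number that follows the last element of an arithmetic or geometric sequence.
int solution(std::vector<int> common) {
    // Equal consecutive differences: arithmetic sequence, so add the common difference.
    if ((common[1] - common[0]) == (common[2] - common[1])) {
        return common.back() + (common[1] - common[0]);
    }
    // Otherwise geometric. common[0] cannot be 0 here: an all-zero sequence
    // would already have matched the arithmetic branch above.
    else {
        return (common[1] / common[0]) * common.back();
    }
}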

1.5.2 Bias and variance