<?xml version="1.0" encoding="utf-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
    <channel>
        <title>joyjoy.log</title>
        <link>https://velog.io/</link>
        <description>ML &amp; iOS 공부하는 학생입니다</description>
        <lastBuildDate>Thu, 29 Sep 2022 12:34:23 GMT</lastBuildDate>
        <docs>https://validator.w3.org/feed/docs/rss2.html</docs>
        <generator>https://github.com/jpmonette/feed</generator>
        <image>
            <title>joyjoy.log</title>
            <url>https://velog.velcdn.com/images/joker_joy00/profile/0d459b87-7814-4bcf-9c89-5d7ea60f7885/image.jpeg</url>
            <link>https://velog.io/</link>
        </image>
        <copyright>Copyright (C) 2019. joyjoy.log. All rights reserved.</copyright>
        <atom:link href="https://v2.velog.io/rss/joker_joy00" rel="self" type="application/rss+xml"/>
        <item>
            <title><![CDATA[[Review] Differentiable Manifold Reconstruction for Point Cloud Denoising]]></title>
            <link>https://velog.io/@joker_joy00/Review-Differentiable-Manifold-Reconstruction-for-Point-Cloud-Denoising</link>
            <guid>https://velog.io/@joker_joy00/Review-Differentiable-Manifold-Reconstruction-for-Point-Cloud-Denoising</guid>
            <pubDate>Thu, 29 Sep 2022 12:34:23 GMT</pubDate>
            <description><![CDATA[<h1 id="differentiable-manifold-reconstruction-for-point-cloud-denoising">[Differentiable Manifold Reconstruction for Point Cloud Denoising]</h1>
<h2 id="quick-look">Quick Look</h2>
<p><strong>Authors &amp; Affiliation</strong>: [Shitong Luo, Wei Hu] [Wangxuan Institute of Computer Technology, Peking University]</p>
<p><strong>Link</strong> : <a href="https://arxiv.org/pdf/2007.13551.pdf">https://arxiv.org/pdf/2007.13551.pdf</a></p>
<p><strong>Comments:</strong>  Published at ACM MM 2020</p>
<p><strong>TLDR:</strong> Point Cloud denoising with differentiable manifold reconstruction</p>
<p><strong>Relevance</strong>: 4</p>
<h2 id="research-topic">Research Topic</h2>
<ul>
<li>Category (General) : Computer Vision</li>
<li>Category (Specific) : Point Cloud Denoising</li>
</ul>
<h2 id="paper-summary-what">Paper summary (What)</h2>
<p>[Summary of the paper - a few sentences with bullet points. What did they do?]</p>
<ul>
<li><p>We propose a differentiable manifold reconstruction paradigm for point cloud denoising, aiming to learn the underlying manifold of a noisy point cloud via an autoencoder-like framework.</p>
</li>
<li><p>manifold: when dimensionality is reduced, the distribution of the reduced data that still retains the representativeness of all the original information
  e.g., the Swiss roll<img src="https://velog.velcdn.com/images/joker_joy00/post/fa9b630a-becd-4ee2-a761-ed185bd4114f/image.png" alt="">
<img src="https://velog.velcdn.com/images/joker_joy00/post/ecc41613-0192-4072-a2d2-4ce06392f2c4/image.png" alt=""></p>
</li>
<li><p>We propose an adaptive differentiable pooling operator on point clouds, which samples points that are closer to the underlying surfaces and thus narrows down the latent space for reconstructing the underlying manifold (a rough sketch of this kind of score-based sampling follows below).</p>
<p> <img src="https://velog.velcdn.com/images/joker_joy00/post/3005d634-5ecc-401c-b02e-41b6dead0040/image.png" alt=""></p>
</li>
</ul>
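<p>To make the pooling idea above concrete, here is a minimal, hedged sketch of generic score-based downsampling in PyTorch. This is not the paper&#39;s exact operator; the network and sizes are made up for illustration. Multiplying the kept features by their scores, so the score branch still receives gradients after the hard top-k selection, is a common trick in this family of methods.</p>
<pre><code class="language-python">import torch

# Toy setup: n points with c-dimensional features; keep k of them.
n, c, k = 2048, 32, 512
points = torch.rand(n, 3)
feats = torch.rand(n, c)

# A small MLP scores how likely each point is to lie near the true surface.
score_mlp = torch.nn.Sequential(
    torch.nn.Linear(c, 16), torch.nn.ReLU(), torch.nn.Linear(16, 1))
scores = torch.sigmoid(score_mlp(feats)).squeeze(-1)  # (n,)

idx = scores.topk(k).indices          # keep the k highest-scoring points
sampled_points = points[idx]
# Multiplying by the scores keeps the sampling branch differentiable.
sampled_feats = feats[idx] * scores[idx].unsqueeze(-1)
print(sampled_points.shape, sampled_feats.shape)  # (512, 3) (512, 32)</code></pre>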
<ul>
<li>We infer the underlying manifold by transforming each sampled point along with the embedded feature of its neighborhood to a local surface centered around the point—a patch manifold.</li>
<li>We design an unsupervised training loss, so that our network can be trained in either an unsupervised or supervised fashion (a generic reconstruction-loss sketch follows below).</li>
</ul>
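<p>For intuition about what a point cloud reconstruction loss looks like, below is a minimal Chamfer-distance-style sketch. To be clear, this is a standard generic loss, not the paper&#39;s own unsupervised loss; the tensors are illustrative stand-ins.</p>
<pre><code class="language-python">import torch

def chamfer_distance(a, b):
    # symmetric nearest-neighbor distance between clouds a: (n, 3) and b: (m, 3)
    d = torch.cdist(a, b)  # (n, m) pairwise Euclidean distances
    return d.min(dim=1).values.mean() + d.min(dim=0).values.mean()

noisy = torch.rand(2048, 3)                        # stand-in for a noisy input cloud
denoised = noisy + 0.01 * torch.randn_like(noisy)  # stand-in for a network output
print(chamfer_distance(denoised, noisy).item())</code></pre>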
<h2 id="issues-addressed-by-the-paper-why">Issues addressed by the paper (Why)</h2>
<p>[What are the issues that the paper addresses? Describe the problem. Why did they write this paper?]</p>
<ol>
<li>Point cloud data are often contaminated by noise due to inherent limitations of scanning devices or matching ambiguities in reconstruction from images.</li>
<li>In general, existing methods infer the displacement of noisy points from the underlying surface and reconstruct the points; however, they are not designed to recover the surface explicitly, which may lead to suboptimal denoising results.</li>
</ol>
<h2 id="detailed-information-how">Detailed Information (How)</h2>
<h3 id="methodology">Methodology</h3>
<p>[How did they approach the problem. What methods did they use? ]</p>
<ol>
<li>Point cloud denoising is crucial to relevant 3D vision applications, yet it is challenging due to the irregular and unordered nature of point clouds.</li>
<li>To this end, inspired by the fact that a point cloud is typically a representation of some underlying surface or 2D manifold over a set of sampled points, we propose to explicitly learn the underlying manifold of a noisy point cloud for reconstruction, aiming to capture intrinsic structures in point clouds.</li>
</ol>
<h3 id="assumptions">Assumptions</h3>
<p>[What assumptions were made and are these assumptions valid?]</p>
<h3 id="prominent-formulas">Prominent Formulas</h3>
<p>[Can be empty]</p>
<h3 id="prominent-figures">Prominent Figures</h3>
<p><img src="https://velog.velcdn.com/images/joker_joy00/post/4e4197d1-fc46-4997-bcf3-f6d33b07f143/image.png" alt="">
<img src="https://velog.velcdn.com/images/joker_joy00/post/72bfba19-c03b-4295-86d3-ab074df7920b/image.png" alt="">
<img src="https://velog.velcdn.com/images/joker_joy00/post/2a3c9775-9ff3-45a2-8c61-1212194ce6f3/image.png" alt=""></p>
<h3 id="results">Results</h3>
<p>[Theoretical or empirical results (any main tables) ]</p>
<p><img src="https://velog.velcdn.com/images/joker_joy00/post/11e39d33-08dc-4817-8a07-308a30f0fc95/image.png" alt="">
<img src="https://velog.velcdn.com/images/joker_joy00/post/f2ba02c2-4063-4e60-914c-be79b829d5d5/image.png" alt="">
<img src="https://velog.velcdn.com/images/joker_joy00/post/1853d082-a9b8-47ac-a271-92aef95b078f/image.png" alt=""></p>
<h3 id="limitations">Limitations</h3>
<p>[Did the authors mention any limitations to their work? Do you see any limitations of their work?]</p>
<ol>
<li>normal estimation</li>
<li>reconstruction</li>
</ol>
<h3 id="confusing-aspects-of-the-paper">Confusing aspects of the paper</h3>
<p>[Is there anything that is confusing and could need better explanations or references?]</p>
<h2 id="conclusions">Conclusions</h2>
<h3 id="the-authors-conclusions">The author&#39;s conclusions</h3>
<p>[What are the authors conclusion? What do they claim about their results.]</p>
<ol>
<li>We propose a novel paradigm of learning the underlying manifold of a noisy point cloud from differentiably subsampled points.</li>
<li>By sampling on each patch manifold, we reconstruct a clean point cloud that captures the intrinsic structure.</li>
<li>Our network can be trained end-to-end in either a supervised or unsupervised fashion.</li>
<li>Extensive experiments demonstrate the superiority of our method compared to the state-of-the-art methods under both synthetic noise and real-world noise.</li>
</ol>
<h3 id="my-conclusion">My Conclusion</h3>
<p>[What do you think about the work presented in the article? Did the authors manage to achieve what they set out to achieve?]</p>
<h3 id="rating">Rating</h3>
<p>[Fine, Good, Great, Wow, Turing Award ]</p>
<p>Great</p>
<h2 id="possible-future-work--improvements">Possible future work / improvements</h2>
<p>[Can you think of ways to improve this paper or ideas for future work?]</p>
<h2 id="extra">Extra</h2>
<ul>
<li>Cited references to follow up on / related papers:</li>
<li>Source code/ blog/ twitter thread/ other links: <a href="https://github.com/luost26/DMRDenoise">https://github.com/luost26/DMRDenoise</a></li>
</ul>
<hr>
]]></description>
        </item>
        <item>
            <title><![CDATA[[Review] Hierarchical Aggregation for 3D Instance Segmentation]]></title>
            <link>https://velog.io/@joker_joy00/Review-Hierarchical-Aggregation-for-3D-Instance-Segmentation</link>
            <guid>https://velog.io/@joker_joy00/Review-Hierarchical-Aggregation-for-3D-Instance-Segmentation</guid>
            <pubDate>Thu, 29 Sep 2022 12:27:41 GMT</pubDate>
            <description><![CDATA[<h1 id="title">[Title]</h1>
<h2 id="quick-look">Quick Look</h2>
<p><strong>Authors &amp; Affiliation</strong>: </p>
<p>[Authors]: Shaoyu Chen, Jiemin Fang, Qian Zhang, Wenyu Liu, Xinggang Wang </p>
<p>[Affiliations]: School of EIC, Huazhong University of Science &amp; Technology,
Institute of AI, Huazhong University of Science &amp; Technology,  Horizon Robotics</p>
<p><strong>Link</strong> : <a href="https://arxiv.org/pdf/2108.02350.pdf">https://arxiv.org/pdf/2108.02350.pdf</a></p>
<p><strong>Comments:</strong>  Published at ICCV2021</p>
<p><strong>TLDR:</strong> 3D instance segmentation using hierarchical aggregation</p>
<p><strong>Relevance</strong>: 4</p>
<h2 id="research-topic">Research Topic</h2>
<ul>
<li>Category (General) : Computer Vision</li>
<li>Category (Specific) : 3D instance segmentation</li>
</ul>
<h2 id="paper-summary-what">Paper summary (What)</h2>
<p>[Summary of the paper - a few sentences with bullet points. What did they do?]</p>
<ul>
<li>a novel bottom-up framework with the hierarchical aggregation for instance segmentation on 3D point cloud</li>
<li>ranked 1st on the ScanNet v2 leaderboard and SOTA on S3DIS</li>
<li>highest efficiency among all existing methods (as of 2021); the average per-frame inference time on ScanNet v2 is only 410ms</li>
</ul>
<h2 id="issues-addressed-by-the-paper-why">Issues addressed by the paper (Why)</h2>
<p>[What are the issues that the paper addresses? Describe the problem. Why did they write this paper?]</p>
<p>In extending 2D instance segmentation to 3D scenes, most existing 3D methods adopt a totally different bottom-up pipeline, which generates instances through clustering.</p>
<p>Directly clustering a point cloud into multiple instances is difficult.</p>
<p>Because:</p>
<ol>
<li>a point cloud usually contains a large number of points</li>
<li>the number of instances in a point cloud has large variations for different 3D scenes</li>
<li>the size of instances varies significantly</li>
<li>each point has a very weak feature </li>
</ol>
<h2 id="detailed-information-how">Detailed Information (How)</h2>
<h3 id="methodology">Methodology</h3>
<ol>
<li>The point-wise prediction network extracts features from point clouds</li>
<li>Predicts point-wise semantic labels and center shift vectors</li>
<li>The point aggregation module forms preliminary instance predictions based on the point-wise prediction results (a hedged sketch of this clustering step follows this list)</li>
<li>The set aggregation module expands incomplete instances to cover missing parts</li>
<li>The intra-instance prediction network smooths instances to filter out outliers</li>
</ol>
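<p>As a hedged illustration of the point aggregation step only (not the actual HAIS implementation), its core can be mimicked by shifting every point toward its predicted instance center and then clustering the shifted coordinates; here DBSCAN stands in for the paper&#39;s own grouping, and the zero shifts are placeholders for real network output.</p>
<pre><code class="language-python">import numpy as np
from sklearn.cluster import DBSCAN

points = np.random.rand(10000, 3).astype(np.float32)  # xyz of the scene
center_shift = np.zeros_like(points)  # placeholder for predicted center-shift vectors

# Points of the same instance collapse toward a common center, so a simple
# density-based clustering of the shifted coordinates yields preliminary instances.
shifted = points + center_shift
labels = DBSCAN(eps=0.03, min_samples=10).fit_predict(shifted)  # -1 marks noise
print(np.unique(labels))</code></pre>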
<h3 id="assumptions">Assumptions</h3>
<p>[What assumptions were made and are these assumptions valid?]</p>
<h3 id="prominent-formulas">Prominent Formulas</h3>
<h3 id="prominent-figures">Prominent Figures</h3>
<p><img src="https://velog.velcdn.com/images/joker_joy00/post/0014ab16-8426-4e2a-b45d-c3cfaf07f1e0/image.png" alt=""></p>
<h3 id="results">Results</h3>
<p>[Theoretical or empirical results (any main tables) ]</p>
<p><img src="https://velog.velcdn.com/images/joker_joy00/post/4a90f508-2246-443a-9b47-c31e580ef19c/image.png" alt="">
<img src="https://velog.velcdn.com/images/joker_joy00/post/95fa99ea-b35b-4f28-ae6c-56b0e0ad1584/image.png" alt="">
<img src="https://velog.velcdn.com/images/joker_joy00/post/f5ee848c-4b70-4b59-a3f3-aa07400972d4/image.png" alt="">
<img src="https://velog.velcdn.com/images/joker_joy00/post/4f6092d5-f76a-46e6-8746-4346c3734179/image.png" alt=""></p>
<h3 id="limitations">Limitations</h3>
<p>[Did the authors mention any limitations to their work? Do you see any limitations of their work?]</p>
<ol>
<li>It was validated only on objects, not on the human body or face, which are more elaborate and detailed.</li>
<li>They throw away some raw data in the filtering process within intra-instance prediction.</li>
</ol>
<h3 id="confusing-aspects-of-the-paper">Confusing aspects of the paper</h3>
<p>[Is there anything that is confusing and could need better explanations or references?]</p>
<h2 id="conclusions">Conclusions</h2>
<h3 id="the-authors-conclusions">The author&#39;s conclusions</h3>
<p>HAIS is a concise bottom-up approach for 3D instance segmentation.</p>
<p>The effectiveness and generalization of the method are demonstrated by experiments on ScanNet v2 and S3DIS.</p>
<p>HAIS achieves much better inference speed than all existing methods, showing its practicality, especially for latency-sensitive applications.</p>
<p>→ HAIS is concise, effective, and fast.</p>
<h3 id="my-conclusion">My Conclusion</h3>
<p>[What do you think about the work presented in the article? Did the authors manage to achieve what they set out to achieve?]</p>
<p>I agree that extending 2D methods to 3D involves many difficulties, as the authors point out.</p>
<p>In this regard, this study&#39;s attempt to address the issue, and what it achieves, matter.</p>
<p>What I would want more, though, is for the 3D point data not to lose its raw information.</p>
<p>Losing it will cause trouble when the 3D points are rendered into a 3D model.</p>
<h3 id="rating">Rating</h3>
<p>Wow</p>
<h2 id="possible-future-work--improvements">Possible future work / improvements</h2>
<p>[Can you think of ways to improve this paper or ideas for future work?]</p>
<ol>
<li><p>Since a dataset for human body structure has already been built in the form of face parsing data, training instance segmentation on this data could yield results suited to our research.</p>
</li>
<li><p>To address the problem that losing raw data also loses the information needed for 3D reconstruction, one could devise a solution that obtains a 3D depth map using point-based MVS techniques.</p>
<p> (add link to Minseo&#39;s review)</p>
</li>
</ol>
<h2 id="extra">Extra</h2>
<ul>
<li>Cited references to follow up on / related papers: NSFC</li>
<li>Source code/ blog/ twitter thread/ other links: <a href="https://github.com/hustvl/HAIS">https://github.com/hustvl/HAIS</a></li>
</ul>
<hr>
]]></description>
        </item>
        <item>
            <title><![CDATA[Pytorch to CoreML]]></title>
            <link>https://velog.io/@joker_joy00/Pytorch-to-CoreML</link>
            <guid>https://velog.io/@joker_joy00/Pytorch-to-CoreML</guid>
            <pubDate>Wed, 28 Sep 2022 06:53:52 GMT</pubDate>
            <description><![CDATA[<h2 id="pytorch">Pytorch</h2>
<ul>
<li>An ML framework usable in Python</li>
</ul>
<h2 id="coreml">CoreML</h2>
<ul>
<li>An ML framework usable in Swift</li>
<li><a href="https://coremltools.readme.io/docs">https://coremltools.readme.io/docs</a></li>
</ul>
<h2 id="why">Why?</h2>
<ul>
<li>When building an ML-powered iOS app with Swift, you can create, train, and run inference with models through CoreML, a framework that can be imported in Swift.</li>
<li>To run a model trained with PyTorch on iOS, it must be converted to the CoreML framework (in the same vein as converting a TensorFlow model to PyTorch).</li>
</ul>
<h2 id="how">How?</h2>
<h3 id="in-python-code-tensorflow">In Python Code (TensorFlow)</h3>
<ul>
<li><a href="https://coremltools.readme.io/docs/unified-conversion-api">https://coremltools.readme.io/docs/unified-conversion-api</a></li>
</ul>
<ol>
<li>Load the model</li>
</ol>
<pre><code class="language-python">import torch
import torchvision

# Load a pre-trained version of MobileNetV2
torch_model = torchvision.models.mobilenet_v2(pretrained=True)
# Set the model in evaluation mode.
torch_model.eval()

# Trace the model with random data.
example_input = torch.rand(1, 3, 224, 224) 
traced_model = torch.jit.trace(torch_model, example_input)

# Here the sample input and sample output get set up.
out = traced_model(example_input)</code></pre>
<ol start="2">
<li>Convert the model using <code>convert()</code></li>
</ol>
<pre><code class="language-python"># Using image_input in the inputs parameter:
# Convert to Core ML using the Unified Conversion API.
model = ct.convert(
    traced_model,
    inputs=[ct.TensorType(shape=example_input.shape)]
 )</code></pre>
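<p>If the model will receive images from Vision on the device, it is often more convenient to declare the input as an image rather than a tensor. A hedged variant (the <code>scale</code> value is an assumption and must match the model&#39;s own preprocessing):</p>
<pre><code class="language-python"># Alternative: declare an image input so Vision can pass pixel buffers directly.
model = ct.convert(
    traced_model,
    inputs=[ct.ImageType(shape=example_input.shape, scale=1/255.0)]
)</code></pre>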
<ol start="3">
<li>Save the mlmodel</li>
</ol>
<pre><code class="language-python"># Save the converted model.
model.save(&quot;mobilenet.mlmodel&quot;)</code></pre>
<h3 id="in-swift-code-coreml">In Swift Code (CoreML)</h3>
<ul>
<li>The converted mlmodel can be loaded into the project as shown below</li>
</ul>
<p><img src="https://velog.velcdn.com/images/joker_joy00/post/40fc936a-ed27-4beb-9296-cd62b3f32e91/image.png" alt=""></p>
<ol>
<li>load mlmodel</li>
</ol>
<pre><code class="language-swift">guard let model = try? VNCoreMLModel(for: FaceParsing().model) else {
    fatalError(&quot;Loading CoreML Model Failed.&quot;)
}</code></pre>
<ol start="2">
<li>inference with handler</li>
</ol>
<pre><code class="language-swift">let handler : VNImageRequestHandler = VNImageRequestHandler(ciImage: inputImg as! CIImage)

do{
    try! handler.perform([request])
}catch{
    print(&quot;error&quot;)
}</code></pre>
<ol start="3">
<li>process images with the inference results</li>
</ol>
<pre><code class="language-swift">let request = VNCoreMLRequest(model: model) {
    request, error in
    guard let results = request.results as? [VNCoreMLFeatureValueObservation],
            let segmentationmap = results.first?.featureValue.multiArrayValue,
            let row = segmentationmap.shape[0] as? Int,
            let col = segmentationmap.shape[1] as? Int else {
        fatalError(&quot;Model failed to process images.&quot;)
    }

    self.model_results = results
    self.model_segmentationmap = segmentationmap
}</code></pre>
<p>The full code:</p>
<pre><code class="language-swift">guard let model = try? VNCoreMLModel(for: FaceParsing().model) else {
            fatalError(&quot;Loading CoreML Model Failed.&quot;)
        }

let request = VNCoreMLRequest(model: model) {
    request, error in
    guard let results = request.results as? [VNCoreMLFeatureValueObservation],
            let segmentationmap = results.first?.featureValue.multiArrayValue,
            let row = segmentationmap.shape[0] as? Int,
            let col = segmentationmap.shape[1] as? Int else {
        fatalError(&quot;Model failed to process images.&quot;)
    }

    self.model_results = results
    self.model_segmentationmap = segmentationmap
}

let handler : VNImageRequestHandler = VNImageRequestHandler(ciImage: inputImg as! CIImage)

do{
    try! handler.perform([request])
}catch{
    print(&quot;error&quot;)
}</code></pre>
<p>Ref) 
<a href="https://coremltools.readme.io/docs/what-are-coreml-tools">https://coremltools.readme.io/docs/what-are-coreml-tools</a></p>
]]></description>
        </item>
        <item>
            <title><![CDATA[TensorFlow to CoreML]]></title>
            <link>https://velog.io/@joker_joy00/TensorFlow-to-CoreML</link>
            <guid>https://velog.io/@joker_joy00/TensorFlow-to-CoreML</guid>
            <pubDate>Wed, 28 Sep 2022 06:49:55 GMT</pubDate>
            <description><![CDATA[<h2 id="tensorflow">TensorFlow</h2>
<ul>
<li>An ML framework usable in Python</li>
</ul>
<h2 id="coreml">CoreML</h2>
<ul>
<li>An ML framework usable in Swift</li>
<li><a href="https://coremltools.readme.io/docs">https://coremltools.readme.io/docs</a></li>
</ul>
<h2 id="why">Why?</h2>
<ul>
<li>When building an ML-powered iOS app with Swift, you can create, train, and run inference with models through CoreML, a framework that can be imported in Swift.</li>
<li>To run a model trained with the TensorFlow framework on iOS, it must be converted to the CoreML framework (in the same vein as converting TensorFlow to PyTorch).</li>
</ul>
<h2 id="how">How?</h2>
<h3 id="in-python-code-tensorflow">In Python Code (TensorFlow)</h3>
<ul>
<li><a href="https://coremltools.readme.io/docs/unified-conversion-api">https://coremltools.readme.io/docs/unified-conversion-api</a></li>
</ul>
<ol>
<li>Load the model</li>
</ol>
<pre><code class="language-python">import coremltools as ct

# Load TensorFlow model
import tensorflow as tf # Tf 2.2.0

tf_model = tf.keras.applications.Xception(weights=&quot;imagenet&quot;, 
                                          input_shape=(299, 299, 3))</code></pre>
<ol start="2">
<li>Convert the model using <code>convert()</code></li>
</ol>
<pre><code class="language-python"># Convert using the same API
model_from_tf = ct.convert(tf_model)</code></pre>
<ol start="3">
<li>Save the mlmodel</li>
</ol>
<pre><code class="language-python">model_from_tf.save(&quot;imagenet.mlmodel&quot;)</code></pre>
<ol start="4">
<li>A final version adding the various ways to load the model (meaning any of these variants works)</li>
</ol>
<pre><code class="language-python">import tensorflow as tf
import coremltools as ct

tf_keras_model = tf.keras.Sequential(
    [
        tf.keras.layers.Flatten(input_shape=(28, 28)),
        tf.keras.layers.Dense(128, activation=tf.nn.relu),
        tf.keras.layers.Dense(10, activation=tf.nn.softmax),
    ]
)

# Pass in `tf.keras.Model` to the Unified Conversion API
mlmodel = ct.convert(tf_keras_model)

# or save the keras model in SavedModel directory format and then convert
tf_keras_model.save(&#39;tf_keras_model&#39;)
mlmodel = ct.convert(&#39;tf_keras_model&#39;)

# or load the model from a SavedModel and then convert
tf_keras_model = tf.keras.models.load_model(&#39;tf_keras_model&#39;)
mlmodel = ct.convert(tf_keras_model)

# or save the keras model in HDF5 format and then convert
tf_keras_model.save(&#39;tf_keras_model.h5&#39;)
mlmodel = ct.convert(&#39;tf_keras_model.h5&#39;)

# save converted model
mlmodel.save(&quot;trainedmodel.mlmodel&quot;)</code></pre>
<h3 id="in-swift-code-coreml">In Swift Code (CoreML)</h3>
<ul>
<li>The converted mlmodel can be loaded into the project as shown below</li>
</ul>
<p><img src="https://velog.velcdn.com/images/joker_joy00/post/73a32ae7-32ea-4dbc-9d66-14dc07ab3d6b/image.png" alt=""></p>
<ol>
<li>load mlmodel</li>
</ol>
<pre><code class="language-swift">guard let model = try? VNCoreMLModel(for: FaceParsing().model) else {
    fatalError(&quot;Loading CoreML Model Failed.&quot;)
}</code></pre>
<ol start="2">
<li>inference with handler</li>
</ol>
<pre><code class="language-swift">let handler : VNImageRequestHandler = VNImageRequestHandler(ciImage: inputImg as! CIImage)

do{
    try! handler.perform([request])
}catch{
    print(&quot;error&quot;)
}</code></pre>
<ol start="3">
<li>process images with the inference results</li>
</ol>
<pre><code class="language-swift">let request = VNCoreMLRequest(model: model) {
    request, error in
    guard let results = request.results as? [VNCoreMLFeatureValueObservation],
            let segmentationmap = results.first?.featureValue.multiArrayValue,
            let row = segmentationmap.shape[0] as? Int,
            let col = segmentationmap.shape[1] as? Int else {
        fatalError(&quot;Model failed to process images.&quot;)
    }

    self.model_results = results
    self.model_segmentationmap = segmentationmap
}</code></pre>
<p>The full code:</p>
<pre><code class="language-swift">guard let model = try? VNCoreMLModel(for: FaceParsing().model) else {
            fatalError(&quot;Loading CoreML Model Failed.&quot;)
        }

let request = VNCoreMLRequest(model: model) {
    request, error in
    guard let results = request.results as? [VNCoreMLFeatureValueObservation],
            let segmentationmap = results.first?.featureValue.multiArrayValue,
            let row = segmentationmap.shape[0] as? Int,
            let col = segmentationmap.shape[1] as? Int else {
        fatalError(&quot;Model failed to process images.&quot;)
    }

    self.model_results = results
    self.model_segmentationmap = segmentationmap
}

let handler : VNImageRequestHandler = VNImageRequestHandler(ciImage: inputImg as! CIImage)

do{
    try! handler.perform([request])
}catch{
    print(&quot;error&quot;)
}</code></pre>
<p>Ref)
<a href="https://pilgwon.github.io/blog/2017/09/18/Smart-Gesture-Recognition-CoreML-TensorFlow.html">https://pilgwon.github.io/blog/2017/09/18/Smart-Gesture-Recognition-CoreML-TensorFlow.html</a>
<a href="https://medium.com/@JMangia/swift-loves-tensorflow-and-coreml-2a11da25d44">https://medium.com/@JMangia/swift-loves-tensorflow-and-coreml-2a11da25d44</a></p>
]]></description>
        </item>
        <item>
            <title><![CDATA[[CodeExecution] DMRDenoise Pre-trained model]]></title>
            <link>https://velog.io/@joker_joy00/CodeExecution-DMRDenoise-Pre-trained-model</link>
            <guid>https://velog.io/@joker_joy00/CodeExecution-DMRDenoise-Pre-trained-model</guid>
            <pubDate>Wed, 28 Sep 2022 06:35:53 GMT</pubDate>
            <description><![CDATA[<p><em>본 글은 <a href="https://github.com/luost26/DMRDenoise">https://github.com/luost26/DMRDenoise</a> 에  업로드 되어있는 DMRDenoise의 Pre-trained model을 직접 실행해보고 결과값을 검증해보는 과정이 기록되어 있는 글입니다.</em></p>
<p><em>잘못 이해한 개념이나 부적절하게 사용된 자료가 있다면 댓글로 알려주시면 감사하겠습니다!</em></p>
<hr>
<h1 id="mac에서-실행">Mac에서 실행</h1>
<p>M1에서 실행을 시도했지만, NVIDIA계열이 아닌 apple 자체 생산 그래픽카드를 사용하는 M1의 특성 상 실패했다.</p>
<p>모델에서는 NVCC (NVIDIA CUDA Compiler) 설치를 필요로 하기 때문이다.</p>
<p><img src="https://velog.velcdn.com/images/joker_joy00/post/58c2adbe-fd40-4e03-9348-e0439c6705e7/image.png" alt=""></p>
<blockquote>
<p>Graphics-related specs of the PC I own</p>
</blockquote>
<hr>
<h1 id="windows에서-실행">Windows에서 실행</h1>
<p>우선, 해당 PC의 그래픽 카드는 RTX 3090이다. </p>
<h2 id="1-virtual-environment-생성하기">1. Virtual Environment 생성하기</h2>
<p>DMRDenoise 라는 이름의 venv 생성.</p>
<p><code>conda create --name DMRDenoise python=3.6</code></p>
<p>🚨 Problem</p>
<ul>
<li>During environment creation, a problem occurred due to unstable SSL verification.</li>
</ul>
<blockquote>
<p><em>CondaHTTPError: HTTP 000 CONNECTION FAILED</em></p>
</blockquote>
<ul>
<li><p>Solution: copy the following files from <code>./Anaconda3/Library/bin</code></p>
<ul>
<li><code>libcrypto-1_1-x64.dll</code></li>
<li><code>libcrypto-1_1-x64.pdb</code></li>
<li><code>libssl-1_1-x64.dll</code></li>
<li><code>libssl-1_1-x64.pdb</code></li>
</ul>
<p>and paste them into <code>./Anaconda3/DLLs</code>; this resolves the issue.</p>
</li>
</ul>
<h2 id="2-virtual-environment-activate">2. Activate the Virtual Environment</h2>
<p>Activate DMRDenoise:</p>
<p><code>conda activate DMRDenoise</code></p>
<h2 id="3-packages-install">3. Packages install</h2>
<p>모델링에 필요한 패키지들 : pytorch, scikit-learn, cudatoolkit, torchvision, pytorch lightning, h5py</p>
<pre><code class="language-bash">conda install -y pytorch=1.5.1 torchvision=0.6.1 cudatoolkit=9.2 -c pytorch
conda install -y scikit-learn=0.23.1
conda install -y -c conda-forge h5py=2.10.0 pytorch-lightning=0.7.6</code></pre>
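<p>Before building the extension, a quick check from inside the environment confirms whether the installed toolkit and driver actually expose the GPU (a minimal sanity check of my own, not from the repo):</p>
<pre><code class="language-python">import torch

print(torch.__version__, torch.version.cuda)  # torch build / CUDA toolkit version
print(torch.cuda.is_available())              # True only if driver and toolkit match
if torch.cuda.is_available():
    print(torch.cuda.get_device_name(0))       # e.g. an RTX 3090
    print(torch.cuda.get_device_capability(0)) # (8, 6) for the RTX 3090</code></pre>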
<p>🚨 Two problems occurred.</p>
<ol>
<li>Although cudatoolkit was installed, it was not set in the PATH environment variable, so <code>setup.py</code> install failed</li>
</ol>
<blockquote>
<p><em>OSError: CUDA_HOME environment variable is not set. Please set it to your CUDA install root</em></p>
</blockquote>
<ul>
<li>solution<ol>
<li>After <code>conda deactivate</code>, try installing cudatoolkit on the host PC itself</li>
<li>The environment variable was still not set, so I set it manually through the Advanced System Settings (<code>./CUDA</code>)<ol>
<li>There also seems to be a way to set PATH from the terminal (to look into later)</li>
<li>I later learned that environment variables only take effect after restarting the PC. That could also have been a solution.</li>
</ol>
</li>
</ol>
</li>
</ul>
<ol start="2">
<li>The RTX 3090 is too new for the cudatoolkit 10.0 suggested on GitHub, so the device is not supported and <code>setup.py</code> install fails</li>
</ol>
<blockquote>
<p><em>ValueError: Unknown CUDA arch (8.6) or GPU not supported</em></p>
</blockquote>
<ul>
<li><p>solution</p>
<ol>
<li><p>I retried the package installation with <code>cudatoolkit=11.0</code>.</p>
</li>
<li><p>This caused a conflict: <code>cudatoolkit=11.0</code> was too new relative to the other packages.</p>
</li>
<li><p>I then reinstalled all the packages with the other packages&#39; versions left at their defaults.</p>
<ol>
<li><p>By &quot;default&quot; I mean typing just <code>pytorch</code> rather than <code>pytorch=1.5.1</code>.</p>
</li>
<li><p>This resulted in <code>pytorch=1.7.1</code>, <code>pytorch-lightning=1.4.5</code>, <code>h5py=2.10.0</code>, <code>scikit-learn=0.23.1</code>, <code>torchvision=0.8.2</code>, and <code>cudatoolkit=11.0</code>.</p>
<pre><code class="language-bash">conda install -y pytorch torchvision cudatoolkit=11.4 -c pytorch
conda install -y scikit-learn=0.23.1
conda install -y -c conda-forge h5py pytorch-lightning</code></pre>
</li>
</ol>
</li>
<li><p>Proceeding with the package versions matched to <code>cudatoolkit=11.0</code>, <code>setup.py</code> install succeeded.</p>
</li>
</ol>
</li>
</ul>
<h2 id="4-point-cloud-data-denoising하기">4. Point cloud data denoising하기</h2>
<ol>
<li>I attempted denoising with the supervised model among the pre-trained models provided on GitHub.</li>
<li>Before feeding in the data I had obtained from a LiDAR sensor, I first needed to run the test data provided on GitHub to see what file format goes in and what comes out.</li>
<li>I first opened the file <code>airplane_0016.obj.xyz</code> in MeshLab to see what kind of data it is.
<img src="https://velog.velcdn.com/images/joker_joy00/post/5ef35c7a-1338-417f-b0d0-cf6f26e2d04a/image.png" alt="">
<img src="https://velog.velcdn.com/images/joker_joy00/post/5e0c00f8-7f42-41a9-a2d3-ddc2815ddb0c/image.png" alt=""></li>
</ol>
<blockquote>
<p>Since this is an .xyz file without normal vectors, screened Poisson reconstruction using normals was not possible; Ball Pivoting was done instead.</p></blockquote>
<ol start="4">
<li><p>해당 파일을 command에서 input으로 입력했다.</p>
<p>   python denoise.py --input ../dataset_test/input_full_test_50k_0.010/airplane_0016.obj.xyz --output ./output_airplane.obj.xyz --ckpt ./pretrained/supervised/epoch=153.ckpt</p>
</li>
</ol>
<p>🚨 Problem</p>
<ul>
<li>It seems <code>pytorch-lightning=1.4.5</code>, installed earlier to match <code>cudatoolkit=11.0</code>, is the culprit.</li>
</ul>
<blockquote>
<p><em>TypeError: __init__() missing 1 required positional argument: &#39;hparams&#39;</em></p>
</blockquote><ul>
<li><p>solution</p>
<ol>
<li><p>From what I found, the problem occurs because saving.py (I don&#39;t remember exactly) in the pytorch-lightning package does not load the hyperparameters from the checkpoint after the model has been trained.</p>
</li>
<li><p>However, the GitHub repo recommends <code>pytorch-lightning=0.7.6</code>, and nobody in the issues reported the same problem, so my guess is that by unavoidably installing <code>pytorch-lightning=1.4.5</code> because of the GPU, I ran the model with a package version that requires <code>hparams</code>.</p>
</li>
<li><p>Judging that an answer would be very hard to find, I posted this problem as an issue on GitHub, so this route is on hold until the paper&#39;s author replies. (<a href="https://github.com/luost26/DMRDenoise/issues/13">https://github.com/luost26/DMRDenoise/issues/13</a>)</p>
<ol>
<li>To get around the GPU version problem, I am also considering Gradient, which acts as an online virtual machine. ( <a href="https://gradient.paperspace.com/">https://gradient.paperspace.com/</a> )</li>
</ol>
</li>
<li><p>No reply came, but while searching around again to resolve the error, I found a solution on the official PyTorch site. (<a href="https://pytorch.org/get-started/locally/">https://pytorch.org/get-started/locally/</a>)</p>
<p> <code>conda install pytorch torchvision torchaudio cudatoolkit=11.1 -c pytorch -c conda-forge</code></p>
</li>
<li><p>Apparently <code>conda-forge</code> must be added in order to install <code>cudatoolkit=11.1</code>.</p>
</li>
<li><p>Installing with this command, <code>pytorch-lightning=0.7.6</code> also went in without any problems.</p>
</li>
</ol>
</li>
</ul>
<ol start="4">
<li><p>해당 모델 추론 결과는 다음과 같이 나왔다.
<img src="https://velog.velcdn.com/images/joker_joy00/post/a9e2cc6f-f344-411a-bfc7-1bac92faf820/image.png" alt="">
<img src="https://velog.velcdn.com/images/joker_joy00/post/6e9a3179-4cc4-48e5-ba90-9bc8c4db12cc/image.png" alt=""></p>
<blockquote>
<p>Ball Pivoting 결과</p>
</blockquote>
</li>
<li><p>50k에서 67k로 늘어난 것을 확인 할 수 있었고, 이는 reconstructed manifold에서 resampling과정을 거쳤기 때문이라고 보고있다. 80k data를 입력했을 때도 60k 대 points가 되는 것으로 보아 points가 늘어날 지 감소할 지 예측할 수 없다.</p>
</li>
</ol>
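<p>The point counts above can be verified directly, since an .xyz file is just whitespace-separated coordinates; a small hedged check (the file name is the output from step 4):</p>
<pre><code class="language-python">import numpy as np

# Each row of the .xyz file is one point: x y z (no normals here).
pts = np.loadtxt(&quot;output_airplane.obj.xyz&quot;)
print(pts.shape[0], &quot;points&quot;)</code></pre>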
<h2 id="5-problems">5. Problems</h2>
<ol>
<li>normal estimation<ol>
<li>This model does not simply remove noise points from the original point cloud set; it removes and regenerates points so as to best express the surface, so the original normal values cannot be matched up afterwards.</li>
<li>Even if normal estimation is performed, it has to proceed with SLAM-related parameters such as the viewpoint already lost, so the resulting normal vectors are inaccurate.</li>
</ol>
</li>
<li>reconstruction<ol>
<li>A problem derived from normal estimation: because the normals are inaccurate, Poisson reconstruction produces inaccurately shaped output like the results below.</li>
<li>The silver lining is that the reconstruction results before and after denoising differ noticeably: in the after result the shape is odd, but normal estimation proceeded symmetrically over both wings and the tail, so the reconstruction came out symmetric as well.</li>
</ol>
</li>
</ol>
<p><img src="https://velog.velcdn.com/images/joker_joy00/post/61350cf5-71d9-4337-beb7-25d0a0f11265/image.png" alt=""></p>
<blockquote>
<p>noisy airplane</p>
</blockquote>
<p><img src="https://velog.velcdn.com/images/joker_joy00/post/558b9db6-f5e5-4a2b-8985-e8c6118f6d57/image.png" alt=""></p>
<blockquote>
<p>denoised airplane</p>
</blockquote>
<h2 id="6-tasks">6. Tasks</h2>
<ol>
<li>Think about how to preserve the normal vectors while losing as little of the original information as possible</li>
<li>Port the model to Swift so it can run on iOS devices via CoreML.</li>
</ol>
]]></description>
        </item>
        <item>
            <title><![CDATA[[Review] PointNet: Deep Learning on Point Sets for 3D Classification and Segmentation]]></title>
            <link>https://velog.io/@joker_joy00/Review-PointNet-Deep-Learning-on-Point-Sets-for-3D-Classification-and-Segmentation</link>
            <guid>https://velog.io/@joker_joy00/Review-PointNet-Deep-Learning-on-Point-Sets-for-3D-Classification-and-Segmentation</guid>
            <pubDate>Wed, 28 Sep 2022 06:15:23 GMT</pubDate>
            <description><![CDATA[<p><em>해당 글은 1년 전에 읽었던 &lt;PointNet: Deep Learning on Point Sets for 3D Classification and Segmentation&gt; 이라는 제목의 논문을 읽고 서툴게 정리해놓은 리뷰글입니다.</em></p>
<p><em>아직 수정하고 보완해야할 부분이 많고 미완성의 글이지만, 기록의 목적으로 포스팅했습니다!</em></p>
<p><em>혹시 제가 잘못 이해한 개념이 있거나, 포스팅 과정에서 문제가 되는 자료가 쓰였다면 댓글로 알려주시면 감사하겠습니다!</em></p>
<hr>
<ul>
<li>Point Cloud:<ul>
<li>an essential element for understanding geometric data structures</li>
<li>a representation that expresses the continuous surface of a 3D object as points, yielding volumetric image data</li>
</ul>
</li>
</ul>
<h2 id="abstract">Abstract</h2>
<ul>
<li>Before PointNet, most research on 3D data required rendering, a process of regularizing irregular values, in order to feed the data into a deep net architecture.</li>
<li>PointNet includes a step that extracts only the important information from the point cloud and expresses it as 3D coordinates (x, y, z), so no separate rendering is needed to feed the network.</li>
</ul>
<p><img src="https://velog.velcdn.com/images/joker_joy00/post/676521d5-6c53-431d-8ee5-7bdfcfca0103/image.png" alt=""></p>
<blockquote>
<ol>
<li>Mesh, represented with polygons</li>
<li>Volumetric, represented with voxels (a voxel represents a unit region as a cube; think Minecraft)</li>
<li>Point cloud (no separate rendering needed)</li>
</ol>
</blockquote>
<h2 id="1-introduction">1. Introduction</h2>
<ul>
<li>As mentioned in the abstract, this work devises a way to avoid rendering when feeding 3D data into deep learning.</li>
<li>PointNet&#39;s key capability: processing the unordered point data so that it can be fed straight into a neural net without separate rendering.</li>
<li>The four key points of the PointNet paper<ol>
<li>Design a novel deep net architecture suitable for consuming unordered point sets in 3D.</li>
<li>Show how well such a net can be trained to perform 3D shape classification, shape part segmentation, and scene semantic parsing tasks.</li>
<li>Provide empirical and theoretical analysis of the stability and efficiency of the method.</li>
<li>Illustrate the 3D features computed by selected neurons in the net and develop intuitive explanations for its performance.</li>
</ol>
</li>
</ul>
<h2 id="2-related-work">2. Related Work</h2>
<p>(I plan to write up the topics below when I next get the chance)</p>
<ul>
<li>Point Cloud Features</li>
<li>Deep Learning on 3D Data</li>
<li>Deep Learning on Unordered Sets</li>
</ul>
<h2 id="3-problem-statement">3. Problem Statement</h2>
<p>$\{ P_i \mid i = 1, \dots, n \}$</p>
<ul>
<li>Point $P_i$: a vector of its (x, y, z) coordinates</li>
<li>extra features can be appended after xyz, e.g., (x, y, z, α, ...)</li>
<li>For object classification input<ul>
<li>the input point cloud is either sampled directly from a shape or pre-segmented from a scene point cloud.</li>
</ul>
</li>
<li>For semantic segmentation input<ul>
<li>a single image can be the input for multi-region segmentation, or a sub-volume from a 3D scene can be the input for object region segmentation.</li>
</ul>
</li>
<li>output: n × m scores (n points, m semantic sub-categories)</li>
</ul>
<h2 id="4-deep-learning-on-point-sets">4. Deep Learning on Point Sets</h2>
<h3 id="41-properties-of-point-sets-in-rn-real-number">4.1. Properties of Point Sets in R^n (real number)</h3>
<ul>
<li><p>The reason many studies so far performed rendering before handling 3D data is that point sets have the following properties.</p>
<ol>
<li><p>Unordered: unlike pixels or voxels, points have no particular order, so there are n! possible orders in which to feed n points (x, y, z) to the net; in deep learning, however, the neural network&#39;s output must be the same for every ordering. [permutation invariant]</p>
<p> <strong>→ PointNet&#39;s solution: apply max pooling, a kind of symmetric function.</strong></p>
</li>
<li><p>Interaction among points: points are data obtained from a space with a distance metric. Each point has neighbors, so meaningful subsets can be formed, and a random input that ignores this structure cannot yield comparatively good results.</p>
<p> <strong>→ PointNet&#39;s solution: PointNet can capture local structures and the combinatorial interactions among those structures.</strong></p>
</li>
<li><p>Transformation: rotating or translating the points must not change the global point cloud category or the point segmentation. [rigid motion invariant]</p>
<p> <strong>→ PointNet&#39;s solution: apply an STN (Spatial Transformer Network) so that transforming the points does not change what the points represent.</strong></p>
<ul>
<li>STN: normalizes every input image to the same orthogonal view; this model uses a T-net for it.</li>
</ul>
</li>
</ol>
</li>
</ul>
<h3 id="42-pointnet-architecture">4.2. PointNet Architecture</h3>
<p><img src="https://velog.velcdn.com/images/joker_joy00/post/7f7a2890-0f60-4a0c-9d72-d92e3840485b/image.png" alt=""></p>
<p>PointNet Architecture</p>
<ul>
<li><p>Classification network</p>
<ol>
<li>input: n points</li>
<li>applies the input transformation</li>
<li>applies the feature transformation</li>
<li>aggregates point features by max pooling</li>
<li>output: classification scores for k classes</li>
</ol>
</li>
<li><p>Segmentation network</p>
<ol>
<li>an extension of the classification net</li>
<li>concatenates global (classification) and local features</li>
<li>output: per-point scores</li>
</ol>
</li>
<li><p>Symmetry Function for Unordered Input</p>
<ul>
<li>symmetric function: a function whose output stays the same even when the order of its arguments changes</li>
</ul>
<ol>
<li>Sort the input into a canonical order.</li>
<li>Treat the input as a sequence for an RNN, but then the training data grows with all n! possible permutations.</li>
<li>Use a simple symmetric function to aggregate the information from each point.</li>
</ol>
<ul>
<li><p>an issue with sorting (notes not yet organized):</p>
<ul>
<li><p><del>While sorting looks like a simple solution, in a high-dimensional space there is no ordering that is stable against point perturbations. That is, sorting (by a symmetric function) does not fully resolve the ordering issue, and as long as the ordering issue persists, the network cannot learn a consistent mapping.</del></p>
</li>
<li><p><del>The remedy for this is to extract a feature for each point with an MLP and then feed these into the symmetric function.</del></p>
<ul>
<li><del>This gave slightly better results than feeding the points in directly without the MLP. (Question: I am not sure whether &quot;better&quot; means the ordering issue was nearly resolved, or simply that performance improved.)</del></li>
</ul>
</li>
</ul>
</li>
</ul>
</li>
</ul>
<p><img src="https://velog.velcdn.com/images/joker_joy00/post/bdd4f2cd-8e6a-4bc3-b8fc-a9125a07b37e/image.png" alt=""></p>
<blockquote>
<p>definition of the symmetric function</p>
</blockquote>
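<p>A tiny experiment makes the definition above concrete: a shared per-point MLP followed by max pooling produces the same global feature for any ordering of the points (a minimal sketch of the idea, not the full PointNet; the layer sizes are arbitrary):</p>
<pre><code class="language-python">import torch

n, d = 1024, 3
h = torch.nn.Sequential(  # shared per-point feature extractor
    torch.nn.Linear(d, 64), torch.nn.ReLU(), torch.nn.Linear(64, 256))

points = torch.rand(n, d)
perm = torch.randperm(n)

f = h(points).max(dim=0).values        # g = max pooling over the point axis
f_perm = h(points[perm]).max(dim=0).values

print(torch.allclose(f, f_perm))       # True: the output ignores point order</code></pre>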
<h2 id="43-이후부터는-추후-정리-완료-예정">4.3. 이후부터는 추후 정리 완료 예정</h2>
<h2 id="ref">Ref)</h2>
<p><a href="https://pytorchhair.gitbook.io/project/introduction/semantic-segmentation">https://pytorchhair.gitbook.io/project/introduction/semantic-segmentation</a>
<a href="https://m.blog.naver.com/PostView.naver?isHttpsRedirect=true&amp;blogId=sw4r&amp;logNo=221490603530">https://m.blog.naver.com/PostView.naver?isHttpsRedirect=true&amp;blogId=sw4r&amp;logNo=221490603530</a>
<a href="https://m.blog.naver.com/lsj_dct96/221702796212">https://m.blog.naver.com/lsj_dct96/221702796212</a></p>
]]></description>
        </item>
    </channel>
</rss>