08:30 - 09:00 | Reception |
09:00 - 09:05 | Welcoming speech from Prof. Chu Duc Trinh, Vice Rector of the VNU University of Engineering and Technology |
09:05 - 09:10 | Introduction about APSIPA, Prof. Kosin Chamnongthai (King Mongkut's University of Technology Thonburi) |
09:10 - 09:50 | Keynote speaker: Prof. Nam-Ik Cho (Seoul National University) Talk Title: Deep Learning Methods for Image Denoising/Restoration |
09:50 - 10:05 | Invited speaker 1: Prof. Kosin Chamnongthai (King Mongkut's University of Technology Thonburi) Talk Title: A Method of Eye-Gaze-Based Human-Intention Detection |
10:05 - 10:20 | Invited speaker 2: Prof. Toshihisa Tanaka (Tokyo University of Agriculture and Technology) Talk Title: Automated Diagnostic Aid for Epilepsy with a Cloud-based Platform of Multi-facility Databases |
10:20 - 11:20 | Poster session and coffee break |
11:20 - 11:35 | Invited speaker 3: Prof. Nipon Theera-Umpon (Chiang Mai University) Talk Title: A.I. and Digital Technology in Biomedicine and Healthcare |
11:35 - 11:50 | Invited speaker 4: Prof. Woon-Seng Gan (Nanyang Technological University) Talk Title: Augmented/Mixed Reality Audio for Hearables: Sensing, Control and Rendering |
11:50 - 12:05 | Invited speaker 5: Prof. Darenee Hormdee (Khon Kaen University) Talk Title: Vaginal Hysterectomy Robotic System |
12:05 - 14:00 | Lunch |
14:00 - 16:00 | Committee selects papers for presentation at the 2022 APSIPA-ASC |
16:00 - 16:30 | Coffee break |
16:30 - 18:00 | Committee selects grants for author registration waivers and travel support |
18:00 - 20:00 | Dinner and announcement of the papers selected for presentation at the 2022 APSIPA-ASC and of the selected grants |
Authors: Anh-Tu Nguyen, Thao Nguyen, Huy-Khiem Le, Huy-Hieu Pham, and Cuong Do
Title: A Novel Deep Learning-Based Approach for Sleep Apnea Detection Using Single-Lead ECG Signals Abstract: Sleep apnea (SA) is a type of sleep disorder characterized by snoring and chronic sleeplessness, which can lead to serious conditions such as high blood pressure, heart failure, and cardiomyopathy (enlargement of the muscle tissue of the heart). The electrocardiogram (ECG) plays a critical role in identifying SA since it might reveal abnormal cardiac activity. Recent research on ECG-based SA detection has focused on feature engineering techniques that extract specific characteristics from multiple-lead ECG signals and use them as classification model inputs. In this study, a novel feature extraction method based on the detection of S peaks is proposed to enhance the detection of adjacent SA segments using a single-lead ECG. In particular, ECG features collected from a single lead (V2) are used to identify SA episodes. A CNN model is trained on the extracted features to detect SA. Experimental results demonstrate that the proposed method detects SA from single-lead ECG data more accurately than existing state-of-the-art methods, with 91.13% classification accuracy, 92.58% sensitivity, and 88.75% specificity. Moreover, the additional use of features associated with the S peaks improves the classification accuracy by 0.85%. Our findings indicate that the proposed machine learning system has the potential to be an effective method for detecting SA episodes. |
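For readers unfamiliar with the three figures quoted in the abstract above, the following is a minimal sketch of how accuracy, sensitivity, and specificity are conventionally computed from binary predictions. It uses the standard textbook definitions; the function names and the toy labels are hypothetical and are not taken from the paper.

```python
def count_outcomes(y_true, y_pred):
    """Count confusion-matrix entries for binary labels (1 = apnea, 0 = normal)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn


def confusion_metrics(tp, tn, fp, fn):
    """Return (accuracy, sensitivity, specificity) from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    sensitivity = tp / (tp + fn)  # fraction of true apnea segments that were detected
    specificity = tn / (tn + fp)  # fraction of normal segments correctly left unflagged
    return accuracy, sensitivity, specificity


# Toy example with made-up labels (not data from the paper).
y_true = [1, 1, 0, 0, 1]
y_pred = [1, 0, 0, 1, 1]
accuracy, sensitivity, specificity = confusion_metrics(*count_outcomes(y_true, y_pred))
```

Sensitivity and specificity are reported alongside accuracy because accuracy alone can look high on imbalanced data even when most positive segments are missed.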
Authors: Dat T. Ngo, Hieu H. Pham, Hieu T. Nguyen, Dung B. Nguyen, and Ha Q. Nguyen
Title: Slice-Level Detection of Intracranial Hemorrhage on CT Using Deep Descriptors of Adjacent Slices Abstract: Training deep neural networks on high-resolution 3D volumes of Computed Tomography (CT) scans for diagnostic tasks poses formidable computational challenges. This raises the need for a deep learning-based approach that is robust in learning representations from 2D images. In this paper, we propose a new strategy to train slice-level classifiers on CT scans based on the descriptors of the adjacent slices along the axis, each of which is extracted through a convolutional neural network (CNN). This method is applicable to CT datasets with per-slice labels such as the RSNA Intracranial Hemorrhage (ICH) dataset, whose task is to predict the presence of ICH and classify it into 5 different sub-types. We obtain a single model in the top 4% of best-performing solutions of the RSNA ICH challenge, where model ensembles are allowed. Experiments also show that the proposed method significantly outperforms the baseline model on CQ500. |
Authors: Le Quoc Anh, Luu Manh Ha, Theo van Walsum, Adriaan Moelker, Dao Viet Hang, Pham Cam Phuong, and Vu Duy Thanh
Title: Needle Localization and Segmentation for Radiofrequency Ablation of Liver Tumors under CT image Guidance Abstract: Radiofrequency ablation (RFA) of liver cancer under computed tomography (CT) guidance is a minimally invasive procedure in which CT images are utilized to guide the physician in introducing the needle into the target lesion. However, adequate visualization of the needle and anatomy is hampered by the 2D slice-based view used in current clinical practice. Thus, due to the lack of 3D information, the physician needs considerable experience and more interaction with the guidance systems to envision the needle's position in the liver, which is inconvenient in clinical practice. This study proposes a method for robust needle segmentation using CT images to improve the visualization of the needle during the intervention. The method utilizes a convolutional neural network (CNN) to detect the needle in orthogonal 2D projections of the CT image to construct the needle volume of interest (VOI). Subsequently, a patch-based 3D CNN is applied to segment the needle. We evaluate the method's accuracy using Dice score (DSC), Hausdorff distance (HD), the needle shaft error Eshaft, and needle tip error Etip. The results show that the proposed method achieves a mean DSC, HD, Etip, Eshaft, and processing time of 0.89, 3.3 mm, 0.9 mm, 0.43 mm, and 2.6 seconds, respectively. We conclude that the proposed method is feasible for improving needle visualization in the interventional room. |
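As a brief illustration of the Dice score (DSC) named in the abstract above, the sketch below computes the standard definition DSC = 2|A ∩ B| / (|A| + |B|) for two binary masks. The flat-list mask representation, the function name, and the convention for two empty masks are assumptions made for this example; the paper's own evaluation code is not described in the abstract.

```python
def dice_score(pred, truth):
    """Dice similarity coefficient between two binary masks.

    Masks are flat sequences of 0/1 values of equal length
    (for example, a flattened 3D segmentation volume).
    """
    if len(pred) != len(truth):
        raise ValueError("masks must have the same number of voxels")
    intersection = sum(1 for p, t in zip(pred, truth) if p and t)
    size_sum = sum(1 for p in pred if p) + sum(1 for t in truth if t)
    if size_sum == 0:
        return 1.0  # assumed convention: two empty masks count as a perfect match
    return 2.0 * intersection / size_sum


# Toy masks (made up): one overlapping voxel, two foreground voxels in each mask.
example = dice_score([1, 1, 0, 0], [1, 0, 1, 0])
```

A score of 1 means the predicted and reference masks coincide exactly, and a score of 0 means they share no foreground voxels.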
Authors: Huy Le and Thanh-Ha Do
Title: Automated Classification of Lung Injury from X-ray Images using Deep Learning Network Abstract: This paper presents a new approach for supporting the diagnosis of lung injury from chest X-ray images. Specifically, this paper proposes using the DarkCovidNet model for classifying the damage as Covid-19 or other causes. Since the approach also takes the dataset features into account when training classification models, data augmentation using an interpolation algorithm was also studied and used to enrich the dataset. Experimental results on different databases show that the proposed method helps improve the accuracy of lung injury classification models. |
Authors: Tuan-Cuong Vuong, Hung Tran, Mai Xuan Trang, Vu-Duc Ngo, and Thien Van Luong
Title: A Comparison of Feature Selection and Feature Extraction in Network Intrusion Detection Systems Abstract: Internet of Things (IoT) has been playing an important role in many sectors such as smart cities, smart agriculture, smart healthcare, and smart manufacturing. However, IoT devices are vulnerable to cyber-attacks, which may result in security breaches and data leakages. To effectively prevent these attacks, a variety of machine learning-based network intrusion detection methods for IoT networks have been developed, which often rely on either feature extraction or feature selection techniques to reduce the dimension of the input data before it is fed to machine learning models. This aims to make the detection complexity low enough for real-time operations, which is particularly vital in intrusion detection systems. This paper provides a comprehensive comparison between these two methods in terms of various performance metrics, namely, precision rate, recall rate, detection accuracy, and runtime complexity, using the modern UNSW-NB15 dataset. We note that such a comparison between feature selection and feature extraction methods has been overlooked in the literature. Furthermore, based on this comparison, we provide a useful guideline on selecting a suitable intrusion detection type for each specific scenario. |
Authors: Dang-Y Hoang, Tien-Hoa Nguyen, Vu-Duc Ngo, Trung Tan Nguyen, Nguyen Cong Luong, and Thien Van Luong
Title: Deep Learning-Based Signal Detection for Dual-Mode Index Modulation 3D-OFDM Abstract: In this paper, we propose a deep learning-based signal detector called DuaIM-3DNet for dual-mode index modulation-based three-dimensional (3D) orthogonal frequency division multiplexing (DM-IM-3D-OFDM). Herein, DM-IM-3D-OFDM is a subcarrier index modulation scheme which conveys data bits via both dual-mode 3D constellation symbols and indices of active subcarriers. Thus, this scheme obtains better error performance than the existing IM schemes using the conventional maximum likelihood (ML) detector, which suffers from high computational complexity, especially when the system parameters increase. In order to address this fundamental issue, we propose the use of a deep neural network (DNN) at the receiver to jointly and reliably detect both symbols and index bits of DM-IM-3D-OFDM under Rayleigh fading channels in a data-driven manner. Simulation results demonstrate that our proposed DNN detector achieves near-optimal performance at significantly lower runtime complexity compared to the ML detector. |
Authors: Toan Gian, Vu-Duc Ngo, Tien-Hoa Nguyen, Trung Tan Nguyen, and Thien Van Luong
Title: Deep Neural Network-Based Detector for Single-Carrier Index Modulation NOMA Abstract: In this paper, a deep neural network (DNN)-based detector for an uplink single-carrier index modulation nonorthogonal multiple access (SC-IM-NOMA) system is proposed. SC-IM-NOMA allows users to use the same set of sub-carriers for transmitting their data modulated by the sub-carrier index modulation technique. More particularly, users of SC-IM-NOMA simultaneously transmit their SC-IM data at different power levels, which are then exploited by their receivers to perform successive interference cancellation (SIC) multi-user detection. While the existing SIC-based detector designed for SC-IM-NOMA, named maximum likelihood SIC (ML-SIC), suffers from high complexity, we propose a DNN-based detector whose structure relies on the model-based SIC for jointly detecting both M-ary symbols and index bits of all users after being trained with sufficient simulated data. The simulation results demonstrate that the proposed DNN-based detector attains near-optimal error performance and significantly reduced runtime complexity in comparison with the conventional ML-SIC detector. |
Authors: Can Quang Truong, Nguyen Thanh Tung, Pham Minh Bao, Nguyen Tien Dat, and Dinh Thi Thai Mai
Title: Applying Machine Learning Method to Detect DDoS Attacks in SDN Practical Environment Abstract: In this paper, we examine the Software-Defined Networking (SDN) architecture, a potential network architecture for the future, and how DDoS attacks can affect controller resources. We then propose a high-accuracy method that detects DDoS attacks using machine learning (specifically, a Support Vector Machine, SVM) applied to various network parameters. Finally, through experiments in both simulated and practical environments, we find that our method is able to detect DDoS attacks quickly, and we compare the practical results with the simulation results. |
Authors: Minh Nghia Pham, Duc Huy Phan, and Phuong Nam Nguyen
Title: Building a Deep Siamese Model to Increase the Accuracy of Single-Object Detection and Tracking for UAVs Abstract: Single-target detection and tracking has been an important task in the field of computer vision applied to UAVs. With the rapid development of deep learning, along with the availability of large single-target datasets, deep learning networks with a Siamese structure have achieved relatively good results when compared with traditional methods. The paper presents an improved and broader Siamese-based approach to improve target-detection and tracking accuracy. The model includes a newly designed residual insertion block to overcome the size of the padding, resulting in simpler weight functions that are better suited to complex Siamese structures. |
Authors: Bui Son Tung, Phung The Ngoc, Do Duy Thanh, and Nguyen Hong Thinh
Title: Applied AI in Video Analysis for Traffic Monitoring Abstract: Video surveillance is widely used, including in traffic management. However, the volume of video data from these surveillance cameras is extremely large. When information about an object or event must be extracted from a surveillance camera without knowing the exact time, it is very challenging to immediately identify the objects of interest among billions of video frames. Thus, how to effectively extract information from video surveillance has a strategically important role in practical applications. In this paper, we introduce a system to manage and retrieve surveillance video based on indexing of moving objects. The system comprises object detection and feature extraction, a video indexing and retrieval processing mechanism, and fast previewing. Specifically, this system uses information from moving objects to index video content. As a consequence, in the retrieval phase, the system allows both text-based retrieval and image-based retrieval. The system is evaluated on traffic video surveillance and shows very promising results. |
Authors: Ha-Dang Ho, Hong-Quan Nguyen, Thuy-Binh Nguyen, Sinh-Thuong Vu, and Thi-Lan Le
Title: Dynamic Hand Gesture Recognition from Egocentric Videos based on SlowFast Architecture Abstract: Recently, thanks to a large number of lightweight digital recording devices used in different applications, the amount of egocentric data has increased over time. Compared with videos captured by ambient cameras, egocentric videos have their own challenges as they may contain large, non-linear and unpredictable motion. This paper presents an approach for hand gesture recognition from egocentric videos based on the SlowFast network architecture. The model involves a Slow pathway and a Fast pathway: the Slow pathway operates at a low frame rate while the Fast pathway operates at a high frame rate. In egocentric videos, some hand movements happen faster or slower than others depending on the actor or surrounding context. In order to retain the egocentric attributes of the video, we perform extensive experiments on two branches of information by dividing input frames into the Slow pathway and the Fast pathway. As a result, our method achieves better classification accuracy on the EgoGesture dataset than other state-of-the-art frameworks such as VGG-16 + LSTM, C3D+LSTM+RSTTM models while remaining more lightweight. |
Authors: Duc-Huy Pham, Quang-Anh Do, Thanh Thi-Hien Duong, Thi-Lan Le, and Phi-Le Nguyen
Title: End-to-end Visual-guided Audio Source Separation with Enhanced Losses Abstract: Visual-guided Audio Source Separation (VASS) refers to separating individual sound sources from an audio mixture of multiple simultaneous sound sources by using additional visual features that guide the separation process. For the VASS task, visual features and the correlation between audio and visual features play an important role, based on which we manage to estimate more effective audio masks to improve the performance. In this paper, we propose an approach to jointly train the components of a cross-modal retrieval framework with video data and enable the network to find optimal features. The proposed end-to-end framework is updated with three loss functions: 1) a separation loss to eliminate discrepancies in the separated magnitude spectrograms, 2) an object-consistency loss to enforce the consistency of the separated spectrogram with the visual features of the sound source object, and 3) a cross-modal loss to maximize the correlation between the audio and visual features of the same object and also maximize the difference between the audio and visual features of different objects. The proposed VASS model was evaluated on the benchmark dataset MUSIC, which contains a large number of videos of people playing instruments in different combinations. Experimental results confirmed the advantages of our model over previous VASS models. |
Authors: Tran Quang-Huy, Luong Thi Theu, Nguyen Canh Minh, Duc-Nghia Tran, and Duc-Tan Tran
Title: Adaptive Filtering-based Heavy-Noise Removal in Born Iterative Method Abstract: Based on scattered ultrasound measurements, cross-sectional images showing the spatial distribution of some physical properties of the target of interest can be generated in ultrasonic tomography. These measurements can obtain dense datasets at various transmitter and receiver locations. The Born approximation approach, which provides a simple linear relationship between the goal function and the scattering field, was used to solve the inverse scattering problem. The Born iterative method (BIM), which employs the first-order Born approximation, is a helpful diffraction tomography approach. In BIM, the scattering data obtained by the probes will be affected by Gaussian noise, especially noise appearing in the background environment. In this work, we propose to apply the adaptive filter, specifically the least mean squares (LMS) filter, to reduce noise. Simulation results show that when the scattered signal is heavily affected by noise, the LMS filter can still successfully remove noise from the noisy signal. The relative residual error (RRE) parameter is used to assess the performance of the proposed approach. |
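To make the least mean squares (LMS) idea in the abstract above concrete, here is a minimal sketch of a textbook LMS adaptive noise canceller. The abstract does not say how the filter is configured inside BIM, so everything specific here is an assumption for illustration: the availability of a separate noise reference channel, the tap count, the step size `mu`, and the synthetic sine-plus-noise data.

```python
import math
import random


def lms_noise_canceller(primary, reference, num_taps=4, mu=0.01):
    """Textbook LMS adaptive noise canceller (illustrative, not the authors' setup).

    primary   -- samples of signal + noise
    reference -- samples of a noise source correlated with the noise in `primary`
    Returns the error sequence e[n] = primary[n] - y[n], which serves as the
    cleaned-signal estimate once the weights have adapted.
    """
    weights = [0.0] * num_taps
    taps = [0.0] * num_taps  # most recent reference samples, newest first
    cleaned = []
    for d, x in zip(primary, reference):
        taps = [x] + taps[:-1]
        y = sum(w * t for w, t in zip(weights, taps))  # current noise estimate
        e = d - y                                      # signal estimate
        # LMS update: step along the negative gradient of the squared error.
        weights = [w + mu * e * t for w, t in zip(weights, taps)]
        cleaned.append(e)
    return cleaned


def demo(num_samples=4000, seed=0):
    """Synthetic test: a sine wave buried in filtered Gaussian noise (hypothetical data)."""
    rng = random.Random(seed)
    clean = [math.sin(2 * math.pi * 0.01 * n) for n in range(num_samples)]
    noise = [rng.gauss(0.0, 1.0) for _ in range(num_samples)]
    primary = [
        clean[n] + 0.8 * noise[n] - 0.4 * (noise[n - 1] if n > 0 else 0.0)
        for n in range(num_samples)
    ]
    cleaned = lms_noise_canceller(primary, noise)
    tail = range(num_samples - 1000, num_samples)  # evaluate after adaptation
    mse_before = sum((primary[n] - clean[n]) ** 2 for n in tail) / 1000
    mse_after = sum((cleaned[n] - clean[n]) ** 2 for n in tail) / 1000
    return mse_before, mse_after
```

Calling `demo()` returns the mean squared error against the clean sine before and after filtering, measured on the last 1000 samples. A smaller `mu` adapts more slowly but leaves less residual weight jitter, which is the usual trade-off when tuning LMS.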
Authors: Long Hai Ngo and Quang Duc Pham
Title: Vibration Measurement Using Spatial Shifting Coherent Digital Holography Abstract: In this research, we propose a new digital coherent holographic configuration for accurately measuring the three-dimensional (3D) vibration of an object. The vibration is measured indirectly through the displacement of three mirrors attached to the object. The hologram recorded by the camera, consisting of 6 sub-holograms, can be separated by a Fourier transform and appropriate spatial band-pass filters. Three phase sets extracted from 3 sub-holograms of the reference mirror and 3 object mirrors are used to calculate the displacement of the object in 3D directions. The displacement of the object is related to the phases of the sub-holograms through the wavelength of the light source, which allows observing the vibration of the object with nano-scale accuracy in the z direction and with accuracy much finer than the pixel size of the camera in the x and y directions. |
Authors: Le Trung Thanh, Karim Abed-Meraim, Nguyen Linh Trung, and Adel Hafiane
Title: Robust Online Tucker Dictionary Learning from Multidimensional Data Streams Abstract: Big data streaming analytics has recently attracted much attention in the signal and information processing communities due to the fact that massive streaming datasets have been collected over the years. Among them, many modern data streams are represented as multidimensional arrays (aka tensors), and thus, streaming tensor decomposition or tensor tracking has become a popular processing tool to analyze such streaming data. In this paper, we propose a novel online algorithm called ROTDL for the problem of robust tensor tracking under the Tucker format. ROTDL is not only capable of tracking the underlying Tucker dictionary of multidimensional data streams over time, but also robust to sparse outliers. The proposed algorithm is specifically designed by using the alternating direction method of multipliers, block-coordinate descent, and recursive least-squares filtering techniques. Several experiments demonstrate the effectiveness of ROTDL for robust tensor tracking. |
Authors: Ta Minh Thanh
Title: Blockmarking: Hybrid Model of Blockchain and Watermarking Technique for Copyright Protection Abstract: Digital content has grown rapidly in popularity in recent years, and the need for techniques to protect it has grown just as quickly. Various watermarking techniques have been proposed for protecting digital content. Blockchain technology has also become very popular and has been applied in many solutions. This paper proposes a new hybrid model based on the combination of blockchain and a watermarking method. The main purpose is not only to achieve the goal of image copyright protection but also to store the image in the blockchain network. Since our proposed method employs the blockchain mechanism, the digital content authentication mechanism does not need third-party resources. Our experimental results demonstrate that the proposed method successfully achieves the goal of digital copyright protection. |