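Despite rapid advances in image recognition technology, fully digitizing tabular and handwritten documents remains difficult. This study aims to improve the accuracy of OCR output by applying corrections based on the rules used when writing a ship's logbook, which is a handwritten tabular document. This is expected to increase the accuracy and reliability of logbook data extracted by an OCR program. The study used logbook data from 57 days of voyages made in 2023 by Saenuri, the training ship of Mokpo National Maritime University, and sought to improve accuracy by correcting the errors that arose after OCR recognition. The model first identifies errors using several rules that are followed when making logbook entries and then corrects the identified errors. To evaluate the effect of the correction, the data before and after correction were divided by voyage, and the same variables within the same voyage were compared. Against an actual cell error rate of about 11.8%, the model identified errors amounting to about 10.6%, and it corrected 56 of 123 errors. A limitation of this study is that it covered only the navigational information entered in the logbook from Dist.Run to Stand Course; further research is planned to correct more of the logbook, including weather information as well as navigational information.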
Recently, the status of North Korea’s denuclearization has become an international issue, and there are indications of potential nuclear proliferation among neighboring countries. The need to establish technology and strategies for verifying nuclear activity is therefore growing. To ensure the completeness of verification, analysis based on sample collection is essential. In the sample extraction process, the concepts of Chain of Custody (CoC) and Continuity of Knowledge (CoK) can be defined as follows: CoC is interpreted as the ‘system for managing the flow of information subjected by the examinee’, and CoK is interpreted as the ‘continuity of information collection through CoC subjected by the inspector’. When samples are collected in unreported areas for nuclear activity verification, there are additional risks such as worker exposure or kidnapping and sample theft or tampering. Additional devices may therefore be required to maintain CoC and CoK in unreported areas. In this study, an Environmental Geometrical Data Transfer (EGDT) device was developed to ensure the safety of workers and the CoC/CoK of samples during the collection process. The device was designed to be both mobile and rechargeable. It operates in two modes according to its intended user: sample mode and worker mode. Sensors positioned at the rear of the device, including radiation, gyroscope, light, temperature, humidity, and proximity sensors, make it easy to acquire various environmental information in real time. GPS information can also be received, allowing the device to respond to various hazardous scenarios. The OLED display on the front shows device information such as the battery level and Wi-Fi connectivity. Finally, an alarm function was integrated to enable rapid awareness in emergency situations.
These functions can be updated and modified through Arduino-based firmware, and both the device and the information collected through it can be remotely controlled via custom software. Based on the presented design conditions, a prototype was developed and field assessments were conducted, yielding results within an acceptable margin of error for various scenarios. Applying the EGDT developed in this study to the sample collection process for nuclear activity verification is expected to help maintain CoC/CoK stably through more accurate transmission and reception of information.
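With the development of machine learning techniques, research on diagnosing machine condition, detecting anomalies, and classifying abnormal states using various kinds of data generated by machines (vibration, temperature, flow rate, etc.) is being actively pursued. In particular, condition diagnosis of rotating machinery using vibration data is a traditional area of machine condition monitoring that has been studied for a long time with a wide variety of methods. In this study, an accelerometer was mounted directly on a rotary compressor used in a household air conditioner, and an experiment was conducted to collect vibration data. To address the shortage of data, the data were segmented. Statistical and physical features were extracted from the time-domain vibration data, and the key features for the fault classification model were then selected using a chi-square test. An SVM (Support Vector Machine) model was developed to classify the compressor as normal or abnormal, and its classification accuracy was improved through parameter optimization.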
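The segmentation and time-domain feature extraction steps described in this abstract can be illustrated with a minimal sketch. The abstract does not list the features or the window settings that were used, so the feature set (RMS, peak, crest factor, kurtosis), the function names, and the window and step sizes below are assumptions chosen for illustration only.

```python
import math

def segment(signal, window, step):
    """Split a 1-D signal into fixed-length windows (overlapping if step < window).

    Window and step sizes are hypothetical; the abstract does not state them.
    """
    return [signal[i:i + window]
            for i in range(0, len(signal) - window + 1, step)]

def time_domain_features(x):
    """Compute common time-domain vibration features for one segment.

    The choice of features is an assumption, not the study's actual list.
    """
    n = len(x)
    mean = sum(x) / n
    var = sum((v - mean) ** 2 for v in x) / n            # population variance
    rms = math.sqrt(sum(v * v for v in x) / n)           # root mean square
    peak = max(abs(v) for v in x)                        # largest absolute amplitude
    crest = peak / rms if rms > 0 else 0.0               # peak-to-RMS ratio
    # Kurtosis as the fourth central moment divided by the squared variance.
    kurt = (sum((v - mean) ** 4 for v in x) / n) / (var ** 2) if var > 0 else 0.0
    return {"rms": rms, "peak": peak, "crest_factor": crest, "kurtosis": kurt}

if __name__ == "__main__":
    raw = [math.sin(0.3 * i) for i in range(64)]         # stand-in for accelerometer data
    rows = [time_domain_features(s) for s in segment(raw, window=16, step=8)]
    print(len(rows), rows[0])
```

Each row produced this way would be one training sample; feature selection with a chi-square test and the SVM classifier itself are not shown here.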
People write reviews of numerous products and services on the Internet, in their blogs or on community bulletin boards. These unstructured data contain the authors' emotions and opinions about a product or service, which can provide important information for future product design or marketing. However, this text-based information cannot be evaluated quantitatively, and it is therefore difficult to apply to mathematical models or optimization problems for product design and improvement. This study therefore proposes a method to quantitatively extract users' opinions or preferences about a specific product or service from the large amount of text-based information available online. The extracted unstructured text is decomposed into basic unit words, and a positive rate is evaluated using existing emotional dictionaries and additional lists proposed in this study. This can be an effective way to use unstructured text data, which is being generated and stored in vast quantities, in product or service design. Finally, to verify the effectiveness of the proposed method, a case study was conducted using movie review data retrieved from a portal website. By comparing the positive rates calculated by the proposed framework with user ratings for movies, a guideline on text-mining-based evaluation of unstructured data is provided.
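As an illustration of the dictionary-based scoring described in this abstract, the sketch below decomposes reviews into words and computes a positive rate. The abstract does not define the positive rate or give its dictionaries, so the definition used here (positive matches divided by all sentiment matches) and the tiny word lists are assumptions for illustration.

```python
import re

# Hypothetical mini-dictionaries; the study uses existing emotional
# dictionaries plus additional lists that are not given in the abstract.
POSITIVE = {"great", "good", "excellent", "moving"}
NEGATIVE = {"boring", "bad", "terrible"}

def tokenize(text):
    """Lowercase the text and split it into basic unit words."""
    return re.findall(r"[a-z']+", text.lower())

def positive_rate(reviews):
    """Return positive matches / (positive + negative matches).

    This definition is an assumption. Returns None when no review
    contains any dictionary word.
    """
    pos = neg = 0
    for review in reviews:
        for word in tokenize(review):
            if word in POSITIVE:
                pos += 1
            elif word in NEGATIVE:
                neg += 1
    total = pos + neg
    return pos / total if total else None

if __name__ == "__main__":
    sample = ["Great acting, good story",
              "Boring and bad ending, but excellent music"]
    print(positive_rate(sample))  # 3 positive and 2 negative matches
```

A rate computed per movie in this way could then be compared with that movie's user rating, as the case study in the abstract does.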
With the advent of the digital age, the production and distribution of web pages has exploded. Internet users frequently need to extract specific information from this vast number of web pages, but finding it takes a great deal of time and effort. Although commonly used search engines return web pages that contain the information users are looking for, additional time and effort are required to find the specific information within the extensive search results. It is therefore necessary to develop algorithms that can automatically extract specific information from web pages. Every year, thousands of international conferences are held all over the world. Each international conference has a website that provides general information about the conference, such as the date of the event, the venue, a greeting, the abstract submission deadline, and the registration dates. It is not easy for researchers to find the abstract submission deadline quickly because it is displayed in formats that vary from conference to conference and is frequently updated. This study focuses on extracting abstract submission deadlines from international conference websites. We use three machine learning models, namely SVM, decision trees, and artificial neural networks, to develop algorithms that extract the abstract submission deadline from an international conference website. The performance of the proposed algorithms is evaluated using 2,200 conference websites.
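One plausible preprocessing step for this task is to collect date strings from the page text together with a simple context score that a classifier could later use. The sketch below is not the method of the study, which relies on SVM, decision trees, and neural networks; the single date pattern, the keyword list, and the `candidate_deadlines` function are assumptions for illustration.

```python
import re
from datetime import datetime

MONTHS = ("January|February|March|April|May|June|July|"
          "August|September|October|November|December")
# Matches dates written like "March 1, 2024"; real sites use many more formats.
DATE_RE = re.compile(rf"\b({MONTHS})\s+(\d{{1,2}}),\s*(\d{{4}})\b")
KEYWORDS = ("abstract", "submission", "deadline")  # hypothetical cue words

def candidate_deadlines(text, window=60):
    """Return (date, score) pairs sorted by score, highest first.

    score counts how many cue words appear in the `window` characters
    that precede the date.
    """
    candidates = []
    for match in DATE_RE.finditer(text):
        month, day, year = match.groups()
        try:
            date = datetime.strptime(f"{month} {day} {year}", "%B %d %Y").date()
        except ValueError:
            continue  # skip impossible dates such as "February 30, 2024"
        context = text[max(0, match.start() - window):match.start()].lower()
        score = sum(keyword in context for keyword in KEYWORDS)
        candidates.append((date, score))
    return sorted(candidates, key=lambda c: c[1], reverse=True)

if __name__ == "__main__":
    page = ("Conference dates: June 10, 2024. "
            "Abstract submission deadline: March 1, 2024. "
            "Registration opens April 5, 2024.")
    for date, score in candidate_deadlines(page):
        print(date.isoformat(), score)
```

In a learning-based pipeline, the keyword counts would be replaced or supplemented by features fed to the trained model, which then decides which candidate is the abstract submission deadline.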
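As advances in digital technology rapidly transform the world into a society driven by information and knowledge, and as intellectual property rights develop quickly, companies and countries are emphasizing the importance of intellectual property rights in order to strengthen their competitiveness. Given this emphasis, securing intellectual property rights can be regarded as a factor that determines a company's competitiveness. This paper therefore studies search-term extraction that uses R, a big data analysis tool, to efficiently produce the patent search results a user is looking for in a short time. To this end, a five-step patent search process was proposed and implemented as a program, which substantially reduced the time needed to find patents matching the search purpose while allowing the targeted patent search to be carried out efficiently.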
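In jointed concrete pavement (JCP), temperature and moisture differences across the pavement section cause curling in the slab. The shape and magnitude of the slab curvature caused by curling affect the stresses within the pavement and its smoothness, and can therefore have a significant effect on the structural and functional performance of the pavement. Measuring the slab's curled shape directly requires embedding a large number of gauges in the pavement at the early stage of construction and measuring its behavior precisely from that stage onward, which involves high experimental cost and technical difficulty. In this study, a method was developed to measure the slab curling shape of jointed concrete pavement at an arbitrary time by applying power spectral density (PSD) analysis and the inverse fast Fourier transform (IFFT) to profile data, which can be obtained easily in the field. To verify the effectiveness of the developed technique, profiles were measured at different times of day on jointed concrete pavement at a test road located in the central inland region of Korea, the slab curling shape was derived from these data, and the results were reviewed. In addition, profile data from a continuously reinforced concrete pavement (CRCP) section measured at the same times were analyzed and compared with the results for the jointed concrete pavement to validate the developed technique for extracting the slab curling shape.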
This paper deals with the effect of the spatial distribution of material properties on their statistical characteristics, and with a numerical method for estimating the reliability of fatigue-sensitive structures with respect to fatigue crack growth. A method is proposed to determine experimentally the probability distribution functions of the material parameters of the Paris law, $da/dN = C(\Delta K/K_0)^m$, using stress-intensity-factor-controlled fatigue tests. The results for a high tensile strength steel show that the distribution of the parameter $m$ is approximately normal and that of $1/C$ is a 3-parameter Weibull. The main results obtained are: (1) The theoretical autocorrelations of the resistance to fatigue crack growth, $1/C$, are almost the same for different lengths. (2) The variance decreases with increasing averaging length. When the spatial correlation length is very small, the variance decreases significantly with the averaging length. (3) The probability distribution of the number of load cycles for a crack to reach a certain length can be estimated using these functions by simulating a non-Gaussian (especially Weibull) stochastic process.
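To make result (3) concrete, the sketch below integrates the Paris law for the number of cycles to reach a given crack length and samples the material parameters at random. It is a simplification under stated assumptions: $\Delta K = \Delta\sigma\sqrt{\pi a}$ (geometry factor of 1), $m$ and $1/C$ are drawn once per specimen instead of varying along the crack path as a spatial stochastic process, and every numerical value is hypothetical.

```python
import math
import random

def cycles_to_length(inv_c, m, a0, af, delta_sigma, k0=1.0):
    """Cycles for a crack to grow from a0 to af under da/dN = C (dK/K0)^m.

    Assumes dK = delta_sigma * sqrt(pi * a), i.e. a geometry factor of 1,
    so that N = (1/C) * (K0 / (delta_sigma * sqrt(pi)))^m * integral of a^(-m/2) da.
    """
    scale = inv_c * (k0 / (delta_sigma * math.sqrt(math.pi))) ** m
    if abs(m - 2.0) < 1e-12:
        return scale * math.log(af / a0)      # the integral is logarithmic when m = 2
    p = 1.0 - m / 2.0
    return scale * (af ** p - a0 ** p) / p

def simulate_cycles(n, seed=0):
    """Monte Carlo sample of cycle counts; all parameter values are hypothetical."""
    rng = random.Random(seed)
    results = []
    for _ in range(n):
        m = rng.normalvariate(3.0, 0.1)                   # m roughly normal
        # 3-parameter Weibull for 1/C: location + Weibull(scale, shape)
        inv_c = 1.0e8 + rng.weibullvariate(5.0e8, 2.0)
        results.append(cycles_to_length(inv_c, m, a0=0.001, af=0.01,
                                        delta_sigma=100.0))
    return sorted(results)

if __name__ == "__main__":
    sample = simulate_cycles(1000)
    print("median cycles:", sample[len(sample) // 2])
```

The sorted sample gives an empirical distribution of the cycle count; reproducing the spatial correlation studied in the paper would require sampling $1/C$ as a correlated process along the crack path, which is not attempted here.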
This study examined how effectively satellite images detect wheat cultivation areas, and then analyzed the possibility of climate change through a correlation analysis of time series climate data from the western regions of Gyeongnam province, Korea. Furthermore, we analyzed the effect of climate change on wheat production through a multiple regression analysis of time series wheat production and climate data. The wheat cultivation area extracted through satellite image classification was relatively accurate, with an error rate of less than 10% compared with the statistical data. The correlation analysis of the time series climate data showed significant changes in the following: the monthly mean temperature of the seedling stage, the monthly mean duration of sunshine, the monthly mean temperature of the growing period, the monthly mean humidity, the monthly mean temperature of the ripening stage, and the monthly mean ground temperature. In the study area, the monthly mean temperature, precipitation, and ground temperature generally increased, whereas the monthly mean duration of sunshine and humidity decreased. The monthly mean wind speed showed no particular change. In the multiple regression analysis, the climate factors with the greatest effect on wheat production and productivity were the annual mean humidity of the seedling stage, the annual mean temperature of the wintering period, and the annual mean ground temperature of the ripening stage. These results demonstrate that wheat production in the study area changes with climate change. In addition, this study is expected to serve as important basic data for addressing food security problems related to climate change.
In this paper, we propose a new guidance line extraction algorithm for a vision-camera-based autonomous agricultural robot operating in a paddy field. Finding the central point or area of a rice row is an important step in guidance line extraction. To improve the accuracy of the guidance line, we use data from the central region of the crop, exploiting the fact that the directions of the rice leaves converge toward the central area of the rice row. The guidance line is extracted from the intersection points of extended virtual lines using a modified robust regression. Each extended virtual line is the extension of a segmented straight line detected on the edges of the rice plants in the image using the Hough transform. We verified the accuracy of the proposed algorithm through experiments in a real wet paddy.
Outdoor mobile robots face various terrain types with different characteristics. To run safely and carry out its mission, a mobile robot should recognize the terrain type and its physical and geometric characteristics, because its motion must be controlled appropriately for each terrain. One way to determine the terrain type is to use non-contact sensor data, such as data from vision and laser sensors. Another way is to use contact sensor data, such as body slope, vibration, and motor current, which reflect the reaction from the ground to the tire. In this paper, we present experimental results on terrain classification using contact sensor data. We built a mobile robot for collecting contact sensor data and collected data on four terrains chosen for the experiment. By analyzing the collected data, we propose a new terrain feature extraction method that considers physical characteristics, and we confirmed that the proposed method can classify the four experimental terrains. Using a back propagation learning algorithm, we also confirmed that the proposed method and the Fast Fourier Transform (FFT) based feature extraction method typically used in previous studies have similar classification performance. However, the two methods differ in the amount of data needed to carry the terrain feature information. We therefore defined an index, determined by the amount of terrain feature information and the classification error rate, that evaluates classification efficiency. Comparing the two methods with this index showed that our method is more efficient than the existing method.
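To simulate streamflow during the snowmelt period with a snowmelt model, snowmelt-related parameters must be established. In Korea, a lack of observational data has made it impossible to extract snowmelt-related parameters such as snow cover area, snow depth, and the snow cover depletion curve. In this study, snow cover maps of the Korean Peninsula were extracted from winter (November-April) NOAA/AVHRR satellite images from 1997 to 2003, and the maximum snow depth data from the meteorological records of the Korea Meteorological Administration's 69 staffed surface weather stations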