HTML Format Tables Extraction with Differentiating Cell Content as Property Name

Purnamasari, Detty and Banowosari, Lintang Yuniar and Wicaksana, I Wayan Simri and Harmanto, Suryadi (2011) HTML Format Tables Extraction with Differentiating Cell Content as Property Name. American Scientific Publishers. ISSN 1936-6612




Websites present data in various forms and formats, one of which is the table. Tables on the Internet can be collected by copy and paste, but this is not easy when many tables are involved and the extracted results must then be merged with other tables. This article discusses research on extracting HTML tables and storing them in database form. The approach uses one algorithm to find the number of rows and the number of columns in a table, and another to match the extracted cell contents against a Property Name database, so that it can be determined whether the extracted table has its properties in a row, in a column, or has no properties. The tables and the Property Name database contain data in the Indonesian language. At the preprocessing stage, the Property Name database is prepared, along with techniques to enrich its instances. The tables extracted are simple tables in HTML format, in which no merging of rows or columns is found at the row 1/column 1 position. This research provides techniques to enrich the instances of a database and, with the use of illustrations, an approach by which the extraction of HTML-format tables can be done semi-automatically. In addition, the properties in an extracted table can be distinguished from the cell contents that are table data.
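The two steps named in the abstract (counting rows and columns, then matching cell contents against a Property Name database) can be illustrated with a minimal Python sketch. It is not the authors' algorithm, which the abstract does not spell out. The function names, the 0.5 matching threshold, the rule of checking only the first row and first column, and the three sample Indonesian property names are all assumptions made for illustration.

```python
from html.parser import HTMLParser


class SimpleTableParser(HTMLParser):
    """Collect cell text from a simple table (assumes no rowspan/colspan)."""

    def __init__(self):
        super().__init__()
        self.rows = []      # finished rows, each a list of cell strings
        self._row = None    # row currently being read, or None
        self._cell = None   # text fragments of the current cell, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


def parse_table(html):
    """Return the table as a list of rows of cell strings."""
    parser = SimpleTableParser()
    parser.feed(html)
    parser.close()
    return parser.rows


def table_shape(rows):
    """Return (number of rows, number of columns)."""
    return len(rows), max((len(r) for r in rows), default=0)


def locate_properties(rows, property_names, threshold=0.5):
    """Guess where property names sit: 'row', 'column', or 'none'.

    Assumption: properties, if present, occupy the first row or first column.
    """
    if not rows:
        return "none"
    known = {name.lower() for name in property_names}

    def hit_ratio(cells):
        # Fraction of cells whose text is a known property name.
        if not cells:
            return 0.0
        return sum(c.lower() in known for c in cells) / len(cells)

    row_score = hit_ratio(rows[0])
    col_score = hit_ratio([r[0] for r in rows if r])
    if max(row_score, col_score) < threshold:
        return "none"
    return "row" if row_score >= col_score else "column"


# Hypothetical stand-in for the Property Name database.
PROPERTY_NAMES = {"nama", "alamat", "harga"}

html = (
    "<table>"
    "<tr><th>Nama</th><th>Alamat</th></tr>"
    "<tr><td>Budi</td><td>Depok</td></tr>"
    "</table>"
)
rows = parse_table(html)
print(table_shape(rows))                        # (2, 2)
print(locate_properties(rows, PROPERTY_NAMES))  # row
```

Once the property position is known, the remaining cells can be treated as data and stored under the matched property names. The sketch relies on explicit closing tags and exact, case-insensitive matches; a real Property Name database would likely need looser matching.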

Item Type: Article
Subjects: T Technology > T Technology (General)
Divisions: Fakultas Ilmu Komputer dan Teknologi Informasi > Program Studi Sistem Informasi
Depositing User: Mr Reza Chandra
Date Deposited: 25 May 2016 02:34
Last Modified: 25 May 2016 02:34
