Knowledge Discovery in Road Accidents Database - Integration of Visual and Automatic Data Mining Methods
Abstract
Road accident statistics are collected and used by a large number of users, which can result in a huge volume of data that must be explored to uncover the knowledge hidden in it. Potential knowledge may remain hidden because of the accumulation of data, which limits the exploration task for the road safety expert and, hence, reduces the utilization of the database. To help solve these problems, this paper explores Automatic and Visual Data Mining (VDM) methods. The main purpose is to study VDM methods and their applicability to knowledge discovery in a road accident database. The basic feature of VDM is that it involves the user in the exploration process. VDM uses direct interactive methods that allow the user to gain insight into the dataset and recognize different patterns in it. In this paper, I apply a range of methods and techniques, including a paradigm for VDM, exploratory data analysis, clustering methods such as K-means, hierarchical agglomerative clustering (HAC), and self-organizing maps (SOM), and classification trees. These methods assist in integrating VDM with automatic data mining algorithms. Open source VDM tools offering visualization techniques were used. The first contribution of this paper lies in discovering clusters and different relationships (such as the relationship between socioeconomic indicators and fatalities, traffic risk and population, personal risk and cars per capita, etc.) in the road safety database. The methods used proved useful for detecting clusters of countries that share similar traffic situations. The second contribution is the exploratory data analysis, in which the user can explore the contents and the structure of the data set at an early stage of the analysis. This is supported by the filtering components of VDM.
This helps expert users with a strong background in traffic safety analysis to formulate assumptions and hypotheses concerning future situations. The third contribution involves interactive exploration based on brushing and linking methods; this novel approach helps both experienced and inexperienced users to detect and recognize interesting patterns in the available database. The results obtained show that this approach offers a better understanding of the contents of road safety databases than the statistical techniques and approaches currently used for analyzing road safety situations.
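As a rough illustration of the kind of automatic clustering step mentioned above, the following minimal sketch groups countries by two road safety indicators using plain K-means. It is not the tooling or data used in the paper: the country labels, the indicator values, and the helper names `min_max_scale` and `kmeans` are all hypothetical and chosen only for illustration.

```python
from math import dist  # available in Python 3.8+


def min_max_scale(rows):
    """Rescale every column to [0, 1] so no single indicator dominates the distance."""
    cols = list(zip(*rows))
    lows = [min(c) for c in cols]
    highs = [max(c) for c in cols]
    return [
        tuple((v - lo) / (hi - lo) if hi > lo else 0.0
              for v, lo, hi in zip(row, lows, highs))
        for row in rows
    ]


def kmeans(points, k, iterations=20):
    """Plain Lloyd's K-means; the first k points seed the centroids (an assumption)."""
    centroids = [tuple(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        labels = [min(range(k), key=lambda c: dist(p, centroids[c])) for p in points]
        # Update step: move each centroid to the mean of its current members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels, centroids


# Made-up values, NOT real statistics:
# (fatalities per 100,000 population, cars per 1,000 inhabitants)
countries = {
    "A": (4.0, 550.0),
    "B": (25.0, 40.0),
    "C": (5.0, 500.0),
    "D": (22.0, 60.0),
}

scaled = min_max_scale(list(countries.values()))
labels, centroids = kmeans(scaled, k=2)
clusters = dict(zip(countries, labels))
```

In this toy input, the highly motorized, low-fatality countries (A and C) end up in one cluster and the remaining two in the other. In a VDM workflow, such cluster labels would typically be mapped to colour in a scatter plot so that the user can inspect and refine the grouping interactively.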