Web personalization and Web directories have been proposed as potential solutions to the problem of information overload on the WWW. The essence of personalization is the adaptability of information systems to the needs of their users, whilst Web directories are an attempt to organize thematically the information that is available on the Web. We present a novel framework that combines Web personalization and Web directories, in order to provide a new weapon against information overload. This combination results in the concept of community Web directories. Community Web directories are a novel form of personalization performed on Web directories, such as the Open Directory Project (ODP). They correspond to “segments” of the directory hierarchy that represent the interests and preferences of user communities, and thus provide a personalized view of the Web. In this manner, community Web directories constitute a new objective of Web personalization.
The proposed approach is based on Web usage mining, which is a valuable source of ideas and methods for the implementation of personalization functionality. However, in contrast to most work on Web usage mining, the usage data analyzed here correspond to user navigation throughout the Web, rather than within a particular Web site, and as a result exhibit a high degree of thematic diversity.
For the construction of community Web directories, we present three novel techniques that combine the users’ browsing behavior with thematic information from the Web directories. These techniques are based on clustering and probabilistic data modeling. They are evaluated on both a specialized artificial Web directory and a general-purpose Web directory, and the results indicate their potential value to the Web user. The experiments also assess the effectiveness of different machine learning techniques on this task.
Finally, we present OurDMOZ, a system that builds and maintains community Web directories by employing a Web usage mining framework. OurDMOZ exploits Web directories to extend personalization functionalities, such as adaptive services and Web page recommendation, to a larger part of the Web, beyond the scope of a single Web site. An initial user evaluation indicates the potential value of the system to the end user, in terms of an enhanced personalized Web experience.