Author Archives: Marcin

Categorization of Multilingual Scientific Documents by a Compound Classification System

Abstract

The aim of this study was to propose a classification method for documents that simultaneously contain text in several languages. For this purpose, we constructed a three-level classification system. On the first level, a data processing module prepares a suitable vector space model. Next, in the middle tier, a set of monolingual or multilingual classifiers estimates the probability that each document, or each of its parts, belongs to each possible category. These models are trained using the Multinomial Naive Bayes and Long Short-Term Memory algorithms. Finally, in the last component, a multilingual decision module assigns a target class to each document. The module is built on a logistic regression classifier that takes as input the probabilities produced by the middle-tier classifiers. The system has been verified experimentally. The reported results suggest that the proposed system can handle textual documents whose content is written in several languages at once. The system can therefore be useful for automatically organizing multilingual publications and other documents.
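The three tiers described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the toy corpus, the two languages, the class `TinyMultinomialNB`, and the helper names are all assumptions made for illustration. The LSTM models are omitted, and the decision module is reduced to a binary logistic regression trained by plain stochastic gradient descent.

```python
import math
from collections import Counter

class TinyMultinomialNB:
    """Illustrative multinomial Naive Bayes with Laplace smoothing."""
    def fit(self, docs, labels):
        self.classes = sorted(set(labels))
        self.prior = Counter(labels)
        self.counts = {c: Counter() for c in self.classes}
        for doc, y in zip(docs, labels):
            self.counts[y].update(doc.split())  # tier 1: bag-of-words counts
        self.vocab = {w for c in self.classes for w in self.counts[c]}
        return self

    def predict_proba(self, doc):
        n_docs = sum(self.prior.values())
        log_post = []
        for c in self.classes:
            denom = sum(self.counts[c].values()) + len(self.vocab)
            lp = math.log(self.prior[c] / n_docs)
            for w in doc.split():
                if w in self.vocab:  # ignore out-of-vocabulary words
                    lp += math.log((self.counts[c][w] + 1) / denom)
            log_post.append(lp)
        top = max(log_post)
        unnorm = [math.exp(lp - top) for lp in log_post]
        return [u / sum(unnorm) for u in unnorm]

def meta_features(doc_parts, models):
    """Tier 2 output: concatenated class probabilities, one block per language."""
    feats = []
    for lang in sorted(models):
        feats.extend(models[lang].predict_proba(doc_parts.get(lang, "")))
    return feats

def linear_score(w, x):
    return w[0] + sum(wi * xi for wi, xi in zip(w[1:], x))

def train_logreg(X, y, lr=0.5, epochs=200):
    """Tier 3: binary logistic regression fitted with stochastic gradient descent."""
    w = [0.0] * (len(X[0]) + 1)
    for _ in range(epochs):
        for x, t in zip(X, y):
            err = 1.0 / (1.0 + math.exp(-linear_score(w, x))) - t
            w[0] -= lr * err
            for i, xi in enumerate(x):
                w[i + 1] -= lr * err * xi
    return w

def predict_logreg(w, x):
    return 1 if linear_score(w, x) > 0 else 0

# Hypothetical bilingual documents; label 1 = computer science, 0 = biology.
train = [
    ({"en": "neural network training", "de": "neuronales netz training"}, 1),
    ({"en": "protein cell biology", "de": "protein zelle biologie"}, 0),
    ({"en": "deep learning model", "de": "tiefes lernen modell"}, 1),
    ({"en": "gene cell expression", "de": "gen zelle expression"}, 0),
]
y = [label for _, label in train]
models = {
    lang: TinyMultinomialNB().fit([parts[lang] for parts, _ in train], y)
    for lang in ("en", "de")
}
X = [meta_features(parts, models) for parts, _ in train]
weights = train_logreg(X, y)

query = {"en": "network model", "de": "lernen modell"}
prediction = predict_logreg(weights, meta_features(query, models))
print(prediction)
```

The decision module never sees the words themselves, only the per-language probabilities. Because of this, a document with a missing language part still yields a full feature vector: the classifier for that language falls back to its class priors.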

Detection of the Innovative Logotypes on the Web Pages

Abstract

The aim of this study was to describe a method for detecting logotypes that indicate the innovativeness of companies, where the images originate from the companies' Internet domains. For this purpose, we developed a system that combines supervised and heuristic approaches to construct a reference dataset for each logotype category; logistic regression classifiers then use these datasets to recognize the category of a logotype. The proposed approach uses a one-versus-the-rest learning strategy to train the logistic regression models that recognize the classes of innovative logotypes. This makes it possible to detect whether a given company's Internet domain contains an innovative logotype. Moreover, we found a way to construct a simple, low-dimensional feature space for the image recognition process. The proposed feature space of the logotype classification models is based on image similarity weights and on the textual data of the images taken from HTML ALT attributes.
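The one-versus-the-rest strategy mentioned in the abstract can be sketched as follows. This is only an illustration under stated assumptions: the three-value feature vector (similarity to two hypothetical reference sets plus an ALT-text keyword flag), the category names `A`, `B`, and `other`, and all numeric values are invented here and do not come from the study.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def score(w, x):
    return w[0] + sum(wi * xi for wi, xi in zip(w[1:], x))

def train_binary(X, y, lr=0.5, epochs=500):
    """Binary logistic regression fitted with stochastic gradient descent."""
    w = [0.0] * (len(X[0]) + 1)
    for _ in range(epochs):
        for x, t in zip(X, y):
            err = sigmoid(score(w, x)) - t
            w[0] -= lr * err
            for i, xi in enumerate(x):
                w[i + 1] -= lr * err * xi
    return w

def train_one_vs_rest(X, labels):
    """Train one binary model per category: that category against all others."""
    return {
        c: train_binary(X, [1 if label == c else 0 for label in labels])
        for c in sorted(set(labels))
    }

def predict_one_vs_rest(models, x):
    """Pick the category whose binary model is most confident."""
    probs = {c: sigmoid(score(w, x)) for c, w in models.items()}
    return max(probs, key=probs.get)

# Hypothetical features per image:
# [similarity to reference set A, similarity to reference set B, ALT keyword flag]
X = [
    [0.9, 0.1, 1.0], [0.8, 0.2, 1.0],   # logotype category A
    [0.1, 0.9, 1.0], [0.2, 0.8, 0.0],   # logotype category B
    [0.1, 0.1, 0.0], [0.2, 0.1, 0.0],   # not an innovative logotype
]
labels = ["A", "A", "B", "B", "other", "other"]

ovr_models = train_one_vs_rest(X, labels)
predicted = predict_one_vs_rest(ovr_models, [0.85, 0.15, 1.0])
print(predicted)
```

With a feature space this small, each binary model has only four parameters, so the category that fires can be traced to a single similarity weight or to the ALT-text flag.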

A Diversified Classification Committee for Recognition of Innovative Internet Domains

Abstract

The objective of this paper was to propose a method for classifying innovative domains on the Internet. The proposed approach helps to estimate whether companies are innovative by analyzing their web pages. A Naïve Bayes classification committee was used as the classification system for the domains. The classifiers in the committee were based concurrently on Bernoulli and Multinomial feature distribution models, which were selected depending on the diversity of the input data. Moreover, information retrieval procedures were applied to find the documents in each domain that most likely indicate innovativeness. The proposed methods have been verified experimentally. The results show that the diversified classification committee, combined with the information retrieval approach in the preprocessing phase, boosts the classification quality for domains that may represent innovative companies. This approach may also be applied to other classification tasks.
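A committee that mixes Bernoulli and Multinomial Naïve Bayes members can be sketched as below. The abstract does not say how the members' outputs are combined or how a model is selected for given input data, so averaging the class probabilities (soft voting) is an assumption of this sketch. The toy documents and the labels `innovative` and `other` are also hypothetical.

```python
import math
from collections import Counter

class TinyNaiveBayes:
    """Naive Bayes with either a Bernoulli or a Multinomial feature model."""
    def __init__(self, bernoulli):
        self.bernoulli = bernoulli

    def fit(self, docs, labels):
        self.classes = sorted(set(labels))
        self.n_docs = Counter(labels)
        self.vocab = {w for d in docs for w in d.split()}
        self.counts = {c: Counter() for c in self.classes}
        for doc, y in zip(docs, labels):
            tokens = doc.split()
            # Bernoulli counts documents containing a word, Multinomial counts occurrences.
            self.counts[y].update(set(tokens) if self.bernoulli else tokens)
        return self

    def _log_likelihood(self, tokens, c):
        if self.bernoulli:
            present = set(tokens)
            total = 0.0
            for w in self.vocab:  # absent words also contribute evidence
                p = (self.counts[c][w] + 1) / (self.n_docs[c] + 2)
                total += math.log(p if w in present else 1.0 - p)
            return total
        denom = sum(self.counts[c].values()) + len(self.vocab)
        return sum(
            math.log((self.counts[c][w] + 1) / denom)
            for w in tokens if w in self.vocab
        )

    def predict_proba(self, doc):
        tokens = doc.split()
        n = sum(self.n_docs.values())
        log_post = [
            math.log(self.n_docs[c] / n) + self._log_likelihood(tokens, c)
            for c in self.classes
        ]
        top = max(log_post)
        unnorm = [math.exp(lp - top) for lp in log_post]
        return [u / sum(unnorm) for u in unnorm]

def committee_predict(members, doc):
    """Soft voting: average the members' class probabilities, then take the argmax."""
    all_probs = [m.predict_proba(doc) for m in members]
    avg = [sum(col) / len(members) for col in zip(*all_probs)]
    classes = members[0].classes
    return classes[avg.index(max(avg))], avg

# Hypothetical web-page texts from company domains.
docs = [
    "patent research prototype",
    "research lab patent patent",
    "discount shop sale",
    "sale sale cheap shop",
]
labels = ["innovative", "innovative", "other", "other"]

committee = [
    TinyNaiveBayes(bernoulli=True).fit(docs, labels),
    TinyNaiveBayes(bernoulli=False).fit(docs, labels),
]
label, avg_probs = committee_predict(committee, "new patent research")
print(label, avg_probs)
```

The two members differ in what they treat as evidence. The Bernoulli model also scores the absence of vocabulary words, while the Multinomial model weights repeated words more heavily, which is one plausible source of the diversity the abstract refers to.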

The hybrid decision support system for Fire Service – chosen project’s problems

Abstract

This article presents the design process of a hybrid decision support system (HSWD) for the State Fire Service (PSP). The Design for Trustworthy Software (DFTS) methodology was chosen to ensure system reliability. The paper focuses particularly on the requirements planning stage and the overall platform design. The study identifies key challenges in the early project stages, primarily stemming from methodology, environment, and user-related factors. These elements play a crucial role at the start of the design process, whereas aspects such as software, hardware, and measurement have a lesser initial impact. The authors analyze the causes of these challenges and propose solutions to address them. By outlining the lack of specific information solutions in the current State Fire Service infrastructure, this research highlights the importance of a structured approach in decision support system development. The findings contribute to the design of a robust and reliable platform that enhances decision-making in emergency response scenarios.