Academic and professional background

  • Current role: Senior Researcher / OPI PIB
  • Specialization: artificial intelligence, natural language processing
  • Core competencies: solution architecture, requirements analysis
  • Profile: researcher and information systems engineer

2002–2010

Studies and the beginning of an engineering career

This stage covered technical studies, work on information systems, international experience through the Erasmus programme, the first programming projects, and self-employment.


Bialystok University of Technology — Electronics and Telecommunications, MSc Eng.

Studies completed with an MSc Eng. degree; the diploma thesis concerned a system for collecting and processing medical measurement data.

The thesis project was a system that collects and analyzes measurement data from uroflowmeters. The studies also included an Erasmus exchange at VŠB.


ProFind — self-employment and software development contracts

Software development, CMS and e-commerce work, and projects built on web data.

This stage covered the full execution cycle: from architecture and implementation to maintenance and client collaboration.

2007–2013

Doctoral training, PhD research, and text analysis

The doctoral stage began in 2007 at Bialystok University of Technology and concluded with the dissertation defense in 2013 at the Faculty of Computer Science. The work covered text processing, knowledge representation, and information systems for public administration and emergency services.


Work on the doctoral dissertation — analysis of fire service reports and information system design

The research focused on transforming unstructured operational reports into data usable within an information system.

The core of the work included text segmentation, information extraction, and the design of knowledge representation rules for fire incident documentation.


PhD defense

Formal completion of the research stage devoted to text data analysis in information systems.

The dissertation covered information systems, text mining, and the analysis of domain-specific documents.

2012–2014 (publication in 2018)

IPI PAN and web information extraction

The formal collaboration ran from 2012 to 2014. The work focused on extracting information from semi-structured web pages, with an emphasis on larger-scale and more general-purpose problems. The main publication from this period, describing the BigGrams system, appeared in 2018 under the IPI PAN affiliation.


Institute of Computer Science, Polish Academy of Sciences — Systems Engineer

Work on information extraction from web data and semi-structured HTML documents.

This stage extended earlier domain-focused analyses toward methods used more broadly in web mining and information extraction.


BigGrams — language-agnostic information extraction from HTML

A publication devoted to information extraction from semi-structured web pages.

The publication focuses on combining processing scale, practical applicability, and relative independence from both language and page layout.

2014–2023

OPI PIB — development of a research-and-implementation profile

A long-term research and project stage covering text classification, document analysis, web mining, information extraction, and systems applied in public and analytical practice.


OPI PIB / AI Lab — Senior Researcher

Main center of scientific and project activity, combining research with implementation-oriented work.

The work spans research as well as the design and development of solutions used in organizational practice.


Identification of innovative companies based on their websites

Automatic classification of companies in terms of innovativeness based on the content of their websites.

The project applied text classification and web mining to the automatic analysis of large collections of company websites.


Publications: text classification, fire service reports, SNN

Publications covering a review of text classification, information extraction from fire service reports, and biologically inspired models of text representation.
  • ESWA 2018 — a survey synthesizing research on text classification.
  • Fire Technology 2019 — a publication on information extraction from fire reports.
  • PPSN 2020 — a publication on neuromorphic and biologically inspired approaches to text representation.

ANSI / INFOSTRATEG III — detecting dual quality of products

A system analyzing multilingual online reviews in terms of product quality, safety, and dual quality issues.

The project involved crawling, data extraction, and the analysis of product reviews in the context of quality assessment and consumer protection.

2024–2026

Current stage: document AI, LLM, and research synthesis

The most recent stage combines the development of LLM- and RAG-based systems with publications synthesizing knowledge on document classification, multimodality, and research standards.


Neural Networks and IEEE Access

Development of biologically inspired NLP methods and work focused on research quality in document classification.

Publications from this period combine the development of new NLP methods with the analysis of reporting standards and research reproducibility.


ACL Industry Track — detecting dual quality in product reviews

A publication resulting from the ANSI project, devoted to the analysis of multilingual product reviews.

It pairs the research result with its practical application.


Current direction: LLM, RAG, document analysis, and knowledge transfer through the blog

The current direction covers the design of LLM-based solutions and knowledge sharing around modern AI systems.

This direction extends earlier experience in document analysis, information structure, model evaluation, and decision-support systems.


Information Fusion — meta-analytical summary of document classification

A systematic review and quantitative synthesis of research on information fusion in document classification.

The publication synthesizes earlier threads concerning methodology, data representation, and multiview learning in the form of a quantitative literature review.