I'm a software engineer, currently working at Google Warsaw.
My master's thesis @ University of Warsaw.
Among other benefits of the rapid development in deep learning, language modelling (LM) systems have excelled at producing relatively long text samples that are (almost) indistinguishable from human-written text. This work categorizes conditional text generation systems into three paradigms – generation with placeholders, prompted generation, and adversarial/reinforcement learning – and provides an overview of each paradigm along with experiments, both machine- and human-judged. Example corpora of football news are used to discuss how a fast, domain-specific named entity recognition (NER) system can be built without much manual labour for English and Polish. The NER module is evaluated on manually labelled texts in both languages. It is then used not only to build fine-tuning sets for the language model, but also to aid its generation procedure, resulting in samples more compliant with the provided control codes. Finally, a simple tool for prompt-driven generation, EDGAR, is presented. Two demos are provided so that the reader can experiment with and compare the proposed solutions against a simply fine-tuned GPT-2 model.
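As a rough illustration of the placeholder paradigm mentioned above – a sketch of my own, not the thesis code (the function name and the `<TYPE_i>` placeholder format are assumptions) – NER spans can turn a raw sentence into a fine-tuning example where entities become typed placeholder tokens, and those same tokens double as control codes prepended to the sample:

```python
def make_placeholder_sample(text, entities):
    """Build a placeholder-based fine-tuning example from NER output.

    entities: list of non-overlapping (start, end, type) character spans.
    Returns (control_prefix, templated_text).
    """
    counters = {}      # running index per entity type, e.g. TEAM_0, TEAM_1
    placeholders = []  # control codes, e.g. "<TEAM_0>=Bayern"
    pieces = []
    last = 0
    for start, end, ent_type in sorted(entities):
        idx = counters.get(ent_type, 0)
        counters[ent_type] = idx + 1
        token = f"<{ent_type}_{idx}>"
        placeholders.append(f"{token}={text[start:end]}")
        pieces.append(text[last:start])  # keep text between entities
        pieces.append(token)             # substitute the entity itself
        last = end
    pieces.append(text[last:])
    return " ".join(placeholders), "".join(pieces)

text = "Lewandowski scored twice as Bayern beat Chelsea."
entities = [(0, 11, "PLAYER"), (28, 34, "TEAM"), (40, 47, "TEAM")]
prefix, templated = make_placeholder_sample(text, entities)
print(prefix)     # <PLAYER_0>=Lewandowski <TEAM_0>=Bayern <TEAM_1>=Chelsea
print(templated)  # <PLAYER_0> scored twice as <TEAM_0> beat <TEAM_1>.
```

At generation time the same idea runs in reverse: the model is conditioned on a control prefix of the reader's choosing, and placeholders in its output are substituted back with the requested entity names.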
In this work I researched and categorised approaches to putting language models to controlled use, on a toy domain of football news. The best results were obtained by an unconventional use of GPT-2.
Other interesting artifacts from this research:
Researched and developed in a team of 4, in collaboration with Samsung. We used a Microsoft Kinect to build a proof of concept of an authentication system based on face recognition.
We collected the dataset ourselves by filming people at our faculty (fully GDPR-compliant) and experimented with different models, ensembles of models, and normalisation techniques (using the IR signal to rotate the face to a frontal 90° angle).
Our research also touched upon liveness detection issues (such as skin recognition) and the correlation between the number of collected frames (i.e. time to unlock) and the model's accuracy.