Tomek Garbus

Contact me!

About me

I'm a software engineer, currently working at Google Warsaw.

CV

Projects

Controlled machine text generation of football articles Master's Thesis

My master's thesis at the University of Warsaw.

Abstract:
Among other benefits of the rapid development in deep learning, language modelling (LM) systems have excelled at producing relatively long text samples that are (almost) indistinguishable from human-written text. This work categorizes conditional text generation systems into three paradigms: generation with placeholders, prompted generation, and adversarial/reinforcement learning. It provides an overview of each paradigm along with experiments – both machine- and human-judged. Example corpora of football news are used to discuss how a fast, domain-specific named entity recognition (NER) system can be built without much manual labour for English and Polish. The NER module is evaluated on manually labelled texts in both languages. It is then used not only to build fine-tuning sets for the language model, but also to aid its generation procedure, resulting in samples more compliant with the provided control codes. Finally, EDGAR, a simple tool for prompt-driven generation, is presented. Two demos are provided so the reader can experiment with the proposed solutions and compare them with a simply fine-tuned GPT-2 model.

In this work I researched and categorised approaches to applying language models to controlled use cases, using football news as a toy domain. The best results were obtained through an unconventional use of GPT-2.

Other interesting artifacts from this research:

  • Web quiz for distinguishing between real and machine-generated articles. I posted it on football subreddits (r/soccer, r/football) and, as it turned out, people had a surprisingly hard time detecting the fakes!
  • EDGAR, an end-to-end system that can be used to replicate my results or run controlled text generation on a different dataset. Unfortunately, my notebooks have disappeared from Google Colab 😔

Read the paper 📜 Take the quiz! ⚽
Repositories:
EDGAR Experimental

Face authentication with liveness detection using depth and IR camera Bachelor's Thesis

Researched and developed in a team of 4, in collaboration with Samsung. We used a Microsoft Kinect to build a proof of concept (POC) of an authentication system based on face recognition.

We collected the database by filming people at our faculty (all GDPR-compliant) and experimented with different models, ensembles of models, and normalisation techniques (using the IR signal to rotate the face to a 90° angle).

Our research also touched upon liveness detection issues (such as skin recognition) and the correlation between the number of collected frames (i.e. time to unlock) and the model's accuracy.

Read the paper 📜
View source:
GitHub GitLab

Preprocessing a car dataset for GANs

Read the paper 📜
View source:
GitHub

Contact

Mail: tomasz.garbus1[at]gmail.com

GitHub: github.com/tomaszgarbus