Curriculum Vitæ

Work Experience

Dec 2016 –

Junior Data Scientist, Marketing Team

  • Built a churn prediction system that yielded a 21% retention improvement in A/B testing
  • Trained a semi-supervised customer classifier used for personalised marketing communication
  • Implemented a customer lifetime value model that guided marketing spending and customer segmentation
  • Built an internal time series processing system for trend & aberration analysis
  • Refactored and extended the ETL pipeline based on Python / Luigi / Amazon Redshift

London, United Kingdom

Jun 2015 – Sep 2015

Software Engineering Intern, gTech

  • Built a client application for Mopsus, an internal time series forecasting platform written in Java
  • Enabled supervised machine learning in Mopsus by integrating it with Google Prediction API models
  • Discovered and fixed 7 critical bugs in Mopsus and the Prediction API, outside the scope of the internship project

London, United Kingdom

Aug 2013 – Jun 2015

Software Engineer, OpenStack Engineering Performance

  • Submitted 123 commits across 3 projects in OpenStack, an open-source cloud-computing platform written in Python: Nova (resource management), Rally (benchmark system), Murano (app catalog)
  • As a core developer in Rally, implemented critical system components (benchmark runners, data processors, Rally API), completed 963 code reviews, and ranked as the #2 contributor
  • Optimized the compute node listing algorithm in Nova, achieving a 5x performance improvement
  • Mentored Rally's open-source community and wrote the Rally documentation from scratch

Moscow, Russia

Jan 2013 – Jun 2015
(2 years 6 months)

Higher School of Economics
Research Assistant, Laboratory for Intelligent Systems and Structural Analysis (part-time)

  • Was a leading developer in a text analysis research group headed by Prof. B. Mirkin
  • Built a Python package for approximate string matching using Annotated Suffix Trees
  • Designed and implemented LM Monitor, a graph-based client-server text visualization system

Moscow, Russia

Education


Sep 2015 – Sep 2016

Université Paris-Est Marne-la-Vallée
M.Sc. in Computer Science (double degree)

Paris, France

Sep 2010 – Sep 2016

National Research University Higher School of Economics
M.Sc. in Data Science / B.Sc. in Software Engineering (GPA 9.6/10, diploma with honours)

Moscow, Russia

Summer Schools:
Moscow, Russia
Ekaterinburg, Russia

Summer School on Bioinformatics Data Structures (2016)

Helsinki, Finland


Segalovich Scholarship from Yandex (2015), Russian President Scholarship (2012)


Areas of Expertise:

Machine Learning, Natural Language Processing, Bioinformatics

Computer Languages:

Python, Java, SQL (proficient), R, MATLAB, C#, C++ (prior experience), Scala, F# (basic knowledge)

Natural Languages:

English (fluent, IELTS 8.5/9.0), German (fluent, DSD C1), French (intermediate), Russian (native)

Selected Publications


Dubov M. "Text Analysis with Enhanced Annotated Suffix Trees: Algorithms and Implementation" // Analysis of Images, Social Networks and Texts. Fourth International Conference, AIST 2015, Yekaterinburg, Russia, April 9-11, 2015, Revised Selected Papers. Communications in Computer and Information Science, Vol. 542, Springer.


Dubov M., Mirkin B., Shal A. "Automatic Russian Text Processing System" // Open Systems. DBMS, 2014, Vol. 22, No. 10, pp. 15-17.

Competitions & Awards


International Piano Competition for Outstanding Amateurs – 3rd prize & Press award

Paris, France


Yandex Hackathon on Open Data – 3rd prize

Moscow, Russia


"Claviarium" Competition for Amateur Pianists – 2nd prize

Moscow, Russia


All-Russian Olympiad in German Language – 1st prize

Nizhny Novgorod, Russia