Highload fwdays'18 is a large-scale developer conference that gathers more than 600 participants for the second year in a row. It is dedicated to developing high-load technology projects in real-world conditions, as well as to architecture and microservices, databases, Big Data, and more. Our speakers are experienced Ukrainian and international developers and experts. Each speaker goes through several stages of selection and rehearsals, which ensures the highest possible level of the program.

Saturday 15th September 2018

10:40
Make Your Data FABulous
Talk by Philipp Krenn in Main stage (40 minutes)

The CAP theorem is widely known for distributed systems, but it's not the only tradeoff you should be aware of. For datastores, there is also the FAB theory, and just like with the CAP theorem you can only pick two:
- Fast: Results are real-time or near real-time instead of batch-oriented.
- Accurate: Answers are exact and don't have a margin of error.
- Big: You require horizontal scaling and need to distribute your data.

While Fast and Big are relatively easy to understand, Accurate is a bit harder to picture. This talk shows some concrete examples of the accuracy tradeoffs Elasticsearch can make for terms aggregations, cardinality aggregations with HyperLogLog++, and the IDF part of full-text search. It also shows how to trade some speed or distribution for more accuracy.
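The accuracy tradeoff for terms aggregations can be pictured with a small sketch. This is a plain-Python model, not Elasticsearch code: the function names, the `shards` data and the `shard_size` parameter as used here are assumptions made for illustration. Each shard reports only its own top terms, so a term that is frequent overall but ranked low on some shards can be undercounted or dropped from the merged result.

```python
from collections import Counter


def top_terms_exact(shards, size):
    """Count every term on every shard, then take the global top `size`."""
    total = Counter()
    for shard in shards:
        total.update(shard)  # adds the per-shard counts term by term
    return total.most_common(size)


def top_terms_approx(shards, size, shard_size):
    """Each shard reports only its own top `shard_size` terms before merging."""
    merged = Counter()
    for shard in shards:
        for term, count in Counter(shard).most_common(shard_size):
            merged[term] += count
    return merged.most_common(size)


# Hypothetical per-shard term counts (illustrative data only).
shards = [
    {"red": 10, "green": 7, "blue": 6},
    {"red": 9, "green": 7, "blue": 6},
    {"blue": 10, "red": 5, "green": 4},
]

exact = top_terms_exact(shards, size=2)
approx = top_terms_approx(shards, size=2, shard_size=2)
print("exact: ", exact)
print("approx:", approx)
```

In this made-up data set, "blue" is the second most frequent term overall, but it is only in the top two on one shard, so the approximate merge ranks "green" above it. Letting each shard report more terms (a larger `shard_size` in this sketch) trades speed for accuracy.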

Sports Betting evolution
Talk by Konstantin Obraztsov, Betlab in Track A (40 minutes)

This talk presents the journey of one of the oldest sports betting platforms in the CIS region. It is a story of evolution from a small betting platform with 200 bets per day to a leading market contributor with 1,000,000 stakes per day. We will explore the critical moments in the platform’s growth and its ability to scale, the main failures and wins, and the lessons learned. Finally, we will look at the decision to apply new architectural approaches to the new, sophisticated platform, along with the expectations and risks.

Universal highload patterns on a specific example of a game server on Haskell
Talk by Maksym Bezuglyi, Attracti S.a.r.l in Track C (40 minutes)

An analysis of the architecture of a game server written in Haskell, from the high-level model down to code features: transactional memory, immutable data structures, actors, queues, and parallel and concurrent computing. The talk also covers the dynamic scaling model, optimization, solved problems and trade-offs.
- Reasons for choosing Haskell
- What technologies were considered and why they were not used (no spoilers)
- Full stack of used technologies and their highload potential
- Universal design patterns

How to process 80 million events per day and build relational graphs with Postgres
Talk by Evgen Kostenko, SPS Commerce in Track B (1 hour)

We will review how the development and design of a highly loaded service evolved from its earliest stage to full operation. This is a story about processing events, graphs, trees, horizontal scaling, failures and successful solutions.

11:50
Surviving Elasticsearch
Talk by Vitaliy Kharytonskiy, Prom.ua in Track A (40 minutes)

We've been using Elasticsearch at prom.ua for more than 4 years now. During this time we have grown our cluster from a couple of machines to almost a hundred and successfully scaled its traffic from hundreds of queries per second to tens of thousands. In this talk I will share the bittersweet lessons learned from this adventure and talk about the tools and approaches we use daily to work with this search engine. As a bonus, I will tell you how we built a logging system that can process 700 million records per day for free, and what came out of it.

Automated Machine Learning: building a conveyor
Talk by Mikhail Ovchinnikov, Badoo in Main stage (40 minutes)

What is the most difficult part of the machine learning process? Data collection? Feature engineering? Model selection and tuning? Deployment and monitoring? What if you have a whole bunch of models, and the business requires you to continuously improve, experiment, re-train and integrate models? And what if you are not even a Data Scientist? In this talk:
- How not to drown in chaos and how to build a structured ML integration process in a large company
- A close look at what can be automated (spoiler: everything)
- How a "conveyor" that takes ideas as input can make a great impact on business metrics through fast and convenient machine learning integration
- What we can achieve by using very basic and simple models

DevOps in the Enterprise: what I have learned so far
Talk by Jose Quaresma in Track B (40 minutes)

In this presentation, I will share what I have learned over the last few years working in the DevOps area with big enterprises at Accenture. I will argue why Agile and DevOps practices are so important and why they matter for software developers, share some of the Agile and DevOps lessons I have learned so far, and discuss some anti-patterns. Finally, I will take you through some real-life examples of the DevOps work that I have been doing.

Scaling tech processes and team
Talk by Dmytro Voloshyn, Preply.com in Track C (40 minutes)

Dmytro will guide you through the pitfalls and share the insights of scaling an engineering team from 10 to 30 people. You will learn how to scale a startup team, hire and retain the best people, build and maintain processes and culture while in the growth stage.

12:40
Architecture around search
Talk by Andrew Aksyonoff, Sphinx Technologies in Main stage (40 minutes)

Beyond a certain scale, a sizable pile of interesting layers and services inevitably grows around and alongside any basic search engine. This is especially true once plain keyword search (whether purely boolean or with simple formula-based ranking) is no longer enough. I will describe what the architecture of services “around and about search” looks like today at Avito (numbers and words to grab attention: 40M+ active listings, thousands of RPS, ML ranking, juggling data analysis and delivery, and so on).

MySQL Query Optimization Best Practices and Indexing
Talk by Alkin Tezuysal in Track C (40 minutes)

This talk is mainly about query optimization and identifying slow queries in MySQL. We will talk about analyzing queries that cause bottlenecks and how to resolve them with proper indexing. We will also talk about how to improve indexes and how MySQL behaves with them. I will cover open source tooling and new developments in this area.

Running Functions at the Edge
Talk by Dmytro Lavrinenko, SoftServe in Track B (40 minutes)

Let's talk about the Serverless paradigm as applied to the Edge (IoT Edge, SP Edge and Programmable Edge in general). This is not the cutting edge; say hello to a journey into the day after tomorrow.

Data science from the trenches
Talk by Vsevolod Solovyov, Prophy Science in Track A (40 minutes)

There are more than 100 million scientific papers and the pace of publishing is ever-increasing. More than one million articles are authored by someone named Wang. Working with the product of academia can be really far from working in academia. Come to learn anecdotes and stories about parsing, analyzing and extracting value from this kind of data.

14:30
Migrating Etsy infrastructure from On-premises to Google Cloud Platform
Talk by Chris Bohn in Track C (40 minutes)

Etsy is one of the largest and best-known specialty online marketplaces worldwide, with gross sales in 2017 exceeding $3 billion. Etsy was founded in 2005, before the emergence of viable cloud platforms. Until recently, all of Etsy's critical systems - including production and analytics data stacks - were hosted and managed on premises. In 2017, the decision was made to migrate all infrastructure to Google Cloud Platform (GCP), to become operational in 2018. This talk describes the migration, with a focus on moving Etsy's analytics data systems. The Etsy Analytics Data Stack consists of Hadoop for large batch jobs, Vertica for data analysis, and Kafka for clickstream and production data distribution, as well as custom tools for Data Science projects and ETL processes. In addition to migrating legacy technologies to GCP, Etsy has also integrated native GCP data products such as BigQuery (big data processing) and Airflow (workflow management, replacing Oozie). The technical challenges and cloud economics of the migration will be discussed. This has been a very large project that has gone well, thanks to good planning and building the right teams. Anyone considering migrating infrastructure to the cloud, especially to GCP, will benefit from hearing about Etsy's challenges and solutions.

Unsupervised Real-Time Stream-Based Novelty Detection Technique
Talk by Anna Vergeles, Oracle, Nataliia Manakova, Oracle in Main stage (40 minutes)

High-load systems produce lots of telemetry in every time slot. It is quite a challenge to tell whether the workload has changed significantly right now or everything is running as expected. This presentation covers a novelty detection technique for cloud systems that combines non-real-time learning with a real-time estimation ensemble.

PropTech product enhancement using ML
Talk by Vladimir Kubytskyi, LUN | Flatfy in Track A (40 minutes)

Is there any place for ML in a real estate product? Yes! I'll tell our story, starting from the first few heuristic modules that tried to distinguish property owners from brokers, up to today, when we have tens of ML models running successfully in production. After replacing heuristics with ML-based decisions, we got ten times fewer user complaints.

Dev(depression)ops
Talk by Vsevolod Polyakov in Track B (40 minutes)

Few people are 100% satisfied with their work, and that is not because we modern engineers are such delicate creatures, but because there are complex problems that nobody talks about. How do we deal with this? Well, the problems need to be structured and solutions found, which is what I set out to do. So, using various examples of problems and their solutions, I will talk about how to live a happier life. This will be useful not only for people with a DevOps title but for engineers in general, because the problems are common to all. As a teaser: “What is the problem with universal solutions”, “What the business needs, I need too? Or not?”, “Toxic people are all around us”, “What we want from work and what work wants from us” and other topics.

15:20
Measuring performance variability of EC2
Talk by Henrik Ingo, MongoDB in Track A (40 minutes)

Working in the MongoDB Server Performance Testing team, we use Amazon EC2 for system level testing. This allows us to flexibly deploy and tear down MongoDB clusters of various topologies, day after day. On the other hand, using a public cloud for performance testing can be challenging for repeatability of test results - to put it mildly. We therefore ended up spending several months just benchmarking EC2 itself. We compared combinations of different instance types and disks (ephemeral SSD vs PIOPS EBS). In the end, we found that the largest impact in reducing variability came from the same configuration options that we use on physical HW as well: turning off hyperthreading, using numactl and turning off CPU power saving states. Thus, you could argue that blaming "the cloud" for our performance trouble was wrong. It's possible to get similar performance characteristics from EC2 as from physical hardware when used correctly, and when used incorrectly, both physical and cloud hardware will perform poorly. With the new configuration, we've been able to greatly lower the variability of our daily performance tests and increase trust in the test results. For WiredTiger tests, even the worst case is less than 10% min-max range, and MMAPv1 is close to that. We consider this to be below the threshold of performance change that most end users are able to observe anyway, hence it is sufficient for our performance testing purposes. The results also emphasized a golden rule of performance engineering: measure everything, assume nothing. It turned out that the configuration originally used for our performance testing actually had the worst variability of all the configurations we tested!

Survivorship Fairy Tales or When 1% Matters
Talk by Nikita Galkin, Independent Contractor in Track B (40 minutes)

During this talk, we will cover several cases on different projects with high load or without. We will talk about:
- metrics
- versioning
- microservices
- CI/CD
- dependencies
- testing

Building the perfect infrastructure with Kubernetes
Talk by Dmytro Nemesh, Lalafo in Main stage (1 hour)

Every company comes to a point where its infrastructure no longer fits the needs of the team and the business, and kludges are not working anymore. That’s the time to re-think and redesign the whole infrastructure. This is exactly where our company was half a year ago. I will talk about our experience dealing with this challenge while balancing existing technology, costs, today’s reality and future needs.

Government bigdata for all
Talk by Aleksey Ivankin, Opendatabot in Track C (1 hour)

The history of the creation of Opendatabot, a platform for accessing and monitoring government big data through chat bots and information systems. It is now used by 200,000 users in bots and by 500 companies through the API. The talk covers the peculiarities of working with government data, creating microservices and chat bots, integrating with other services, and using Elasticsearch.

16:30
Handling large amounts of traffic on the Edge
Talk by Helen Tabunshchyk in Track A (40 minutes)

Keeping good performance with increased amounts of traffic requires intelligent load balancing, transport affinity, and DDoS protection. In this talk, Helen will give an overview of how to design your network flow to process network packets in the most efficient way. You will learn about different techniques of L4 load balancing, BPF and XDP, software and hardware offload, and what future the new QUIC protocol will bring.

To Build My Own Cloud with Blackjack…
Talk by Sergey Dzyuban, SBTech in Track B (40 minutes)

Cloud providers like Amazon or Google offer a great user experience for creating and managing PaaS. But is it possible to reproduce the same experience and flexibility locally, in an on-premise datacenter? What if your own infrastructure grows too fast and your team can’t deal with it in the old way? What do Jenkins, .NET microservices and TVs for daily meetings have in common? This talk shares our experience using DC/OS (datacenter operating system) to build a flexible and stable infrastructure. I will show the evolution of our private cloud from the first steps with Vagrant to a hybrid cloud with instance groups in Google Cloud, the benefits it gives us and the problems we got in return.

It Scales Until It Doesn’t
Talk by Dmitry Tiagulskyi, Grammarly, Yaroslav Yermilov, Grammarly in Main stage (40 minutes)

We are used to thinking that “high-load” means distributed systems, computing power, application, and kernel profiling. But sometimes you can’t simply scale your cluster. Maybe your hashmaps don’t fit in the server memory. Maybe you need single-digit millisecond latency. Maybe the cost is too high. Or your server is a … mobile phone. In this talk, we will show how popular and lesser-known algorithms, data structures, and systems tuning helped us to overcome these blockers. Who said you don’t need to know algorithms nowadays?

Fear of Freedom - the pros and cons of building a system on open source
Talk by Maksim Korzhenevsky, Prozorro.ua in Track C (1 hour)

Risk assessment in enterprise-level systems built on open source products, and how the usual SLAs can be replaced. Why are open source products more secure and more modern? Quick access to innovation, an ongoing motivating process for the team, budget redirection from undifferentiated infrastructure technology to new initiatives, and a huge market of products with the possibility of targeted customization.

17:20
Make it fast
Talk by Alexander Solovyov, Kasta in Main stage (40 minutes)

Any project passes through “we need more features” and “omg why is everything so slow” phases during its life. They intertwine, they are not always pronounced, but they are there. Weirdly enough, Kasta hits phase #2 regularly, a few times a year. Some of the iterations are especially remarkable. This story is going to tell you how we quickly (heh) discover them and, for lack of a better word, solve them.

From Legacy to High-Load: Evolution of web-application
Talk by Yevgen Lysenko, Concert.ua in Track A (40 minutes)

We’ll talk about the path we’ve taken at Concert.ua: from a ticketing start-up to the leader of the country’s ticketing market. DDoS, f*ck-ups and victories. How did we manage to move beyond a monolithic piece of software and arrive at a distributed web application architecture that handles high peak loads?