Software Engineering on Koby Bibas

[Summary] MapReduce for Software Engineers
https://kobybibas.github.io/posts/20260612_mapreduce_for_software_engineers/summary/
Fri, 12 Jun 2026 00:00:00 +0000

TL;DR

Processing large-scale data sequentially is slow. MapReduce is a framework for parallel batch processing: you write map and reduce, and the system handles splitting the work, grouping intermediate results, retries, and execution across machines.

The Problem

Suppose we have a very large set of web server logs and want to count how many times each URL was accessed. On one machine, the logic is simple:

1. Read every log line.
2. Extract the URL.
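
The division of labor described in the TL;DR (you write map and reduce, the system does the grouping) can be sketched for the URL-counting problem on a single machine. This is only an illustration, not the post's own code: the function names (map_line, shuffle, reduce_counts, count_urls) are hypothetical, and the log layout "METHOD URL STATUS" is an assumption made for the example. The shuffle function stands in for the grouping that a real framework would perform across machines.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple


def map_line(line: str) -> Iterator[Tuple[str, int]]:
    """Map step: emit (url, 1) for one log line."""
    fields = line.split()
    # Assumed layout: "METHOD URL STATUS", e.g. "GET /home 200".
    # Lines with fewer than two fields are skipped as malformed.
    if len(fields) >= 2:
        yield fields[1], 1


def shuffle(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Group intermediate values by key (done by the framework in practice)."""
    groups: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_counts(url: str, counts: List[int]) -> Tuple[str, int]:
    """Reduce step: sum all the counts emitted for one URL."""
    return url, sum(counts)


def count_urls(lines: Iterable[str]) -> Dict[str, int]:
    """Run map, shuffle, and reduce in sequence in a single process."""
    pairs = (pair for line in lines for pair in map_line(line))
    grouped = shuffle(pairs)
    return dict(reduce_counts(url, counts) for url, counts in grouped.items())


logs = ["GET /home 200", "GET /about 200", "GET /home 304"]
print(count_urls(logs))  # {'/home': 2, '/about': 1}
```

Because map_line looks at one line at a time and reduce_counts looks at one URL at a time, both steps could in principle run on many machines at once; only the grouping step needs to see pairs from every mapper.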