{"id":5075,"date":"2024-06-04T10:37:22","date_gmt":"2024-06-04T01:37:22","guid":{"rendered":"https:\/\/blog.since2020.jp\/?p=5075"},"modified":"2024-06-04T10:37:22","modified_gmt":"2024-06-04T01:37:22","slug":"apache-beam-dataflow","status":"publish","type":"post","link":"https:\/\/since2020.jp\/media\/apache-beam-dataflow\/","title":{"rendered":"Apache Beam (Dataflow): Basic Terminology and Data Flow"},"content":{"rendered":"\n<h2>Introduction<\/h2>\n<p><span style=\"font-family: arial, helvetica, sans-serif\">Apache Beam is a powerful data processing framework for the big data era. Google Cloud Dataflow is a managed service for running pipelines defined with Apache Beam. This article explains Apache Beam's basic terminology and how data flows through a pipeline, with diagrams.<\/span><!-- notionvc: a46ce6e3-5831-4d5c-98b0-c91557546c44 
--><\/p>\n\n<h2>What Is Apache Beam?<\/h2>\n<p><span style=\"font-family: arial, helvetica, sans-serif\">Apache Beam is a data processing framework that provides a unified model for both batch and stream processing. Beam offers APIs that make it easy to create, transform, and run pipelines.<\/span><\/p>\n\n<h2>Basic Terminology<\/h2>\n<b><span style=\"font-family: arial, helvetica, sans-serif\">PCollection<\/span><\/b>\r\n<p><span style=\"font-family: arial, helvetica, sans-serif\">A PCollection is the basic unit of data in Apache Beam. It can hold either batch (bounded) data or streaming (unbounded) data.<\/span><\/p>\r\n<b><span style=\"font-family: arial, helvetica, sans-serif\">PTransform<\/span><\/b>\r\n<p><span style=\"font-family: arial, helvetica, sans-serif\">A PTransform defines a transformation applied to a PCollection, such as filtering, mapping, or grouping.<\/span><\/p>\r\n<b><span style=\"font-family: arial, helvetica, sans-serif\">Pipeline<\/span><\/b>\r\n<p><span style=\"font-family: arial, helvetica, 
sans-serif\">A Pipeline defines the whole data processing job as a combination of PCollections and PTransforms.<\/span><\/p>\r\n<b><span style=\"font-family: arial, helvetica, sans-serif\">Runner<\/span><\/b>\r\n<p><span style=\"font-family: arial, helvetica, sans-serif\">A Runner is the backend that executes a defined pipeline. Google Cloud Dataflow is one such implementation.<\/span><\/p>\n\n<h2>Data Flow in Apache Beam<\/h2>\n<p><span style=\"font-family: arial, helvetica, sans-serif\">The diagram below illustrates how data flows through an Apache Beam pipeline.<\/span><\/p>\r\n<div class=\"hcb_wrap\">\r\n<pre class=\"prism line-numbers lang-plain\" data-lang=\"Plain Text\"><code>+---------------------+\r\n|     Input Data      |\r\n+---------------------+\r\n           |\r\n           v\r\n+---------------------+\r\n|     PCollection     |\r\n+---------------------+\r\n           |\r\n           v\r\n+---------------------+\r\n|  PTransform (Map)   |\r\n+---------------------+\r\n           |\r\n           v\r\n+---------------------+\r\n|     PCollection     |\r\n+---------------------+\r\n           |\r\n           v\r\n+---------------------+\r\n| PTransform (Filter) |\r\n+---------------------+\r\n           |\r\n           v\r\n+---------------------+\r\n|     PCollection     |\r\n+---------------------+\r\n           |\r\n           v\r\n+---------------------+\r\n| PTransform (Group)  |\r\n+---------------------+\r\n           |\r\n           v\r\n+---------------------+\r\n|     PCollection     |\r\n+---------------------+\r\n           |\r\n           v\r\n+---------------------+\r\n|     Output Data     |\r\n+---------------------+\r\n<\/code><\/pre>\r\n<\/div>\r\n<p><span style=\"font-family: arial, helvetica, 
sans-serif\">In this diagram, data is represented as a PCollection and transformed by a PTransform. The result of each transform is again a PCollection, and the final PCollection is saved as the output data.<\/span><\/p>\n\n<h2>Features of Dataflow<\/h2>\n<p><span style=\"font-family: arial, helvetica, sans-serif\">Dataflow runs Apache Beam pipelines as a managed service. Its main features are as follows.<\/span><\/p>\r\n<ul>\r\n\t<li><span style=\"font-family: arial, helvetica, sans-serif\"><strong>Fully managed service<\/strong>: Removes the need to manage infrastructure and provides scalability and availability automatically.<\/span><\/li>\r\n\t<li><span style=\"font-family: arial, helvetica, sans-serif\"><strong>Scalability<\/strong>: Automatically scales resources out as data volume grows.<\/span><\/li>\r\n\t<li><span style=\"font-family: arial, helvetica, sans-serif\"><strong>Real-time processing<\/strong>: 
Supports both batch and stream processing, which makes real-time data processing possible.<\/span><\/li>\r\n<\/ul>\n\n<h2>Windowing<\/h2>\n<p><span style=\"font-family: arial, helvetica, sans-serif\">In stream processing, an unbounded data stream is divided into finite chunks called windows before it is processed. The common window types are described below.<\/span><\/p>\r\n<ul>\r\n\t<li><span style=\"font-family: arial, helvetica, sans-serif\"><strong>Tumbling window<\/strong>: Splits data into fixed-size windows that do not overlap and follow one another without gaps. Apache Beam calls these fixed windows.<\/span><\/li>\r\n\t<li><span style=\"font-family: arial, helvetica, sans-serif\"><strong>Sliding window<\/strong>: Splits data by sliding a fixed-size window forward at a regular interval. Windows can overlap.<\/span><\/li>\r\n\t<li><span style=\"font-family: arial, helvetica, sans-serif\"><strong>
Hopping window<\/strong>: Splits data by advancing a fixed-size window by a chosen hop interval. Windows overlap when the hop is shorter than the window size. In Apache Beam, hopping windows are called sliding windows.<\/span><\/li>\r\n\t<li><span style=\"font-family: arial, helvetica, sans-serif\"><strong>Session window<\/strong>: Groups a series of related events into one window, which closes once a specified idle time has passed with no new events.<\/span><\/li>\r\n<\/ul>\r\n<p><span style=\"font-family: arial, helvetica, sans-serif\"><strong>Windowing diagram<\/strong><\/span><\/p>\r\n<div class=\"hcb_wrap\">\r\n<pre class=\"prism line-numbers lang-plain\" data-lang=\"Plain Text\"><code>+-----------------------+  +-----------------------+\r\n|    Tumbling Window    |  |    Sliding Window     |\r\n+-----------------------+  +-----------------------+\r\n| [0-5) [5-10) [10-15)  |  |   [0-5) [2-7) [4-9)   |\r\n+-----------------------+  +-----------------------+\r\n\r\n+-----------------------+  +-----------------------+\r\n|    Hopping Window     |  |    Session Window     |\r\n+-----------------------+  +-----------------------+\r\n|  [0-5) [3-8) [6-11)   |  |  [0-3) [4-8) [9-12)   |\r\n+-----------------------+  +-----------------------+<\/code><\/pre>\r\n<\/div>\r\n<p><span style=\"font-family: arial, helvetica, sans-serif\">This diagram shows how each window type divides the data. Each bracketed range is the time interval covered by one window.<\/span><\/p>\r\n<p><!-- notionvc: 
cb614849-84e6-40dd-8e3d-d64b244c5e0f --><\/p>\n\n<h2>Conclusion<\/h2>\n<p><span style=\"font-family: arial, helvetica, sans-serif\">Apache Beam (Dataflow) is a powerful and flexible data processing framework. Understanding the basic terms PCollection, PTransform, Pipeline, and Runner lets you build efficient data processing pipelines. Windowing also makes it straightforward to process real-time data streams. Running pipelines on Dataflow's fully managed service enables scalable, highly reliable data processing. In the big data era ahead, Apache Beam and Dataflow are likely to become indispensable tools.<\/span><\/p>","protected":false},"excerpt":{"rendered":"<p>Apache Beam is a powerful data processing framework for the big data era. Google Cloud Dataflow is a managed 
[&hellip;]<\/p>\n","protected":false},"author":87,"featured_media":4955,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"content-type":"","swell_btn_cv_data":"","footnotes":"","_wp_rev_ctl_limit":""},"categories":[1246],"tags":[634,614],"class_list":["post-5075","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-data-infrastructure","tag-apache-beam","tag-cloud-dataflow"],"_links":{"self":[{"href":"https:\/\/since2020.jp\/media\/wp-json\/wp\/v2\/posts\/5075","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/since2020.jp\/media\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/since2020.jp\/media\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/since2020.jp\/media\/wp-json\/wp\/v2\/users\/87"}],"replies":[{"embeddable":true,"href":"https:\/\/since2020.jp\/media\/wp-json\/wp\/v2\/comments?post=5075"}],"version-history":[{"count":0,"href":"https:\/\/since2020.jp\/media\/wp-json\/wp\/v2\/posts\/5075\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/since2020.jp\/media\/wp-json\/wp\/v2\/media\/4955"}],"wp:attachment":[{"href":"https:\/\/since2020.jp\/media\/wp-json\/wp\/v2\/media?parent=5075"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/since2020.jp\/media\/wp-json\/wp\/v2\/categories?post=5075"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/since2020.jp\/media\/wp-json\/wp\/v2\/tags?post=5075"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}