Summary of the Apache Spark Streaming Paper (Chapters 1 to 5)

Spark

Hello. This is about the Apache Spark Streaming paper "Discretized Streams: A Fault-Tolerant Model for Scalable Stream Processing" (http://www.eecs.berkeley.edu/Pubs/TechRpts/2012/EECS-2012-259.pdf), specifically Chapters 1 to 5, which describe its concepts and give an overview of how it operates…

2013-09-29

Reading the Apache Spark Streaming Paper (Chapter 5)

Spark

This continues the series on reading the paper below; this time it is Chapter 5. "Discretized Streams: A Fault-Tolerant Model for Scalable Stream Processing" (http://www.eecs.berkeley.edu/Pubs/TechRpts/2012/EECS-2012-259.pdf) The chapter covers "Fault and Straggler Recov…

2013-09-29

Reading the Apache Spark Streaming Paper (Chapter 4)

Spark

Hello. This continues the series on reading the paper below; this time it is Chapter 4. "Discretized Streams: A Fault-Tolerant Model for Scalable Stream Processing" (http://www.eecs.berkeley.edu/Pubs/TechRpts/2012/EECS-2012-259.pdf) The chapter covers "System Archit…

2013-09-28

Reading the Apache Spark Streaming Paper (Chapter 3)

Spark

Hello. This continues the series on reading the paper below; this time it is Chapter 3. "Discretized Streams: A Fault-Tolerant Model for Scalable Stream Processing" (http://www.eecs.berkeley.edu/Pubs/TechRpts/2012/EECS-2012-259.pdf) The chapter covers "Discretized S…

2013-09-28

Reading the Apache Spark Streaming Paper (Chapter 2)

Spark

Hello. This continues the series on reading the paper below; this time it is Chapter 2. "Discretized Streams: A Fault-Tolerant Model for Scalable Stream Processing" (http://www.eecs.berkeley.edu/Pubs/TechRpts/2012/EECS-2012-259.pdf) The chapter covers "Goals and Ba…

2013-09-28

Reading the Apache Spark Streaming Paper (Chapter 1)

Spark

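Hello. Over the last several posts I have gained a rough understanding of how Apache Spark itself works. Next, I will look at Apache Spark Streaming, the stream processing platform built on Apache Spark. The paper I read is the following: "Discretized Streams: A Fault-Tolerant…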

2013-09-24

Summary of the Resilient Distributed Datasets Paper (Chapters 1 to 5)

Spark

Hello. This is about the Resilient Distributed Datasets paper "Resilient Distributed Datasets: A Fault-Tolerant Abstraction for In-Memory Cluster Computing" (http://www.cs.berkeley.edu/~matei/papers/2012/nsdi_spark.pdf), its concepts and an overview of its opera…

2013-09-23

Reading the Resilient Distributed Datasets Paper (Chapter 5)

Spark

This continues the series on reading the paper below; this time it is Chapter 5. "Resilient Distributed Datasets: A Fault-Tolerant Abstraction for In-Memory Cluster Computing" (http://www.cs.berkeley.edu/~matei/papers/2012/nsdi_spark.pdf) The chapter covers "Implementati…

2013-09-17

Reading the Resilient Distributed Datasets Paper (Chapter 4)

Spark

Hello. This continues the series on reading the paper below; this time it is Chapter 4. "Resilient Distributed Datasets: A Fault-Tolerant Abstraction for In-Memory Cluster Computing" (http://www.cs.berkeley.edu/~matei/papers/2012/nsdi_spark.pdf) The chapter covers "…

2013-09-16

Reading the Resilient Distributed Datasets Paper (Chapter 3)

Spark

Hello. This continues the series on reading the paper below; this time it is Chapter 3. "Resilient Distributed Datasets: A Fault-Tolerant Abstraction for In-Memory Cluster Computing" (http://www.cs.berkeley.edu/~matei/papers/2012/nsdi_spark.pdf) The chapter covers "Sp…

2013-09-11

Reading the Resilient Distributed Datasets Paper (Chapter 2)

Spark

Hello. This continues the series on reading the paper below; this time it is Chapter 2. "Resilient Distributed Datasets: A Fault-Tolerant Abstraction for In-Memory Cluster Computing" (http://www.cs.berkeley.edu/~matei/papers/2012/nsdi_spark.pdf) The chapter covers "…

2013-09-09

SpringXD = A General-Purpose, Distributed, Extensible Data Integration Platform That Connects Real-Time Analytics and Batch Processing?

SpringXD

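Hello. I feel I have been dabbling in too many things lately, but something caught my interest, so I will write it up briefly. It is Spring XD. http://www.springsource.org/spring-xd At a glance, the top page made it look remarkably useful, so in practice, the tutorial…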

2013-09-08

Reading the Resilient Distributed Datasets Paper (Chapter 1)

Spark

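Hello. In the previous two posts I got an overview of Apache Spark and Spark Streaming. However, although Resilient Distributed Datasets (RDDs), the shared distributed memory mechanism used internally, are the key piece, the available materials gave only an overview of them, so by reading the paper, once more…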