1. Apache Spark - Introduction

INTRODUCTION

Apache Spark is a distributed, highly scalable in-memory data analytics system. Applications for it can be developed in Java, Scala, Python, and R.

SPARK SUB-MODULES

Apache Spark provides four main sub-modules:
  1. Spark SQL
  2. MLlib
  3. GraphX
  4. Spark Streaming