Apache Zeppelin is a new, incubating, multi-purpose web-based notebook that brings data ingestion, data exploration, visualization, sharing, and collaboration features to Hadoop and Spark. Its interpreter concept allows any language or data-processing backend to be plugged into Zeppelin. Currently, Apache Zeppelin supports many interpreters, such as Apache Spark, Python, JDBC, Markdown, and Shell.
CDH is Cloudera's 100% open source software distribution. It contains Apache Hadoop and related projects and is built specifically to meet enterprise demands. CDH delivers everything you need for enterprise use right out of the box. By integrating Hadoop with more than a dozen other critical open source projects, Cloudera has created a functionally advanced system that helps you perform end-to-end Big Data workflows.
Apache Spark is one of the most admired open source projects in the Apache Software Foundation. Because of its features, it is now considered one of the key technologies in Big Data analytics projects. Spark has evolved a lot since its inception, adding many interesting features such as support for the R language, a large set of machine learning algorithms, and real-time processing with sub-second latency.
Developing and maintaining an integrated platform for reliably producing and deploying machine learning models requires orchestrating many components: a learner unit for generating models from a training dataset, modules for validating both the data and the models, and finally an infrastructure for serving models in production. This becomes challenging when the velocity and veracity of the data change over time and fresh models need to be produced continuously.
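To make those components concrete, here is a minimal sketch of how a learner, the two validation steps, and a serving layer could be chained together. Everything in it is an illustrative assumption rather than any particular platform's API: the names `validate_data`, `train`, `validate_model`, `ModelServer`, and `run_pipeline` are hypothetical, and the "model" is a one-feature least-squares line chosen only to keep the example small.

```python
from statistics import mean
from typing import Callable, List, Optional, Tuple

Example = Tuple[float, float]        # (feature, label)
Model = Callable[[float], float]     # maps a feature to a prediction


def validate_data(rows: List[Example]) -> List[Example]:
    """Data validation: drop rows containing NaN; fail if nothing is left."""
    clean = [(x, y) for x, y in rows if x == x and y == y]  # NaN != NaN
    if not clean:
        raise ValueError("no valid training rows")
    return clean


def train(rows: List[Example]) -> Model:
    """Learner: fit y = a * x + b by ordinary least squares."""
    xs = [x for x, _ in rows]
    ys = [y for _, y in rows]
    mx, my = mean(xs), mean(ys)
    var = sum((x - mx) ** 2 for x in xs)
    a = sum((x - mx) * (y - my) for x, y in rows) / var if var else 0.0
    b = my - a * mx
    return lambda x: a * x + b


def validate_model(model: Model, holdout: List[Example], max_mse: float) -> bool:
    """Model validation: accept only if holdout error is within max_mse."""
    mse = mean((model(x) - y) ** 2 for x, y in holdout)
    return mse <= max_mse


class ModelServer:
    """Serving: holds the most recently promoted model."""

    def __init__(self) -> None:
        self.model: Optional[Model] = None

    def predict(self, x: float) -> float:
        if self.model is None:
            raise RuntimeError("no model has been promoted yet")
        return self.model(x)


def run_pipeline(rows: List[Example], holdout: List[Example],
                 server: ModelServer, max_mse: float = 0.01) -> bool:
    """One pass: validate data, train, validate the model, then promote."""
    candidate = train(validate_data(rows))
    promoted = validate_model(candidate, holdout, max_mse)
    if promoted:
        server.model = candidate
    return promoted


server = ModelServer()
fresh_rows = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0), (float("nan"), 0.0)]
run_pipeline(fresh_rows, holdout=[(3.0, 7.0)], server=server)
```

When fresh data arrives, `run_pipeline` would simply be called again. A candidate that fails model validation is never promoted, so the server keeps answering with the last model that passed.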
Deep Learning is fun and amazing. Deep Learning (DL) is a subarea of Machine Learning that deals with deep neural networks (with an emphasis on learning through successive “hidden layers”), along with important algorithms for data preprocessing and model regularization. DL has proved to be an amazing field that helps create great solutions to data science problems in computer vision and NLP.
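As a rough illustration of those three ingredients, the sketch below standardizes a feature column (preprocessing), passes an input through successive hidden layers, and computes an L2 weight penalty (one common form of regularization). It uses plain Python with hand-picked weights and no training loop. The function names and the choice of ReLU and L2 are assumptions made for the example, not something the text above prescribes.

```python
from statistics import mean, pstdev
from typing import List

Vector = List[float]
Matrix = List[Vector]  # one row of weights per output unit


def standardize(column: Vector) -> Vector:
    """Preprocessing: rescale a feature column to zero mean, unit variance."""
    mu, sigma = mean(column), pstdev(column)
    return [(v - mu) / sigma if sigma else 0.0 for v in column]


def dense_relu(weights: Matrix, inputs: Vector) -> Vector:
    """One hidden layer: a weighted sum per unit, followed by ReLU."""
    return [max(0.0, sum(w * x for w, x in zip(row, inputs)))
            for row in weights]


def forward(layers: List[Matrix], inputs: Vector) -> Vector:
    """Feed the input through each hidden layer in succession."""
    activations = inputs
    for weights in layers:
        activations = dense_relu(weights, activations)
    return activations


def l2_penalty(layers: List[Matrix], strength: float) -> float:
    """Regularization: strength times the sum of all squared weights."""
    return strength * sum(w * w
                          for weights in layers
                          for row in weights
                          for w in row)


# Two stacked layers with hand-picked (untrained) weights.
layers = [
    [[1.0, -1.0], [0.5, 0.5]],  # layer 1: 2 inputs -> 2 hidden units
    [[1.0, 1.0]],               # layer 2: 2 hidden units -> 1 unit
]
output = forward(layers, [2.0, 1.0])
penalty = l2_penalty(layers, strength=0.1)
```

In a real network the penalty would be added to the training loss so that large weights are discouraged. Here it is only computed to show where regularization enters.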