
Hadoop online training


Hadoop training from Keylabs covers Hadoop administration and Hadoop development. We provide the best Hadoop classroom and online training in Hyderabad and Bangalore.
http://www.keylabstraining.com/hadoop-online-training-hyderabad-bangalore
Hadoop is a free, Java-based programming framework that supports the processing of large data sets in a distributed computing environment.

hadoop training, hadoop online training, hadoop training in bangalore, hadoop training in hyderabad, best hadoop training institutes, hadoop online training in chicago, hadoop training in mumbai, hadoop training in pune, hadoop training institutes ameerpet


Hadoop online training

  1. HADOOP TRAINING
  2. HISTORY OF HADOOP
     Hadoop was created by Doug Cutting, the creator of Apache Lucene, the widely used text search library. Hadoop has its origins in Apache Nutch, an open source web search engine, itself a part of the Lucene project.
  3. The name Hadoop is not an acronym; it’s a made-up name. The project’s creator, Doug Cutting, explains how the name came about: “The name my kid gave a stuffed yellow elephant. Short, relatively easy to spell and pronounce, meaningless, and not used elsewhere: those are my naming criteria. Kids are good at generating such. Googol is a kid’s term.”
     Subprojects and “contrib” modules in Hadoop also tend to have names that are unrelated to their function, often with an elephant or other animal theme (“Pig,” for example). Smaller components are given more descriptive (and therefore more mundane) names. This is a good principle, as it means you can generally work out what something does from its name. For example, the jobtracker keeps track of MapReduce jobs.
  4. INTRODUCTION TO HADOOP
     Hadoop is an open source software framework that supports data-intensive distributed applications. It is licensed under the Apache License 2.0 and is generally known as Apache Hadoop. Hadoop was developed based on a paper published by Google on its MapReduce system, and it applies concepts from functional programming. It is written in the Java programming language and is a top-level Apache project, built and used by a global community of contributors.
  5. INTRODUCTION TO HADOOP
     Large companies such as Yahoo and Facebook use Hadoop as an integral part of their operations. In 2008, Yahoo! Inc. launched what it described as the world’s largest Hadoop production application: the Yahoo! Search Webmap, a Hadoop application that runs on a Linux cluster with more than 10,000 cores and generates data used in every Yahoo! web search query. Facebook, for its part, uses Apache Hadoop to keep track of its billions of user profiles and all the data related to them, such as images, posts, comments, and videos.
  6. Hadoop is not a database
     Hadoop is an efficient distributed file system, not a database. It is designed specifically for information that comes in many forms, such as server log files or personal productivity documents. Anything that can be stored as a file can be placed in a Hadoop repository.
  7. Hadoop is used for:
     • Search: Yahoo, Amazon, Zvents
     • Log processing: Facebook, Yahoo
     • Data warehouse: Facebook, AOL
     • Video and image analysis: New York Times, Eyealike
  8. WHY HADOOP?
     Hadoop is a free, Java-based programming framework that supports the processing of large data sets in a distributed computing environment. Because Hadoop is open source and can run on commodity hardware, the initial cost savings are dramatic and continue to grow as your organizational data grows. It is part of the Apache project sponsored by the Apache Software Foundation.
  9. WHY HADOOP?
     Single Source of Truth: With the enterprise data warehouse approach, organizations find their data scattered across many systems and silos. This decentralized environment can result in slow processing and inefficient data analysis. Hadoop makes it possible to consolidate your data and business intelligence capabilities within an Enterprise Data Hub. The ability to save all organizational data at its lowest level of granularity and bring all archive data into an Enterprise Data Hub gives business users greater and faster access to data.
  10. WHY HADOOP?
  11. WHY HADOOP?
      Faster Data Processing: In legacy environments, traditional ETL and batch processes can take hours, days, or even weeks, in a world where businesses require access to data in minutes, seconds, or even sub-seconds. Hadoop excels at high-volume batch processing. Because it processes data in parallel, Hadoop can run batch processes as much as 10 times faster than a single-threaded server or a mainframe.
  12. WHY HADOOP?
      Get More for Less: The true beauty of Hadoop is its ability to scale cost-effectively to meet rapidly growing data demands. Hadoop distributes its computing power across a cluster of commodity servers, or nodes. By augmenting its enterprise data warehouse (EDW) environment with Hadoop, the enterprise can decrease its cost per terabyte of storage. With cheaper storage, organizations can keep more data that was previously too expensive to warehouse. This allows for the capture and storage of data from any source within the organization while decreasing the amount of data that is “thrown away” during data cleansing.
  13. HADOOP INTERNAL SOFTWARE ARCHITECTURE
  14. COMPONENTS OF HADOOP
      The current Apache Hadoop ecosystem consists of the Hadoop kernel, MapReduce, the Hadoop Distributed File System (HDFS), and a number of related projects such as Apache Hive, HBase, and ZooKeeper. MapReduce and HDFS are the main components of Hadoop.
      MapReduce: The framework that understands and assigns work to the nodes in a cluster.
  15. COMPONENTS OF HADOOP
      Hadoop Distributed File System (HDFS): HDFS is the file system that spans all the nodes in a Hadoop cluster for data storage. It links together the file systems on many local nodes to make them into one big file system. HDFS assumes nodes will fail, so it achieves reliability by replicating data across multiple nodes.
  16. HADOOP ECOSYSTEM
  17. ADVANTAGES OF HADOOP
      • Hadoop is scalable
      • Hadoop is cost effective
      • Hadoop is flexible
      • Hadoop is fault tolerant
  18. PREREQUISITES TO LEARN HADOOP
      There is no strict prerequisite to start learning Hadoop. However, if you want to become an expert in Hadoop and build an excellent career, you should have at least basic knowledge of Java and Linux.
  19. IS JAVA REQUIRED TO LEARN HADOOP?
      Knowing Java is an added advantage, but Java is not strictly a prerequisite for working with Hadoop.
      Why Java is not strictly a prerequisite: Tools like Hive and Pig that are built on top of Hadoop offer their own high-level languages for working with data on your cluster. If you want to write your own MapReduce code, you can use Hadoop Streaming to do so in any language (e.g. Perl, Python, Ruby, C) that supports reading from standard input and writing to standard output.
  20. IS JAVA REQUIRED TO LEARN HADOOP?
      Added advantage of Java in Hadoop: Although you can use Streaming to write your map and reduce functions in the language of your choice, there are some advanced features that are (at present) only available via the Java API.
  21. IS LINUX AN EXTRA BENEFIT WHILE LEARNING HADOOP?
      Although Hadoop can run on Windows, it was built initially on Linux, and Linux is the preferred platform for both installing and managing Hadoop. A solid understanding of getting around in a Linux shell will also help you tremendously in digesting Hadoop, especially with regard to many of the HDFS command-line parameters.
  22. COURSE CONTENT
      Hadoop Introduction and Overview:
      • What is Hadoop?
      • History of Hadoop
      • Building Blocks – Hadoop Eco-System
      • Who is behind Hadoop?
      • What Hadoop is good for and what it is not
      Hadoop Distributed File System (HDFS):
      • HDFS Overview and Architecture
      • HDFS Installation
      • Hadoop File System Shell
      • File System Java API
  23. COURSE CONTENT
      Map/Reduce:
      • Map/Reduce Overview and Architecture
      • Installation
      • Developing Map/Reduce Jobs
      • Input and Output Formats
      • Job Configuration
      • Job Submission
      • HDFS as a Source and Sink
      • HBase as a Source and Sink
      • Hadoop Streaming
  24. COURSE CONTENT
      HBase:
      • HBase Overview and Architecture
      • HBase Installation
      • HBase Shell
      • CRUD operations
      • Scanning and Batching
      • Filters
      • HBase Key Design
  25. COURSE CONTENT
      Pig:
      • Pig Overview
      • Installation
      • Pig Latin
      • Pig with HDFS
      Hive:
      • Hive Overview
      • Installation
      • HiveQL
  26. COURSE CONTENT
      Sqoop:
      • Sqoop Overview
      • Installation
      • Imports and Exports
      ZooKeeper:
      • ZooKeeper Overview
      • Installation
      • Server Maintenance
      Putting it all together:
      • Distributed installations
  27. PLEASE CHECK THE LINK
      http://www.keylabstraining.com/hadoop-online-training-hyderabad-bangalore
  28. PLEASE CONTACT:
      • +91-9550-645-679 (India)
      • +1-908-366-7933 (USA)
      • Skype id: keylabstraining
      • Email id: info@keylabstraining.com
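
Slides 14 and 19 describe MapReduce and note that Hadoop Streaming lets you write map and reduce code in any language that reads standard input and writes standard output. The sketch below is one way that could look in Python for a word count. The function names (`mapper`, `reducer`, `run_local`) are made up for this illustration and are not part of Hadoop. The tab-separated key/value line format follows the usual Streaming convention, and the `sorted` call only imitates, within one process, the sort-by-key step that Hadoop performs between the map and reduce phases.

```python
from itertools import groupby


def mapper(lines):
    """Emit one tab-separated "word<TAB>1" record per word in the input lines."""
    for line in lines:
        for word in line.split():
            yield f"{word.lower()}\t1"


def reducer(sorted_records):
    """Sum the counts per word. Records must already be sorted by key."""
    parsed = (record.split("\t", 1) for record in sorted_records)
    for word, group in groupby(parsed, key=lambda pair: pair[0]):
        total = sum(int(count) for _, count in group)
        yield f"{word}\t{total}"


def run_local(lines):
    """Imitate map -> sort -> reduce in a single process (local testing only)."""
    return list(reducer(sorted(mapper(lines))))
```

On a real cluster the map and reduce steps would normally live in two separate scripts, each looping over `sys.stdin` and printing its records, and would be passed to the Hadoop Streaming jar through its `-mapper` and `-reducer` options. Running `run_local` on a few lines of text first is a cheap way to check the logic before submitting a job.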
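
Slide 15 says that HDFS assumes nodes will fail and achieves reliability by replicating data across multiple nodes. The toy model below illustrates why replication helps. It is a simplified sketch, not HDFS's actual placement policy: the round-robin assignment, the node names, and the functions `place_blocks` and `readable_blocks` are all assumptions made for this example. The default of three replicas mirrors the usual HDFS default replication factor.

```python
def place_blocks(num_blocks, nodes, replication=3):
    """Assign each block to `replication` distinct nodes, round-robin (toy policy)."""
    if replication > len(nodes):
        raise ValueError("replication factor cannot exceed the number of nodes")
    placement = {}
    for block in range(num_blocks):
        start = block % len(nodes)
        placement[block] = [
            nodes[(start + offset) % len(nodes)] for offset in range(replication)
        ]
    return placement


def readable_blocks(placement, failed_nodes):
    """Return the blocks that still have at least one replica on a live node."""
    failed = set(failed_nodes)
    return [
        block
        for block, holders in placement.items()
        if any(node not in failed for node in holders)
    ]
```

With four nodes and three replicas per block, every block stays readable after any two nodes fail. With a single replica, losing one node makes every block stored only on that node unreadable.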

