  • Error when configuring Hadoop on CentOS

    I've been configuring Hadoop on one of our servers that's running CentOS. When I run start-dfs.sh, I keep getting this error: "WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable". So I'm not sure what to do here. Has anyone...

    ITKE (363,905 points)
  • ElasticSearch for real-time statistics

    We have to log millions and millions of small log documents on a weekly basis. This includes: ad hoc queries for data mining, joining and filtering values, and full-text search with Python. We thought about using HBase and running Hadoop jobs to generate stat results. But the problem is that the results...

    ITKE (363,905 points)
  • Hadoop error when accessing application through HTML

    We have installed Hadoop on our cluster and now we're installing HTTPFS to access the HDFS content using HTTP protocol. We're able to access the normal page but when we tried to access HDFS, we're getting an error: {"RemoteException":{"message":"User: ubantu is not allowed to impersonate ubantu",...

    ITKE (363,905 points)
  • Out of memory error when installing Hadoop

    I recently tried to install Hadoop following a document my friend gave me. When I tried to execute this: bin/hadoop jar hadoop-examples-*.jar grep input output 'dfs[a-z.]+', I got this exception: java.lang.OutOfMemoryError: Java heap space. Has anyone seen this before? Like I said, I'm pretty new to...

    ITKE (363,905 points)
  • JAVA_HOME is not set correctly when installing Hadoop on Ubuntu

    I've been trying to install Hadoop on Ubuntu 11.10. I just set the JAVA_HOME variable in the file conf/hadoop-env.sh to: # export JAVA_HOME=/usr/lib/jvm/java-1.6.0-openjdk Then I tried to execute these commands: $ mkdir input $ cp conf/*.xml input $ bin/hadoop jar hadoop-examples-*.jar grep input...

    ITKE (363,905 points)
  • Hadoop: What’s the difference between Pig and Hive?

    I'm pretty new to the Hadoop world (been using it for about a month) and I've started to get into Hive, Pig and Hadoop using Cloudera's Hadoop VM. Is there a difference between Pig and Hive? I understand they have similar commands so I'm trying to figure out the big differences.

    ITKE (363,905 points)
  • What’s the difference between S3 and S3N in Hadoop?

    When we recently connected our Hadoop cluster to our Amazon storage and downloaded a file to HDFS, we noticed that s3:// didn't work but when we tried out S3N, it worked. Why didn't it work with S3? Is there a difference between the two?

    ITKE (363,905 points)
  • Hadoop: Safemode recovery is taking too long

    We have a Hadoop cluster with 18 data nodes. We recently restarted the name node about three hours ago and it's still in safe mode! We're not sure if we should try to restart it. We looked online and found this to try: dfs.namenode.handler.count 3 true Should we try this? If not, has anyone seen...

    ITKE (363,905 points)
  • Hadoop: How to handle data streams in real-time

    I've recently been working with Hadoop and now I'm using it to handle data streams in real-time. For this, I would like to build a meaningful POC around it so I could showcase it. I'm pretty limited in resources so any help would be appreciated.

    ITKE (363,905 points)
  • How to run Hadoop job without JobConf

    I'm trying to submit a Hadoop job that doesn't use the deprecated JobConf class. But my friend told me that JobClient only supports methods that take a JobConf parameter. Does anyone know how I can submit a Hadoop job using only the Configuration class? Is there Java code for it?

    ITKE (363,905 points)
  • Big data: How to get started

    We've been using R for several years and now we're starting to get into Python. We've been using RDBMS systems for data warehousing and R for number-crunching. Now, we think it's time to get more involved with big data analysis. Does anyone know how we should get started (basically how to use...

    ITKE (363,905 points)
  • How to compress large files in Hadoop

    I need to process a huge file and I'm looking to use Hadoop for it. From what my friend has told me, the file would get split across several different nodes. But if the file is compressed, then the file won't be split and would need to be processed on a single node (and I wouldn't be able to use...

    ITKE (363,905 points)
  • Free space in HDFS

    Is there an HDFS command to see how much free space is available in HDFS? I'm able to see it through the browser using master:hdfsport. But unfortunately, I can't access it and I need a command. I can see disk usage but not free space. Appreciate the help.

    ITKE (363,905 points)
  • What framework should I use for fast Hadoop real-time data analysis?

    I'm trying to do some real-time data analysis on data in HDFS but I'm not sure which framework I should use. I'm deciding between Cloudera, Apache and Spark. Which one would best suit me? Thanks!

    ITKE (363,905 points)
  • Pass mapped data to multiple reduce functions in Hadoop

    I currently have a large dataset that I need to analyze with multiple reduce functions. What I would like to do is read the dataset only once and then pass the mapped data to multiple reduce functions. Is there a way I can do this in Hadoop? Thank you!

    ITKE (363,905 points)
  • Getting warning message when starting Hadoop cluster

    I just started a Hadoop cluster but I keep getting this warning message: $HADOOP_HOME is deprecated. But when I add export HADOOP_HOME_WARN_SUPPRESS="TRUE" into hadoop-env.sh, I don't get the message anymore (when I start the cluster). When I run this: hadoop dfsadmin -report, I see the message...

    ITKE (363,905 points)
  • How to install Mahout on Hadoop cluster

    We recently created a Hadoop cluster (that has 3 slaves and 1 master using Ambari server/Hortonworks). Now we're trying to install Mahout 0.9 on the master machine so we can run Mahout jobs in the cluster. Is there a way to do that?

    ITKE (363,905 points)
  • Hadoop: The difference between jobconf and job objects

    I'm currently working in Hadoop but I'm having difficulty finding the difference between jobconf and job objects. This is how I'm submitting my job as of today: JobClient.runJob(jobconf); But then my friend sent me this for submitting jobs: Configuration conf = getConf(); Job job = new Job(conf,...

    ITKE (363,905 points)
  • How do I produce big data in Hadoop?

    I've been working with Hadoop and Nutch over the past few weeks and I need a massive amount of data. I'm trying to start with 20 GB and would like to reach between 1-2 TB at some point. But, as of right now, I don't have that much data and would like to produce it. The data could be anything...

    ITKE (363,905 points)
  • Big data books to start a career

    I apologize if this isn't the right area to ask but I'm looking to get into the big data field (I would like to work in the industry) so would anyone happen to know of some great books on big data? I'm looking for anything on Hadoop or HBase. Thanks so much!

    ITKE (363,905 points)
